Hitachi Vantara Pentaho Community Forums

Thread: Re-engineering historical data

  1. #1
    Join Date
    Feb 2008

    Re-engineering historical data

    I need to rebuild the history of unit costs associated with an item. So far I have the data grouped by item and unit cost, with the min and max dates on which each unit cost was used.

    What I am trying to do is build a dimension table of sorts out of this data so that I can assign the correct unit cost to an item based on the date of the transaction. My data isn't necessarily complete, so the max date of one unit cost record often doesn't equal the min date of the next unit cost record minus one day.

    I was looking for an example where I could get the value of a field from the "next" row in my set so that I can:

    1) check whether the item value is the same
    2) decide which date to use as the starting date for that next record, based on the end date of the previous record and/or the min date of the next record.

    I've seen the example for adding rows, and I was trying to see whether I could use it as a template or whether there is a better set of JavaScript commands that will get me there.

    Can anyone steer me in the right direction for doing this?

    Here's an example of what the data looks like. Sometimes there is overlap in the dates as in this example...

    item          unit_cost        start_unit_cost               end_unit_cost
    000380-001    3192.16391296    11/18/2003 12:00:00.000 AM    12/31/2003 12:00:00.000 AM
    000380-001    2993.86777429    1/7/2004 12:00:00.000 AM      2/28/2004 12:00:00.000 AM
    000380-001    2994.22185109    1/8/2004 12:00:00.000 AM      12/31/2004 12:00:00.000 AM
    000380-001    3151.38975110    1/6/2005 12:00:00.000 AM      11/14/2005 12:00:00.000 AM
    000380-001    3240.09037207    12/7/2005 12:00:00.000 AM     12/20/2006 12:00:00.000 AM
    000380-001    3307.96897625    1/17/2007 12:00:00.000 AM     12/11/2007 12:00:00.000 AM
    000380-001    3307.45236065    1/2/2008 12:00:00.000 AM      10/27/2008 12:00:00.000 AM
    000380-001    3483.68376448    1/7/2009 12:00:00.000 AM      4/27/2009 12:00:00.000 AM

    I need to make it so there are no gaps in dates between unit cost records.
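
    The gap-closing described above could be sketched in plain JavaScript roughly as follows. This is only an illustration and not Pentaho-specific code: the names closeGaps and dayBefore are hypothetical, the dates from the sample are rewritten as YYYY-MM-DD strings, and the rule applied (keep each start date and move the end date to the day before the next start date of the same item) is just one possible assumption about which date should win. The last record of each item keeps its original end date.

```javascript
// Minimal sketch, not tied to any Pentaho API. Helper names are hypothetical.
const DAY_MS = 24 * 60 * 60 * 1000;

// Return the calendar day before an ISO date string (YYYY-MM-DD), computed in UTC.
function dayBefore(isoDate) {
  const t = Date.parse(isoDate + "T00:00:00Z");
  return new Date(t - DAY_MS).toISOString().slice(0, 10);
}

// Order by item, then by start date (ISO strings sort chronologically).
function compareRows(a, b) {
  if (a.item !== b.item) return a.item < b.item ? -1 : 1;
  if (a.start !== b.start) return a.start < b.start ? -1 : 1;
  return 0;
}

// Assumed rule: each record ends the day before the next record of the
// same item starts. This closes gaps and also trims overlaps.
function closeGaps(rows) {
  const sorted = rows.map(r => ({ ...r })).sort(compareRows); // copy, do not mutate input
  for (let i = 0; i < sorted.length - 1; i++) {
    const cur = sorted[i];
    const next = sorted[i + 1];
    if (cur.item === next.item) {
      cur.end = dayBefore(next.start);
    }
  }
  return sorted;
}

// First four rows of the sample above; unit costs kept as strings to preserve digits.
const sample = [
  { item: "000380-001", unit_cost: "3192.16391296", start: "2003-11-18", end: "2003-12-31" },
  { item: "000380-001", unit_cost: "2993.86777429", start: "2004-01-07", end: "2004-02-28" },
  { item: "000380-001", unit_cost: "2994.22185109", start: "2004-01-08", end: "2004-12-31" },
  { item: "000380-001", unit_cost: "3151.38975110", start: "2005-01-06", end: "2005-11-14" },
];

const closed = closeGaps(sample);
closed.forEach(r => console.log(r.item, r.unit_cost, r.start, r.end));
```

    Under that assumed rule, the four sample rows end on 2004-01-06, 2004-01-07, 2005-01-05 and 2005-11-14 respectively, so every day from the first start date onward maps to exactly one unit cost. Note that the overlapping second record shrinks to a single day; a different rule may be preferable if that record should cover more of the range.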

  2. #2
    Join Date
    Feb 2009

    Check out the Analytic Query step in PDI 3.2.0. It can get data from rows before and after the current one.
    doing ETL with his hands bound on his back

  3. #3
    Join Date
    Feb 2008

    Still using 3.1.2

    I'm not ready to upgrade to 3.2.0 until it is GA. Is there a JavaScript approach that I can use instead?
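
    One way to get at the "next" row without a look-ahead step is to sort the rows newest first, so that the chronologically next record is the one that was just processed. Here is a hedged sketch of that idea in plain JavaScript. It does not use any Pentaho API; makeGapCloser and processRow are hypothetical names, dates are assumed to be YYYY-MM-DD strings, and whether a script step in 3.1.2 keeps variables between rows in this way is an assumption that would need to be checked.

```javascript
// Minimal sketch of a row-at-a-time approach. Names are hypothetical.
const DAY_MS = 24 * 60 * 60 * 1000;

// Returns a function that handles one row at a time and remembers the
// previous row. Rows must arrive sorted by item, then start date DESCENDING.
function makeGapCloser() {
  let prevItem = null;   // item of the row seen just before this one
  let prevStart = null;  // its start date, as milliseconds since epoch (UTC)

  return function processRow(row) {
    let end = row.end;
    if (row.item === prevItem) {
      // The previous row is the chronologically later record of the same
      // item, so this record should end the day before that one starts.
      end = new Date(prevStart - DAY_MS).toISOString().slice(0, 10);
    }
    prevItem = row.item;
    prevStart = Date.parse(row.start + "T00:00:00Z");
    return { item: row.item, unit_cost: row.unit_cost, start: row.start, end: end };
  };
}

// Three rows from the sample data in post #1, newest first.
const rowsNewestFirst = [
  { item: "000380-001", unit_cost: "3151.38975110", start: "2005-01-06", end: "2005-11-14" },
  { item: "000380-001", unit_cost: "2994.22185109", start: "2004-01-08", end: "2004-12-31" },
  { item: "000380-001", unit_cost: "2993.86777429", start: "2004-01-07", end: "2004-02-28" },
];

const processRow = makeGapCloser();
const streamed = rowsNewestFirst.map(row => processRow(row));
streamed.forEach(r => console.log(r.item, r.unit_cost, r.start, r.end));
```

    The newest row of each item keeps its own end date; every older row gets an end date one day before the start of the row processed just before it. The descending sort is what makes this work, since a row that has already been passed on cannot be changed afterwards.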
