US and Worldwide: +1 (866) 660-7555
Results 1 to 5 of 5

Thread: Reverse Stream Step?

  1. #1
    Join Date
    Dec 2010
    Posts
    7

    Default Reverse Stream Step?

    Hi there Kettle users,

    do you know if there is a step that simply reverses the stream. E.g. the rows A, B, C get reversed to C, B, A? I know that this is doable with sort, but that would be a waste of resources.

    What I try to solve is to fill gaps (null values) of a total amount fact in a timeline with the values from the last predecessor or successor date entry, depending on which is bigger.

    product | amount | date
    A | 4 | 01.01.2012
    A | | 02.01.2012
    A | | 03.01.2012
    A | 5 | 04.01.2012
    B | 7 | 01.01.2012
    B | | 02.01.2012
    B | 6 | 03.01.2012

    Shall become
    product | amount | date | predecessor | successor
    A | 4 | 01.01.2012 | 4 | 4
    A | | 02.01.2012 | 4 | 5
    A | | 03.01.2012 | 4 | 5
    A | 5 | 04.01.2012 | 5 | 5
    B | 7 | 01.01.2012 | 7 | 7
    B | | 02.01.2012 | 7 | 6
    B | 6 | 03.01.2012 | 6 | 6

    Until now I am filling the predecessor with a run in date ASC direction, than I sort date DESC and fill the successor. Since the data is already sorted I'd like to replace the second sort by a stream reverse.

    Kind regards,

    Bergtroll

  2. #2
    Join Date
    Apr 2008
    Posts
    802

    Default

    Hi.
    I think that the Statistics step could do what you are looking for.
    Actually, it's called Analytic Query:
    http://wiki.pentaho.com/display/EAI/Analytic+Query

    Mick
    Last edited by Mick_data; 08-08-2012 at 07:54 AM. Reason: Typo

  3. #3
    Join Date
    Jun 2012
    Posts
    1,471

    Default

    If the rows are coming from a database, the last (or next) known value can be obtained by the SELECT statement:

    Code:
    SELECT 
      date,
      product, 
      amount, 
    ( SELECT amount FROM sales WHERE product = s.product AND amount IS NOT NULL AND date > s.date ORDER BY date LIMIT 1) AS next_amount,
    ( SELECT amount FROM sales WHERE product = s.product AND amount IS NOT NULL AND date < s.date ORDER BY date DESC LIMIT 1) AS prev_amount
    FROM sales s
    This way it doesn't matter, if the values for consecutive dates are missing.


    Quote Originally Posted by Bergtroll View Post
    What I try to solve is to fill gaps (null values) of a total amount fact in a timeline with the values from the last predecessor or successor date entry, depending on which is bigger.
    And you still call it a fact?
    pdi-ce-4.3.0-stable
    OpenJDK IcedTea 2.3.7 (7u21)
    ubuntu 12.04 LTS (x86_64)

  4. #4
    Join Date
    Dec 2010
    Posts
    7

    Default

    Quote Originally Posted by marabu View Post
    If the rows are coming from a database, the last (or next) known value can be obtained by the SELECT statement:
    Thanks you for this, I somehow did not even thought about solving it in SQL, I looks like it should work.

    Quote Originally Posted by marabu View Post
    And you still call it a fact?
    I almost expected this question . Yes I still call it a fact, because I do not use the result to modify and fill gaps in the facts table itself. Only original data stays in there. The query implements the payment business rules of my customer and is the base for reporting and subsequent invoicing.

    Kind regards,

    Marius

  5. #5
    Join Date
    Nov 1999
    Posts
    9,535

    Default

    As to the original question, both the Excel Input as well as the Text File Input step support the "Repeat" option in their fields section. Enabling this option will repeat the last value if the current one is null.
    Matt Casters, Chief Data Integration
    Pentaho, Open Source Business Intelligence
    http://www.pentaho.org -- mcasters@pentaho.org

    Author of the book Pentaho Kettle Solutions by Wiley. Also available as e-Book and on the Kindle reading applications (iPhone, iPad, Android, Kindle devices, ...)

    Join us on IRC server Freenode.net, channel ##pentaho

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •