US and Worldwide: +1 (866) 660-7555

1. Junior Member
Join Date
Dec 2010
Posts
7

## Reverse Stream Step?

Hi there Kettle users,

do you know if there is a step that simply reverses the stream. E.g. the rows A, B, C get reversed to C, B, A? I know that this is doable with sort, but that would be a waste of resources.

What I try to solve is to fill gaps (null values) of a total amount fact in a timeline with the values from the last predecessor or successor date entry, depending on which is bigger.

product | amount | date
A | 4 | 01.01.2012
A | | 02.01.2012
A | | 03.01.2012
A | 5 | 04.01.2012
B | 7 | 01.01.2012
B | | 02.01.2012
B | 6 | 03.01.2012

Shall become
product | amount | date | predecessor | successor
A | 4 | 01.01.2012 | 4 | 4
A | | 02.01.2012 | 4 | 5
A | | 03.01.2012 | 4 | 5
A | 5 | 04.01.2012 | 5 | 5
B | 7 | 01.01.2012 | 7 | 7
B | | 02.01.2012 | 7 | 6
B | 6 | 03.01.2012 | 6 | 6

Until now I am filling the predecessor with a run in date ASC direction, than I sort date DESC and fill the successor. Since the data is already sorted I'd like to replace the second sort by a stream reverse.

Kind regards,

Bergtroll

2. Senior Member
Join Date
Apr 2008
Posts
802
Hi.
I think that the Statistics step could do what you are looking for.
Actually, it's called Analytic Query:
http://wiki.pentaho.com/display/EAI/Analytic+Query

Mick
Last edited by Mick_data; 08-08-2012 at 07:54 AM. Reason: Typo

3. Senior Member
Join Date
Jun 2012
Posts
1,471
If the rows are coming from a database, the last (or next) known value can be obtained by the SELECT statement:

Code:
```SELECT
date,
product,
amount,
( SELECT amount FROM sales WHERE product = s.product AND amount IS NOT NULL AND date > s.date ORDER BY date LIMIT 1) AS next_amount,
( SELECT amount FROM sales WHERE product = s.product AND amount IS NOT NULL AND date < s.date ORDER BY date DESC LIMIT 1) AS prev_amount
FROM sales s```
This way it doesn't matter, if the values for consecutive dates are missing.

Originally Posted by Bergtroll
What I try to solve is to fill gaps (null values) of a total amount fact in a timeline with the values from the last predecessor or successor date entry, depending on which is bigger.
And you still call it a fact?

4. Junior Member
Join Date
Dec 2010
Posts
7
Originally Posted by marabu
If the rows are coming from a database, the last (or next) known value can be obtained by the SELECT statement:
Thanks you for this, I somehow did not even thought about solving it in SQL, I looks like it should work.

Originally Posted by marabu
And you still call it a fact?
I almost expected this question . Yes I still call it a fact, because I do not use the result to modify and fill gaps in the facts table itself. Only original data stays in there. The query implements the payment business rules of my customer and is the base for reporting and subsequent invoicing.

Kind regards,

Marius

5. Chief Data Integration
Join Date
Nov 1999
Posts
9,535
As to the original question, both the Excel Input as well as the Text File Input step support the "Repeat" option in their fields section. Enabling this option will repeat the last value if the current one is null.

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•