Hi,
I am pretty new to Kettle and have just started playing with it. I have a CSV file with EmpID, Status, and Date fields (there are other fields that don't matter). For each EmpID, I want to get only those rows in which Status changes from DEACT to ACTIV.
E.g. (assuming the file has been sorted on Date):
EmpID Status Date
1 DEACT 2010/05/09
2 DEACT 2011/06/11
3 DEACT 2011/06/11
4 ACTIV 2012/01/30
2 ACTIV 2012/03/05 *
2 ACTIV 2012/06/01
2 DEACT 2012/06/02
3 DEACT 2012/07/01
4 ACTIV 2012/07/10 *
1 DEACT 2012/07/30
2 ACTIV 2012/07/30 *
The desired output is the rows marked with *. Any suggestions will be much appreciated.
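To make the rule concrete, here is a minimal sketch of the logic in Python (not Kettle code; the function name `reactivated_rows` and the in-memory tuple format are assumptions for illustration only). It reads the rule strictly: a row is kept only when the previous status seen for the same EmpID was DEACT and the current status is ACTIV, so an ACTIV row that follows another ACTIV row for the same EmpID is not selected.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical row shape for this sketch: (EmpID, Status, Date)
Row = Tuple[int, str, str]


def reactivated_rows(rows: Iterable[Row]) -> List[Row]:
    """Return rows where an EmpID's status goes from DEACT to ACTIV.

    Assumes the rows are already sorted by date, as in the example above.
    """
    last_status: Dict[int, str] = {}  # most recent status seen per EmpID
    selected: List[Row] = []
    for emp_id, status, date in rows:
        # Keep the row only on a DEACT -> ACTIV transition for this EmpID.
        if last_status.get(emp_id) == "DEACT" and status == "ACTIV":
            selected.append((emp_id, status, date))
        last_status[emp_id] = status
    return selected


sample: List[Row] = [
    (1, "DEACT", "2010/05/09"),
    (2, "DEACT", "2011/06/11"),
    (3, "DEACT", "2011/06/11"),
    (4, "ACTIV", "2012/01/30"),
    (2, "ACTIV", "2012/03/05"),
    (2, "ACTIV", "2012/06/01"),
    (2, "DEACT", "2012/06/02"),
    (3, "DEACT", "2012/07/01"),
    (4, "ACTIV", "2012/07/10"),
    (1, "DEACT", "2012/07/30"),
    (2, "ACTIV", "2012/07/30"),
]

for row in reactivated_rows(sample):
    print(row)
```

The sketch keeps one "last seen status" per EmpID, so the input only needs to be sorted by date, not grouped by EmpID.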
Thanks in advance.

