Hitachi Vantara Pentaho Community Forums

Thread: Denormalize Using Repeating Headers

  1. #1
    Join Date
    Nov 2007
    Posts
    11

    Denormalize Using Repeating Headers

    Hey all, I've searched in vain for a similar post but didn't see anything. Here's my scenario:

    input file:

    pass plays.attempted.received.yards.td
    smith......4.........3........38....1
    johnson....9.........5........45....0
    run plays.yardage.td
    lewis.....15......0
    roberts...44......2
    douglas...80......3
    receive plays.catches.yards.td
    harper........7.......56....1
    taylor........3.......45....1


    As you can see, the data is structured almost like XML or a copybook, but it doesn't have tags.

    Ideally the output might be like this:

    play.....player..attempted.received.yards.td
    pass.....smith...4.........3........38....1
    pass.....johnson.9.........5........45....0
    run......lewis......................15....0
    run......roberts....................44....2
    run......douglas....................80....3
    receive..harper............7........56....1
    receive..taylor............3........45....1


    Short of some pretty convoluted variable logic, I couldn't come up with a good way. Anyone have any ideas?

    Thanks in advance for any input!

    Ben.

  2. #2
    Join Date
    May 2006
    Posts
    4,882

    I would opt for some pre-processing outside of Kettle.
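As an illustration of what such pre-processing could look like, here is a minimal sketch in Python. It is not from the thread: it assumes the fields really are separated by runs of dots as shown in post #1 (the real file may use spaces), that a header line is any line whose first field ends in " plays", and that `yardage` and `catches` map onto `yards` and `received` the way the desired output suggests. The names `denormalize`, `ALIASES`, and `OUTPUT_COLUMNS` are made up for this example.

```python
import re

# Column names as they appear in the desired output from post #1.
OUTPUT_COLUMNS = ["play", "player", "attempted", "received", "yards", "td"]

# Assumption: section-specific header names map onto the shared output
# columns the way the desired output suggests.
ALIASES = {"yardage": "yards", "catches": "received"}


def denormalize(lines):
    """Turn repeating-header sections into one flat list of row dicts."""
    rows = []
    play, columns = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Assumption: fields are separated by runs of dots, as in the post.
        fields = re.split(r"\.+", line)
        if fields[0].endswith(" plays"):
            # Header line: remember the play type and this section's columns.
            play = fields[0][: -len(" plays")]
            columns = [ALIASES.get(name, name) for name in fields[1:]]
            continue
        if play is None:
            raise ValueError(f"data row before any header: {line!r}")
        # Data line: start from empty columns, then fill in what we have.
        row = dict.fromkeys(OUTPUT_COLUMNS, "")
        row["play"], row["player"] = play, fields[0]
        row.update(zip(columns, fields[1:]))
        rows.append(row)
    return rows


SAMPLE = """\
pass plays.attempted.received.yards.td
smith......4.........3........38....1
johnson....9.........5........45....0
run plays.yardage.td
lewis.....15......0
roberts...44......2
douglas...80......3
receive plays.catches.yards.td
harper........7.......56....1
taylor........3.......45....1
"""

if __name__ == "__main__":
    # Write the flattened rows as comma-separated text for a normal text input.
    for row in denormalize(SAMPLE.splitlines()):
        print(",".join(row[name] for name in OUTPUT_COLUMNS))
```

The comma-separated output could then be read by an ordinary delimited text input, with the empty strings standing in for the columns a section does not have.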

    Regards,
    Sven

  3. #3
    Join Date
    Nov 2007
    Posts
    11

    Is there a way to pass a variable from row to row?

    Something like:

    1) Check whether the row starts with "pass", "run", or "receive".
    2) If so, pass that value on to the next row.
    3) Keep doing so until a new "pass", "run", or "receive" row is reached, then pass that one on, and so on.

    If I could carry this from row to row, I could make it work...
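The three steps above amount to a carry-forward over the rows. A rough sketch of just that idea, written in Python rather than as a Kettle step, might look like the following. `tag_rows` and `GROUP_WORDS` are hypothetical names, and the header test (first space-delimited word is one of the three group names) is an assumption based on the sample lines.

```python
GROUP_WORDS = ("pass", "run", "receive")


def tag_rows(lines):
    """Yield (group, line) pairs, carrying the last seen group forward."""
    current = None  # the value that is "passed on" from row to row
    for line in lines:
        # Assumption: only header lines contain a space after the first word.
        first_word = line.split(" ", 1)[0]
        if first_word in GROUP_WORDS:
            current = first_word  # steps 1 and 3: a new group starts here
            continue              # the header itself is not a data row
        yield current, line       # step 2: reuse the remembered group
```

The remembered value lives in an ordinary local variable that survives from one loop iteration to the next, which is the behaviour the question is asking for.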


    Thanks,
    Ben.

  4. #4
    Join Date
    May 2006
    Posts
    4,882

    With JavaScript you can do things like that, but then you have to read each line of the text input as one large field, ...

    It would end up as a very brittle and error-prone job.

    Regards,
    Sven

  5. #5

    Looks like fixed-width processing to me:

    pass plays.attempted.received.yards.td
    smith......4.........3........38....1
    johnson....9.........5........45....0

    Column widths: 11|10|9|6|1

    But that assumes these are separate files rather than one big file, and it appears to be one big file.

    If it is one big file, then it's what Sven said: reading each line in as a single input string and parsing it with JavaScript is pretty much it.
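To make the fixed-width reading concrete, here is a small Python sketch that cuts one line of the pass section at the 11|10|9|6|1 widths. It assumes the padding character is the dot shown in the post (the real file may well use spaces); `slice_fixed` and `PASS_WIDTHS` are names invented for the example.

```python
from itertools import accumulate

# Widths from the post for the "pass plays" section: 11|10|9|6|1.
PASS_WIDTHS = [11, 10, 9, 6, 1]


def slice_fixed(line, widths, pad="."):
    """Cut a line at fixed column widths and strip the padding character."""
    ends = list(accumulate(widths))  # running totals give each column's end
    starts = [0] + ends[:-1]         # each column starts where the last ended
    return [line[s:e].strip(pad) for s, e in zip(starts, ends)]


if __name__ == "__main__":
    print(slice_fixed("smith......4.........3........38....1", PASS_WIDTHS))
```

Each section would need its own widths (the run lines in the sample line up on 10|8|1, for instance), which is why a single fixed-width definition does not fit the combined file.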

  6. #6
    Join Date
    Nov 2007
    Posts
    11

    Yes, you are correct, it is one file with all repeating groups.

    I have a solution crafted but can't quite get it to function correctly.

    In this solution I'm trying to pass a variable from one row to the next but it's not passing as I expected.

    The logic is (attached also):

    Bring the environment variable xyz into the field hold_value.

    Pass hold_value into a JavaScript step.

    Test whether a certain field is null:

    If it is null, reset the environment variable to a value to be used in the next row.

    If it is not null, use the last hold_value to set an output field.

    I guess I'm not understanding environment variables correctly, because the value is not passing to the next row. I can load the variable in the transformation start screen, but it is not being reset at the test, so nothing new reaches the next row.
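The attachment is not shown, so the following is only a guess at the shape of the logic. It models a per-row script as a method that is called once per row and keeps `hold_value` in the object itself instead of in an environment variable. The row layout (`name`, `stat`) and the class name `HoldValue` are hypothetical, and the sketch is in Python rather than Kettle's JavaScript.

```python
class HoldValue:
    """Keeps the last header value between per-row calls."""

    def __init__(self, initial=None):
        self.hold_value = initial

    def process(self, row):
        # Hypothetical row layout: {"name": ..., "stat": ...}. A header row
        # is assumed to be one whose "stat" field is null (None here).
        if row["stat"] is None:
            self.hold_value = row["name"]  # reset for the rows that follow
            return None                    # nothing to emit for a header
        # Data row: attach the value remembered from the last header.
        return {**row, "play": self.hold_value}
```

The point of the sketch is that the value changes between rows only because it is stored somewhere that outlives a single call. Whether a Kettle variable behaves that way inside a running transformation is exactly what is in doubt here; a variable declared in the script itself would be the closer analogue.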

    Any input is much appreciated as I'm banging my head against the monitor at the moment.

    Thanks,
    Ben.
    Last edited by bisenhour; 12-24-2007 at 06:30 PM.
