Hitachi Vantara Pentaho Community Forums
Results 1 to 12 of 12

Thread: Variables

  1. #1

    Default Variables

    Hi I need to do following operation and was wondering if there is a solution in kettle.

    step 1
    select ${dynamic_field} from ${dynamic_table}

    ---->

    step 2
    rename ${dynamic_field} to ${dynamic_variable2}


    --->

    step 3
    insert/update ${dynamic_variable2} to ${dynamic_table2}



    Step 1 is not a problem in Kettle and I could set the variable for table in insert update also. But I don't see any way to use variables in select and insert/update steps.

    Renaming row elements in javascript is also looking difficult. Javascript step is badly documented and does not describe the available functions.

    Any help would be greatly appreciated.

    -am
    Last edited by ashutosh_malaviya; 08-15-2008 at 08:32 PM.

  2. #2
    Join Date
    May 2006
    Posts
    4,882

    Default

    With 1 you will get away... 2 you get away with some javascript fiddling... forget 3, not possible, no short term plans that I'm aware of building it in (and personally I'm pretty much against it).

    In any case such solutions always seem so generic to me... what's your case for "needing" such flexibility.

    Regards,
    Sven

  3. #3

    Default

    I have 1200 columns those have to be transformed into 400 output columns and i should write all this manually with 400 transformations. And they are so repetitive.

    So i am trying to solve a general transformation which takes inputs from the high level job. i am looking at talend and oracle tools for alternate solutions. i thought kettle is flexible but apparently not enough.

    I am solving this with "sql script execution step" which looks like a moot thing to do. I can do that outside also with java, ruby or perl.
    Last edited by ashutosh_malaviya; 08-17-2008 at 02:53 PM.

  4. #4
    Join Date
    May 2006
    Posts
    4,882

    Default

    I mentioned it before, and will probably in the future again... PDI is not a golden hammer, and not all things out there are nails

    For your particular problem, I doubt you will find solutions in any packaged tool. Mind-reading is not yet a standard feature on ETL-tools, even if everything is variable sensitive in PDI you would still need to fill those variables upfront.

    If it's really repetitive, the closest you will probably come (outside of PDI) is something to generate scripts (in e.g. SQL*Plus, ...) but then you're weighing the cost of doing it manually against development cost.

    Regards,
    Sven

    BTW I don't buy your numbers ... 1400 transformations for 1200 columns... you have very strange transformations.

  5. #5

    Default

    I finally wrote a ruby program to create kettle ETL transformations.
    I am only bemused by your funny answers and comments.
    It reminds me of Seinfeld's soup-Nazi. You cant get "flexibility", live with it.

    Anyway, thanks for your efforts. Overall Kettle is a good tool and with its API lot of things can be done.

    best regards,
    -ashutosh

  6. #6

    Default repetitive nature of transformations

    This should demonstrate the need for variables. And it is not a very special need.

    Actual transformation statistics:

    stage 1:
    6000 columns to 1200 columns
    10 million rows
    1200 transformations
    unique type of transformations = 5

    1200 statistical transformations
    unique type of transformations = 1

    stage 2:

    1200 columns to 400 columns
    400 transformations
    unique type of kettle transformations = 29
    400 statistical transformations
    unique type of transformations =1

    summary

    Overall number of transformations = 3200
    unique type of transformations = 36

    So, right now i am doing this with ruby's regular expression and xml processing. These transformations are very slow in execution that is my next problem to solve with your help :-).

  7. #7
    Join Date
    May 2006
    Posts
    4,882

    Default

    No soup for you ... seriously... you can't do 100% of the cases you can imagine in any tool. Yours is a pretty special case, it would be pretty difficult to implement "variable sensitivity" in the insert/update step... and 99.9% of the other people wouldn't be interested in it.

    Regards,
    Sven

  8. #8
    DEinspanjer Guest

    Default

    I'd be happy to try to help you figure out a better way of doing it, but I think I'd need a more concrete example / test case.

    Are these 6000 column names changing regularly? What sort of aggregations are you doing?

    Do you think you could take this beginning of a testcase with just 26 columns and give me some ideas of how to set up your problem space with it so we can see if there is a better way to solve it?
    Attached Files Attached Files

  9. #9

    Default

    Thanks, sounds good. I will submit a problem case with 26 columns by tomorrow. If we can reuse just one transformation then we have a solution.

    regards,
    -ashutosh

  10. #10
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    And Sven, just for reference, "Soup Nazi" is my nickname, I want it back.

  11. #11
    Join Date
    May 2006
    Posts
    4,882

    Default

    Lol ... I did not claim it , you can have it back ... apparently we're "inflexible" people

  12. #12
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    apparently we're "inflexible" people
    In the contexts of feature creep, project management, etc, that is actually something to be proud of.

    Cheers,
    Matt

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.