Hitachi Vantara Pentaho Community Forums

Thread: Pass data between transformations in job view

  1. #1

    Pass data between transformations in job view

    Hi! I am trying to pass data between transformations in the job view. In short, I have two transformation steps: the first reads from a file, does some processing, and writes the result to a table; the second reads from that table, does some processing, and writes the result to another table. This is not efficient, so I want to structure my operations like this:

    step1: read from a file, do some processing, send data to step2.
    step2: receive data from step1, do some processing, write data to a table.

    (Obviously I can't merge these steps, because they can't run in parallel.)

    How can I do this?

    thanks
    Last edited by metalmilitia; 11-11-2008 at 11:08 AM. Reason: clarify

  2. Default

    Looks like you understand the limitations of transformations well.

    I'm not sure what improved performance you are expecting. If the transformation one has to complete its work before transformation two can start, there is only the work of saving the data from step one and reading it in step two that can be improved upon.

    I think you might have already chosen the best way to do that, by saving it into the database. That way you have plenty of room for overflow (an in-memory structure could burst with larger data sets) and also buffering, as the database should keep smaller sets of data buffered in memory.

    If possible, I'd optimize the connection to the database that holds the temp table between the transformations (run it on the same machine as Kettle).
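
    The staging-table handoff described above can be sketched in a few lines. This is only an illustration, not Kettle code: it assumes Python's built-in sqlite3 module as a stand-in for the real database, and the table names (staging, target), the column names, and the processing are made up for the example.

```python
import sqlite3


def stage_one(conn, lines):
    """First transformation: read lines, process them, write to a staging table."""
    conn.execute("CREATE TABLE IF NOT EXISTS staging (name TEXT, amount INTEGER)")
    rows = []
    for line in lines:
        name, amount = line.split(",")
        rows.append((name.strip(), int(amount)))
    conn.executemany("INSERT INTO staging (name, amount) VALUES (?, ?)", rows)
    conn.commit()  # the first transformation is completely done here


def stage_two(conn):
    """Second transformation: read the staging table, process, write to the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS target (name TEXT, amount INTEGER)")
    cursor = conn.execute("SELECT name, amount FROM staging ORDER BY rowid")
    processed = [(name.upper(), amount * 2) for name, amount in cursor]
    conn.executemany("INSERT INTO target (name, amount) VALUES (?, ?)", processed)
    conn.commit()
    return conn.execute("SELECT name, amount FROM target ORDER BY rowid").fetchall()


if __name__ == "__main__":
    # An in-memory database stands in for a database on the same machine.
    connection = sqlite3.connect(":memory:")
    stage_one(connection, ["alice, 1", "bob, 2"])
    print(stage_two(connection))
```

    The database takes care of spilling to disk and of caching, which is the overflow and buffering benefit mentioned above.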

    K<o>

  3. #3

    Default

    If you really can't restructure your transformations for parallelism, you could use a "Blocking Step" to hold all rows until all previous steps have finished processing. You will run into trouble if you have many records, though...
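
    To illustrate the idea of holding rows until the upstream work is finished, here is a small conceptual model using Python generators. It is not how Kettle implements its Blocking Step; the function names and the log are assumptions made for the example.

```python
def upstream(n, log):
    """Produce n rows one at a time, as a streaming step would."""
    for i in range(n):
        log.append(f"produced {i}")
        yield i


def blocking_step(rows, log):
    """Hold every incoming row in memory until upstream is exhausted."""
    held = list(rows)  # memory use grows with the number of rows
    log.append(f"released {len(held)} rows")
    yield from held


def downstream(rows, log):
    """Consume rows only after the blocking step releases them."""
    out = []
    for row in rows:
        log.append(f"consumed {row}")
        out.append(row * 10)
    return out


if __name__ == "__main__":
    events = []
    print(downstream(blocking_step(upstream(3, events), events), events))
    print(events)
```

    Because every row is held in a list before anything is released, memory use grows with the row count, which is the trouble with many records mentioned above.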

  4. #4

    Default

    Quote Originally Posted by PlanBForOpenOffice View Post
    Looks like you understand the limitations of transformations well.

    I'm not sure what improved performance you are expecting. If the transformation one has to complete its work before transformation two can start, there is only the work of saving the data from step one and reading it in step two that can be improved upon.
    The problem is that I have about 10 steps in sequence that read and write the same table (to simplify, I only talked about 2 steps) :-D

  5. #5

    Default

    Quote Originally Posted by metalmilitia View Post

    step1: read from a file, do some processing, send data to step2.
    step2: receive data from step1, do some processing, write data to a table.
    Have you tried:

    step1: read from a file, do some processing, copy rows to result
    step2: get rows from result, do some processing, write data to a table

    It has worked really well for me on many occasions.
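
    The handoff suggested above can be modeled roughly as follows. This is a conceptual sketch only, not the Kettle API: JobResult, the two transformation functions, and the field names are hypothetical stand-ins for the job's result rows and the Copy rows to result / Get rows from result steps.

```python
from typing import Dict, List

Row = Dict[str, object]


class JobResult:
    """Hypothetical stand-in for the row buffer a job carries between entries."""

    def __init__(self) -> None:
        self.rows: List[Row] = []


def transformation_one(lines: List[str], result: JobResult) -> None:
    """Read input lines, process them, then copy the rows to the result."""
    for line in lines:
        name, amount = line.split(",")
        # "Copy rows to result": rows stay in memory instead of going to a table.
        result.rows.append({"name": name.strip(), "amount": int(amount)})


def transformation_two(result: JobResult) -> List[Row]:
    """Get the rows from the result, process them, return what would be written."""
    output = []
    for row in result.rows:  # "Get rows from result"
        # The field names read here must match those produced by the first transformation.
        output.append({"name": row["name"].upper(), "amount": row["amount"] * 2})
    return output


def run_job(lines: List[str]) -> List[Row]:
    """Run the two transformations one after the other, as a job would."""
    result = JobResult()
    transformation_one(lines, result)  # finishes completely first
    return transformation_two(result)  # only then does the second one start


if __name__ == "__main__":
    print(run_job(["alice, 1", "bob, 2"]))
```

    In this model the rows never touch an intermediate table, but they are all held in memory between the two transformations.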

    Al.

  6. #6

    Default

    Quote Originally Posted by acbonnemaison View Post
    Have you tried:

    step1: read from a file, do some processing, copy rows to result
    step2: get rows from result, do some processing, write data to a table

    It has worked really well for me on many occasions.

    Al.
    This is the first thing I tried, but something goes wrong for me...

    In Copy rows to result I selected all fields using the fetch fields button.

    In Get rows from result I copied (Ctrl-C, Ctrl-V) all the fields from Copy rows to result,

    but it still doesn't work... Do I perhaps have to select some options?

    Thanks, everyone, for answering!

    :-D

  7. #7

    Default

    Quote Originally Posted by metalmilitia View Post
    ...but still doesn't work...
    :-D
    Ummm...the Copy Rows/Get Rows steps are pretty simple to use (I am on 3.1 GA).

    When you write "does not work", what do you mean? what error message do you get?

  8. #8

    Default

    Quote Originally Posted by acbonnemaison View Post
    Ummm...the Copy Rows/Get Rows steps are pretty simple to use (I am on 3.1 GA).

    When you write "does not work", what do you mean? what error message do you get?
    In step2, after Get rows from result, there is a JavaScript step that uses the result, but it fails with an error ("<a-variable> is not defined", where <a-variable> is a variable that I have defined).

  9. #9

    Default

    It works :-D I tried to build a simpler example and it works :-D



    thanks!

