Hitachi Vantara Pentaho Community Forums

Thread: Forcing execution order of sub-transformations

  1. #1
    Join Date
    Mar 2007
    Posts
    9

    Forcing execution order of sub-transformations

    I have a transformation that contains several sub-transformations in sequence, for example: Table Input-->Sub Transformation 1-->Sub Transformation 2-->Sub Transformation 3-->Table Output. I want Sub Transformation 2 to start only after Sub Transformation 1 has finished processing all the rows. Similarly, I want Sub Transformation 3 to start only after Sub Transformation 2 has finished processing all the rows. I tried various techniques using "Block until steps finish", "Blocking Step", etc., but without success. How do I achieve this in PDI 5.0? Is it possible at all in PDI?

  2. #2
    Join Date
    Apr 2008
    Posts
    1,771

    It's better if you use a JOB and connect those transformations within the same job.
    -- Mick --

  3. #3
    Join Date
    Mar 2007
    Posts
    9

    Yes, I thought about using a Job calling Transformations instead of a Transformation calling Sub-Transformations. A few issues forced me into the Transformation calling Sub-Transformations approach. 1) I need the entire transformation to run in a single database transaction. I could not make the entire Job transactional, since that feature is not available in the Community Edition of PDI 5.0. 2) The order of execution of the transformations is important to me. So I turned my transformations into sub-transformations, to be called from the main transformation in sequence. This way, my entire transformation will be in one transaction. Now the problem is forcing the execution order of the sub-transformations.

  4. #4
    Join Date
    Jul 2009
    Posts
    476

    This is a hack, but if you don't have a large number of rows going through your transformation, you could put "Sort rows" steps between your sub-transformations. "Sort rows" will wait until all of your rows finish the prior step, so the next sub-transformation won't start until the previous one finished. However, if you have a large number of rows, this could slow down your processing significantly.

  5. #5
    Join Date
    Apr 2007
    Posts
    2,010

    Don't use "Sort rows"; use the Blocking Step to achieve this (although it still spools to disk, so it will still be slow).
    The question is: why didn't the Blocking Step do what you want? It's definitely the step to use.
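
    To make the idea concrete, here is a minimal Python sketch of what a blocking step does conceptually. It is not PDI code: the names `source`, `blocking_step`, `sink`, and `run_pipeline` are invented for illustration, and the threads and queues only stand in for PDI steps and hops. The point is that the blocking stage buffers every row and releases nothing downstream until the upstream stage signals that it is done.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of a row stream


def source(out_q, rows, log):
    """Stands in for an upstream step that emits rows one by one."""
    for row in rows:
        log.append(("source emits", row))
        out_q.put(row)
    out_q.put(DONE)


def blocking_step(in_q, out_q, log):
    """Buffer all rows; pass them on only after upstream has finished."""
    buffered = []
    while True:
        row = in_q.get()
        if row is DONE:
            break
        buffered.append(row)
    log.append(("blocking step releases", len(buffered)))
    for row in buffered:
        out_q.put(row)
    out_q.put(DONE)


def sink(in_q, log):
    """Stands in for the downstream step that must not start early."""
    while True:
        row = in_q.get()
        if row is DONE:
            break
        log.append(("downstream receives", row))


def run_pipeline(rows):
    """Run the three stages concurrently and return the event log."""
    log = []
    q1, q2 = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=source, args=(q1, rows, log)),
        threading.Thread(target=blocking_step, args=(q1, q2, log)),
        threading.Thread(target=sink, args=(q2, log)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    for event in run_pipeline([1, 2, 3]):
        print(event)
```

    Even though all three threads start together, the downstream stage sees no row until the source has emitted everything. The cost is the one mentioned above: every row has to be held somewhere (memory or disk) in the meantime.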

  6. #6
    Join Date
    Mar 2007
    Posts
    9

    Thanks for all the replies. I think the Blocking Step is the right solution. I came across the same solution in the "Pentaho Kettle Solutions" book, though in the context of a simple step (it does not specifically mention sub-transformations). It seems to work for the simple use cases I tried with sub-transformations. I am still not sure about one case: if Sub Transformation 2 contains an input step of its own, such as "Table Input", will that Table Input step also wait until all the rows have been processed in the previous step (Sub Transformation 1)? I could not verify this. Any pointers?

  7. #7
    Join Date
    Apr 2008
    Posts
    1,771

    If you have 2 transformations in sequence, maybe you can use "Copy rows to result" and "Get rows from result"?
    -- Mick --

  8. #8
    Join Date
    Jul 2009
    Posts
    476

    Originally posted by rameshsr:
    Still I am not sure if a Sub Transformation 2 has a "Input" step, such as "Table Input", whether this Table Input step also will wait until all the rows are processed in the previous step (Sub Transformation 1). This I could not verify. Any pointers?
    A "sub-transformation" in PDI is called from the main transformation in a "Mapping (sub-transformation)" step, which is located under the Mapping folder in the Design tab. The sub-transformation has a "Mapping input specification" step at the beginning, where the rows are received from the main transformation, and a "Mapping output specification" step at the end, where rows are sent back to the main transformation.

    When you say you want to use "sub-transformations," it's not clear whether you need PDI sub-transformations, or if you really want to create a PDI job that calls several independent transformations. If your second "sub-transformation" uses a Table Input step as its row source, then you probably should create a PDI job and separate the transformations. If your two "sub-transformations" operate on exactly the same pipeline of rows, then you might string them together as sub-transformations in a single main transformation.

    There is documentation for the steps here: http://wiki.pentaho.com/display/EAI/...egration+Steps

    There are examples of many different kinds of steps and transformations under the "samples" folder of your PDI installation.

    (You may already know about these things. I see that you joined this forum in March 2007, but you've only posted here 5 times, so I'm including this info in case it's helpful.)

  9. #9
    Join Date
    Apr 2008
    Posts
    4,696

    Originally posted by rameshsr:
    Still I am not sure if a Sub Transformation 2 has a "Input" step, such as "Table Input", whether this Table Input step also will wait until all the rows are processed in the previous step (Sub Transformation 1). This I could not verify. Any pointers?
    I'm not 100% certain on your last question, but my understanding is that it will execute as soon as it can, even though it's in a Mapping block.

    The same is true if you put all the steps in one transformation and use blocking steps to control the flow.
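
    Here is a small Python sketch of why that would happen, assuming my understanding above is right and every step is started together in its own thread. It is not PDI code; `upstream_step` and `independent_input` are made-up stand-ins for Sub Transformation 1 and for a Table Input that has no incoming hop.

```python
import threading


def run_demo():
    """Start both 'steps' together and record the order in which they finish."""
    log = []
    upstream_may_finish = threading.Event()

    def upstream_step():
        # Stands in for Sub Transformation 1: it is held until released.
        upstream_may_finish.wait()
        log.append("upstream finished")

    def independent_input():
        # Stands in for an input step with no incoming rows to wait for,
        # so nothing upstream can hold it back.
        log.append("independent input ran")

    t_up = threading.Thread(target=upstream_step)
    t_in = threading.Thread(target=independent_input)
    t_up.start()
    t_in.start()               # both threads are started together
    t_in.join()                # the input completes while upstream is still held
    upstream_may_finish.set()  # only now is the upstream step allowed to finish
    t_up.join()
    return log


if __name__ == "__main__":
    print(run_demo())
```

    A blocking step only holds back rows that flow through it. A step that generates its own rows has nothing in front of it to block on.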

    Question: What happens if you build it out using staging tables instead, and then make the last step in the job an "INSERT INTO ProdTable SELECT * FROM StagingTable" for each staging table? You could easily wrap that in a transaction, and could likely even build better recovery steps that way.
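
    A rough sketch of that idea, using Python's built-in `sqlite3` module only so that it runs anywhere. The function name `promote_staging` and the table names are placeholders, and a real PDI job would do this in an SQL job entry against your own database. Every staging table is copied into its production table inside one transaction, so a failure in any copy rolls all of them back.

```python
import sqlite3


def promote_staging(conn, table_pairs):
    """Copy each staging table into its production table in one transaction.

    table_pairs is a list of (staging_table, prod_table) names. The names are
    interpolated into the SQL, so they must come from trusted configuration.
    """
    with conn:  # commits if the block succeeds, rolls back if it raises
        for staging, prod in table_pairs:
            conn.execute(f"INSERT INTO {prod} SELECT * FROM {staging}")


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE staging_orders (id INTEGER PRIMARY KEY, item TEXT);
        CREATE TABLE prod_orders (id INTEGER PRIMARY KEY, item TEXT);
        INSERT INTO staging_orders VALUES (1, 'widget');
        INSERT INTO staging_orders VALUES (2, 'gadget');
        """
    )
    promote_staging(conn, [("staging_orders", "prod_orders")])
    print(conn.execute("SELECT COUNT(*) FROM prod_orders").fetchone()[0])
```

    The transformations can then run in whatever order the job enforces, writing only to staging tables, and the production tables are touched in a single all-or-nothing step at the end.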


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.