Hitachi Vantara Pentaho Community Forums
Results 1 to 8 of 8

Thread: How to use same transformation with different parameters, then append streams?

  1. #1

    Question [SOLVED] How to use same transformation with diff. parameters, then append streams?

    For a BI project, I'm trying to pull data from several independent source databases and merge the data into one single target database. The source databases are structurally identical, only their content is different.

    Currently, I use ONE transformation which performs all steps in one shot (I attached this file * AppendingTransformations.pdf * with a screenshot on page 1 "Current Situation"). I set up the transformation for one database (accessing multiple tables with their corresponding connection settings). Then I made a copy of the operators, and changed the connection settings so they refer to another database (i. e. the settings for the "Table input" operator).

    This is already a bit cumbersome, but the worst part is this: The project is "work in progress", and whenever I need to make a change (i. e. add a calculation, or pull another field), I need to make that change for each database from where I'm pulling, which is very hard to maintain.

    To solve that, I'm trying to achieve this (see the drawing (not working) on page 2 in the pdf, "Goal"): I'd like to create one and the same transformation (not copies) each time I need to access a different database, and only change the paramaters (variables?) for the connection settings, then append the different streams, and write it into the target database.

    This way I would only need to maintain one version of the transformation, and any change would be reflected in all instances of that transformation.

    Would somebody be able to put me on the right track?

    Thanks a lot in advance!
    Last edited by MacPhotoBiker; 01-08-2018 at 11:58 AM. Reason: Marked as "solved"

  2. #2
    Join Date
    Apr 2008


    You're absolutely on the right track!

    Build ONE transformation (either DB1 or DB3) but in the transformation properties (Press Ctrl-T or double click on the canvas) define parameters such as DB_HOST, DB_USER, DB_PASSWORD. Put the corresponding variables into your DB configuration screen: Parameter DB_HOST -> in the Host Name field of the connection screen, put ${DB_HOST} (Note: you can put the password in the password spot as ${DB_PASSWORD} even though you can't read it!)
    Once you've done all that, test this transformation using the GUI, and supplying the another DB host, user, and password in the parameter values on the Run Options screen.

    When that's working, set up another transformation, with a Data Grid, connected to a transformation executor. In the data grid, list all your hosts, with the correct user and passwords, then on the configuration of the transformation executor, map those columns to the parameters defined. The process will now run sequentially through your databases when you run the DG->TE transform. I have heard (I've never tested it!) that you can set "Number of copies to execute" on the TE step and have multiple copies running at once -- just make sure that the data is distributed not copied from the DG to the TE. If the data is copied, then you'll run each database multiple times.
    Last edited by gutlez; 01-05-2018 at 02:18 PM.

  3. #3


    Hi gutlez,

    Thank you very much for your reply! Let me start with a "confession": Once I posted my question here (my first question in this forum), I realized that there's a new forum, so I posted it there ( Sorry about that, I did not mean to double post.

    Are you using the new forum as well? If so, would you mind if I paste your reply there?

    I can't work on it right now, but it looks very helpful, and I'll certainly check that out over the weekend.

    Thank you very much, I really appreciate it.


  4. #4
    Join Date
    Jun 2012


    You could just add a link to point here ...
    So long, and thanks for all the fish.

  5. #5


    Hi marabu,

    Thanks for the suggestion. Since the forum here is the old one, I thought it makes sense to keep the communication together in the NEW forum, so readers there would not need to go somewhere else to get the entire thread. But nonetheless, a friendly user had already provided a link there to the post here

  6. #6


    Hi gutlez,

    Thank you very much again for your excellent description, it works like a charm! And the "trick" with the data grid will save me a tremendouns amount of work, so truly, thank you very much!

    May I ask you a related question? Does the Transformation Executor have to be a final step, or can it be part of a transformation? What I'd like to achieve is this: Let's say I'm iterating through 5 databases, pulling for example an address field for thousands of clients and appending them into one long list. Currently I'm able to write those appended results (e. g. to an Excel file) as long as the "Get Table" and the output operator are part of the same transformation. However, instead of writing it somewhere, I'd like to keep processing the data (for example handling null values). Is it possible to use a TE just a s a regular operator? I'm able to connect other operators, but I'm not getting any output from the TE.

    Thanks again!

  7. #7
    Join Date
    Jun 2012


    Doesn't the Wiki page on the Transformation-Executor answer your question?

  8. #8


    Hi marabu,

    Thank you very much for providing the link. Indeed, that answered my question, all I needed to do is to add the "Copy rows to result" operator. (I had read the Wiki page though, but I hadn't connected the dots).

    Much appreciated!

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.