Hitachi Vantara Pentaho Community Forums
Page 1 of 2 12 LastLast
Results 1 to 10 of 14

Thread: Dynamic transaction process - is it possible?

  1. #1
    Join Date
    Aug 2008
    Posts
    563

    Default Dynamic transaction process - is it possible?

    Hello,
    I have built one transaction that basically reads 3 variables from a MySQL table, imports them, then some data manipulation follows and in the end the data is exported to another database.

    So, I want to duplicate this transaction for 20+ other tables without

    • having 20+ duplicates of the same transaction logic
    • creating a stored procedure on the database
    • using a union SQL statement on import


    Basically, I would like to have a lookup table with all the table names and then basically have loop that runs through it and creates the transactions based on a transaction template (this is what I am currently doing with PHP and MySQL). So basically, all that would change is the name of the importet table for each transaction. Can I do this in Kettle as well somehow?

    Also, the actual column names of the variables change slightly from table to table as well, i.e. (simplified):
    TableName, Custumerid
    table1, custid
    table2, userid
    table3, customerid
    ...

    So, actually, I am getting the same data from all the tables, just the actual column names are different (and the table name). So right now I have all this in a lookup table, and as said, a PHP loops through it and generates the SQL.

    How can I do this in Kettle?
    (I hope I described everything properly, if not, please let me know).
    Thanks,
    Diddy

  2. #2
    Join Date
    Apr 2008
    Posts
    4,690

    Default

    Try looking through the sample in $KETTLE_HOME/samples/process all tables/process all tables.kjb

    It does something similar to what you want to do.

    As long as you can define the "template transformation" as something that is common to all the tables, it should be possible to do.

  3. #3
    Join Date
    Aug 2008
    Posts
    563

    Default Got it working, but ...

    Hi. Thanks a lot for your answer! I got nearly everything working. But, in my case,

    1. I have to create the output table if it doesn't exist
    2. If the table already exists, I have to delete its contents.


    So in a non dynamic transformation I would just press the SQL button in the output dialog and copy the SQL and run it to create the table. But I don't want to do this manually each time.

    So in my dynamic version, it should do the following automatically:

    1. if the table doesn't exist, it should create it
    2. if the table already exists, it should delete all rows of the table (I think this works if I check the 'truncate table' option).

    So, my question is, how do I implement (1)? I tried to but an 'execute SQL script' step before the output, but it didn't work. then I moved it at the beginning of the transformation, and it didn't work as well. Do you have any suggestions?

    # This what I did with the SQL exection step:
    - I put the step after getting the table name variable (so before the 'process on table' step) in the 'process one table.kjb'. So the run time variable $(TABLENAME) should be availble.
    - The SQL for to SQL execution step looks like this below, but it didn't work:
    DROP TABLE IF EXISTS $(TABLENAME);
    CREATE TABLE $(TABLENAME)
    (
    date DATETIME,
    userid VARCHAR(45),
    operator VARCHAR(45),
    day_of_month INT(35),
    day_of_year INT(35),
    the_month INT(2),
    the_week INT(35),
    the_year INT(35)
    )
    ;

    I also tried to specify $(TABLENAME) as 'fieldname to be used as argument' and then changed the query (see below), but it didn't work either.
    DROP TABLE IF EXISTS ?;
    CREATE TABLE ?
    (
    date DATETIME,
    userid VARCHAR(45),
    operator VARCHAR(45),
    day_of_month INT(35),
    day_of_year INT(35),
    the_month INT(2),
    the_week INT(35),
    the_year INT(35)
    )
    ;

    Thanks a lot,
    Diddy
    Last edited by diddy; 03-12-2009 at 12:06 PM.

  4. #4
    Join Date
    Aug 2008
    Posts
    563

    Default Now it's working

    Ok, now I got it finally working . I forgot to switch on "variable substitution" in the 'execute SQL script' step.

  5. #5
    Join Date
    May 2006
    Posts
    4,882

    Default

    It's not supposed to be working the way you use it... so be careful ... behaviour may change

  6. #6
    Join Date
    Aug 2008
    Posts
    563

    Default What's the right thing to do then?

    Hi. Thanks for your answer! I was wondering what the right thing to do is then.

    I couldn't find a step that would create a table automatically. I would be nice to have this functionality in the export to database table step.

    Can you please point me to the right direction?

    Thanks a lot,
    Diddy

  7. #7
    Join Date
    May 2006
    Posts
    4,882

    Default

    Somewhere in JIRA there's a request for a real DDL step. Exec SQL step currently executes DDL, but that's just because the underlying JDBC allows it (probably it won't change, short term anyway).

    Regards,
    Sven

  8. #8
    Join Date
    Aug 2008
    Posts
    563

    Default

    Ok, thanks a lot Sven!

  9. #9
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    Sven and I together with the other developers refuse to give that information. (seriously) Too many people would shoot themselves in the foot. Generating the SQL itself is obviously very simple for us, adding a check-box would be trivial, but we simply don't do it.

    You could probably write around it in JavaScript, but again, it's a bad idea.

  10. #10
    Join Date
    Aug 2008
    Posts
    563

    Default

    Quote Originally Posted by MattCasters View Post
    Sven and I together with the other developers refuse to give that information. (seriously) Too many people would shoot themselves in the foot. Generating the SQL itself is obviously very simple for us, adding a check-box would be trivial, but we simply don't do it.

    You could probably write around it in JavaScript, but again, it's a bad idea.
    Ok, I am completely new to using an ETL tool. I have so far worked with PHP/MySQL or used SAS and SAS Macro, so I probably quite lack the detailed understanding of this ETL tool.
    I wanted to just replicate in Kettle what I wrote before in PHP/MySQL or SAS/SASmacro.
    I can understand that dropping tables can cause serious issues if it is done unintentionally, but if one has access to the database with all rights, well, then he or she should be aware of this anyway.
    If I have to use a workaround in Kettle to implement this step, well, I can live with it as well.
    I just would like to understand why this step is not integrated: Is it mainly for the reason to avoid unintentional dropping of tables?

    Overall I have to say that you all have done a impressive work so far and that I follow this project with great interest!
    Best regards,
    Diddy

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.