Hitachi Vantara Pentaho Community Forums

Thread: Reading in 100+ tables from database dynamically

  1. #1


    Hi guys,

    I've been racking my simple brain on this one and can't figure it out.

    Ultimately I am attempting to retrieve all the data from various AS400 libraries, which will then be imported into some other application by the vendor.

    Each library has a varying number of tables (369 in one case), and obviously each table has a varying number of (differently named) columns. I would like to do a mass export of all the data and replicate it in another form: MS Access, Excel, XML, CSV - pretty much any format that isn't hosted on a DB server - so the contracted developers can import the data into their system.

    There are only a few libraries that they need the data retrieved from, so I'm happy to do it a library at a time.


    To that end, I have attempted the following:

    Step 1 - Table Input:
    SELECT DISTINCT TABLE_NAME, TABLE_SCHEMA FROM qsys2.syscolumns WHERE table_schema = 'SOME_TABLE_SCHEMA'

    Step 2 - Modified Java Script Value:
    // Push the current row's table and schema into variables,
    // scoped to the root job ("r").
    setVariable("tblName", TABLE_NAME, "r");
    setVariable("schName", TABLE_SCHEMA, "r");

    Step 3 - Table Input
    SELECT * FROM "${schName}"."${tblName}" FETCH FIRST 10 ROW ONLY


    I'm then presented with the following error:

    Code:
    Unexpected error
     org.pentaho.di.core.exception.KettleDatabaseException: 
     An error occurred executing SQL in part [P Set values]:
     SELECT * from "MFDBF25"."QDDSSRC" FETCH FIRST 10 ROW ONLY
     
     offending row : [TABLE_NAME String(128)], [TABLE_SCHEMA String(128)]
     
     Error setting value #1 [TABLE_NAME String(128)] on prepared statement (String)
     java.lang.NullPointerException
         at java.lang.Thread.run (null:-1)
         at org.pentaho.di.trans.step.RunThread.run (RunThread.java:50)
         at org.pentaho.di.trans.steps.tableinput.TableInput.processRow (TableInput.java:143)
         at org.pentaho.di.trans.steps.tableinput.TableInput.doQuery (TableInput.java:235)
         at org.pentaho.di.core.database.Database.openQuery (Database.java:1800)
         at org.pentaho.di.core.database.Database.setValues (Database.java:871)
         at org.pentaho.di.core.database.Database.setValues (Database.java:1126)
         at org.pentaho.di.core.database.Database.setValue (Database.java:978)
         at sun.jdbc.odbc.JdbcOdbcPreparedStatement.setString (null:-1)
         at sun.jdbc.odbc.JdbcOdbcPreparedStatement.setChar (null:-1)
         at sun.jdbc.odbc.JdbcOdbcPreparedStatement.clearParameter (null:-1)
    I'm assuming this is because the JavaScript variables won't update for each row coming through?

    Any thoughts on how I can modify this? Or ideas on a completely separate approach if needed?

    [EDIT]
    Assuming I get the above working, I'll also have the problem of how to output all that data. I won't know what all the field names will be either, so is there a way to dynamically output all the field data with appropriate headers?
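    (I gather a Text file output step will write whatever fields arrive if its Fields tab is simply left empty - guessing at the config here, so correct me if I'm wrong:)

    Code:
    Text file output:
      Filename : ${schName}_${tblName}     (illustrative naming)
      Extension: csv
      Header   : [x]  writes the incoming field names as the first row
      Fields   : leave the grid empty -> all incoming fields written as-is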


    Thanks in advance everyone.

    - Dwayne
    Last edited by dhughes; 03-28-2014 at 01:26 AM.

  2. #2

    You can't define and use a variable in the same transformation, so make it two and put them in a job.
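    A rough sketch of the layout (names illustrative):

    Code:
    Job: dump_library.kjb
      START -> T1 -> T2     (T2 job entry set to "Execute for every input row")

    T1.ktr: Table Input (list the tables from qsys2.syscolumns)
              -> Copy Rows To Result
    T2.ktr: Table Input (SELECT * FROM "${schName}"."${tblName}" ...)
              -> output step of your choice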
    So long, and thanks for all the fish.

  3. #3


    Thanks marabu.

    I've given it a go, but I'm not sure how to go about iterating through the script for each returned row.

    For example, based upon the Steps written above:
    - For each row returned from Step 1, I would want the script to declare the variable, then have that variable used in Step 3 to retrieve data.
    - Rinse and repeat for each row that comes out of Step 1.

    I know I'll need to have Steps 1 and 3 in separate Transformations, but I'm really not sure where to put the script, so that it is iterated through as required.

    Edit: Unless there's some fancy way to provide the table names as input rows that I'm unaware of? Everything seems to say to use a script.

    Thoughts?

  4. #4

    So your first transformation, T1, would consist of the graph (Get Table Names, Copy Rows To Result).
    T2 would sport the familiar Table Input and whatever output step you deem adequate.
    The real stuff is in the configuration settings of the T2 job entry.
    Don't forget to define the parameters in T2.

    [Attachment: T2.png - screenshot of the T2 job entry settings (18.2 KB)]
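    In words, roughly (the labels vary a little between PDI versions):

    Code:
    T2 job entry (Transformation dialog):
      Options   : [x] Execute for every input row
      Parameters: map the result-row fields to the named parameters, e.g.
                    schName <- TABLE_SCHEMA
                    tblName <- TABLE_NAME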
    So long, and thanks for all the fish.

  5. #5


    Right. However, I still need to define the variables somewhere, right?

    So if I'm using the following SQL in T2's Table Input:

    Code:
    SELECT * FROM "${schName}"."${tblName}" FETCH FIRST 10 ROW ONLY

    I need to be able to define ${schName} and ${tblName} somewhere, and those variables need to change for every row coming out of T1. However, if I define them in T2 (which executes for every incoming row), then I can't use them in the Table Input step within T2 - at least, that's the way I understand it.

  6. #6

    Your variables really are parameters in this scenario.
    Open the transformation settings with Ctrl-T and enter the variable names on the Parameters tab.
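    Something like this (default values and descriptions are optional):

    Code:
    Transformation settings (Ctrl-T) -> Parameters tab:
      Parameter   Default value   Description
      schName                     library/schema to read from
      tblName                     table to read from

    The parameter mapping in the T2 job entry then feeds these for every result row.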
    So long, and thanks for all the fish.

  7. #7


    marabu... You are a sanity-saver, my friend.
    I keep forgetting about small things like this that make my job that much easier. It didn't even click when you were leading into it earlier - so thank you for the reminder!
