Hitachi Vantara Pentaho Community Forums

Thread: Executing a KTR from a database or webservice

  1. #1


    Say, in a job, we want to execute a transformation.

    Instead of being in the repository or in a file, this transformation (which basically is a large XML string) is in a database. (Alternatively, the KTR XML could be looked up via a web service call.)

    I'm looking for a way to accomplish this. The only way I think could work is to pull the XML from a database (or web service), create a temporary file, pass that file location to the Execute Transformation step, and repeat this for every transformation.
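
    For reference, the temporary-file workaround described above could be sketched roughly as follows. This is a minimal sketch in plain Java, not Kettle code: the class name KtrTempFile, the "generated-" file prefix and the UTF-8 encoding are assumptions made for illustration. The returned path is what would be handed to the transformation entry.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not part of Kettle): stages KTR XML in a temporary file.
final class KtrTempFile {
    private KtrTempFile() {}

    /** Writes the XML to a new temporary file ending in .ktr and returns its path. */
    static Path write(String ktrXml) {
        try {
            Path file = Files.createTempFile("generated-", ".ktr");
            // Assumption: the generated XML is meant to be stored as UTF-8.
            Files.write(file, ktrXml.getBytes(StandardCharsets.UTF_8));
            file.toFile().deleteOnExit(); // best-effort cleanup when the JVM exits
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

    One file is created per transformation, which is exactly the overhead the in-stream approach would avoid.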

    Any other ideas?

    It would be nicer to solve this in the stream, without temporary files in between. Essentially, a step like "Execute row SQL script" but then for transformations.

    Background info: we're thinking of automating the creation of staging ETL in our data warehouse management tool (Quipu). We already generate all ETL SQL from templates. Theoretically, we could quite easily add a template that generates KTR XML for Kettle to load our staging area. But then we would also need a way to execute this in Kettle.
    Last edited by Nexus; 09-23-2010 at 09:04 AM.

  2. #2


    No ideas?

    Basically I'm looking to execute a transformation not from a file, not from a repository, but from a field in the stream.

  3. #3
    Join Date
    Nov 1999
    Posts
    9,729


    OK, so you have the transformation XML as a String.
    Executing that with a little bit of JavaScript should be easy to do.
    However, a User Defined Java Class might be more comfortable.
    Let me post a sample for you...

  4. #4
    Join Date
    Nov 1999
    Posts
    9,729


    OK, here is a UDJC sample that executes a transformation stored in a String:

    Execute a transformation stored in a String.ktr

    Code:
    import org.pentaho.di.core.xml.*;
    import org.pentaho.di.trans.*;
    import org.w3c.dom.*;
    
    public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException
    {
      Object[] r = getRow();
      if (r == null)
      {
        // No more input rows: signal that this step is done.
        setOutputDone();
        return false;
      }
    
      // Read the transformation XML from the input field "xml" (null if the field is missing).
      String ktr = getInputRowMeta().getString(r, "xml", null);
    
      // Parse the XML and locate the transformation node.
      Document doc = XMLHandler.loadXMLString(ktr);
      Node transNode = XMLHandler.getSubNode(doc, TransMeta.XML_TAG);
    
      // Build the transformation from the node (no repository) and start it.
      TransMeta transMeta = new TransMeta(transNode, null);
      Trans trans = new Trans(transMeta, getTrans());
      trans.execute(null);
    
      // Just pass the input row along.
      putRow(getInputRowMeta(), r);
    
      return true;
    }
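
    The sample above hands whatever is in the "xml" field straight to the parser. A cheap pre-check before building the TransMeta might look like the sketch below. It uses only the JDK's own XML parser rather than Kettle's XMLHandler; the class name KtrXmlCheck is hypothetical, and the check assumes the root element of a KTR is named "transformation" (the value of TransMeta.XML_TAG).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper (not part of Kettle): sanity-checks a KTR string.
final class KtrXmlCheck {
    private KtrXmlCheck() {}

    /** Returns true if the string parses as XML whose root element is <transformation>. */
    static boolean looksLikeKtr(String xml) {
        if (xml == null || xml.trim().isEmpty()) {
            return false;
        }
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            // Assumption: a KTR document has "transformation" as its root element.
            return root != null && "transformation".equals(root.getNodeName());
        } catch (Exception e) {
            // Malformed XML or parser setup failure: treat as "not a KTR".
            return false;
        }
    }
}
```

    Rows that fail the check could then be skipped or sent to an error stream instead of failing inside the TransMeta constructor.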

  5. #5


    Thanks, this is exactly what we are looking for.

    Going to try it out now...

  6. #6


    Is it true this User Defined Java Class doesn't inherit the available repository connections from the 'parent' transformation?

    We only use connection names in the generated KTR, so that when a connection with that name already exists in the repository, Kettle uses it. This works when manually executing the KTR as a file.

    Looks like the connections cannot be found when this KTR is executed from a UDJC.

    -edit-
    Reading this tutorial, section Accessing Database Connections...
    Last edited by Nexus; 10-08-2010 at 10:52 AM.

  7. #7
    Join Date
    Nov 1999
    Posts
    9,729


    Sorry, I didn't know that. You can pass the repository interface into the TransMeta constructor to get it done:

    Try this:

    Code:
    ...
    TransMeta transMeta = new TransMeta(transNode, getTrans().getRepository());
    ...

  8. #8
    Join Date
    Oct 2006
    Posts
    7


    Thanks, now it works. We're now querying the Quipu repository for the generated KTR XML and using your UDJC to kick off the transformations. Cool. The next step would be using the REST interface to get the generated KTRs. Thanks a lot for the quick help.

    Salud,
    JJ.

  9. #9


    Another question: it seems that the UDJC spawns several KTRs and runs them simultaneously.

    While this 'spawned' ETL is still running, the 'mother transformation' says it's completed.

    Is it possible to check whether all UDJC invoked transformations have finished?

    I'd like the mother transformation to finish only when all UDJC-invoked transformations have finished. Otherwise we'll have problems with our flow, since the transformations after this one expect the previous ones to have completed.

  10. #10
    Join Date
    Nov 1999
    Posts
    9,729


    Ah yes, you can add a "trans.waitUntilFinished();" call after the "trans.execute( null );" call.

    Good luck,
    Matt
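
    Putting the suggestions from posts #4, #7 and #10 together, the UDJC body would look roughly like the sketch below. It uses only the calls already shown in this thread and is a class-body fragment for the User Defined Java Class step, not a standalone program. It has not been run here.

```java
import org.pentaho.di.core.xml.*;
import org.pentaho.di.trans.*;
import org.w3c.dom.*;

public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException
{
  Object[] r = getRow();
  if (r == null)
  {
    setOutputDone();
    return false;
  }

  // Transformation XML arrives in the input field "xml".
  String ktr = getInputRowMeta().getString(r, "xml", null);
  Document doc = XMLHandler.loadXMLString(ktr);
  Node transNode = XMLHandler.getSubNode(doc, TransMeta.XML_TAG);

  // Post #7: pass the parent's repository so connection names can be resolved.
  TransMeta transMeta = new TransMeta(transNode, getTrans().getRepository());
  Trans trans = new Trans(transMeta, getTrans());
  trans.execute(null);

  // Post #10: block until the spawned transformation has finished.
  trans.waitUntilFinished();

  // Pass the input row along.
  putRow(getInputRowMeta(), r);
  return true;
}
```

    With the wait in place, each input row's transformation completes before the next row is read, so the spawned transformations run one after another rather than simultaneously.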
