Hitachi Vantara Pentaho Community Forums

Thread: JSON Input doesn't work correctly when embedded

  1. #1
    Join Date
    Feb 2017
    Posts
    14

    Default JSON Input doesn't work correctly when embedded

    When I try to run a transformation with the JSONInput step in an embedded Java application, it doesn't function correctly. It looks like the step is parsed but never actually executed. It works fine in Spoon. Operating system: macOS 10.9.5.

    I created a new test transformation with just the JSONInput step and set it to raise an error if no files are found (unchecked the "Do not raise an error if no files" option). When I run this transformation from Spoon it correctly throws an error:

    Code:
    2017/02/20 15:12:57 - JSON Input.0 - ERROR (version 7.0.0.0-25, build 1 from 2016-11-05 15.35.36 by buildguy) : No file(s) specified! Stop processing.
    But when I run it from within a simple test Java app I get:

    Code:
    2017/02/20 15:06:51 - test - Dispatching started for transformation [test]
    2017/02/20 15:06:51 - test - Nr of arguments detected:0 
    2017/02/20 15:06:51 - test - This is not a replay transformation
    2017/02/20 15:06:51 - test - I found 2 different steps to launch.
    2017/02/20 15:06:51 - test - Allocating rowsets...
    2017/02/20 15:06:51 - test -  Allocating rowsets for step 0 --> JSON Input
    2017/02/20 15:06:51 - test -   prevcopies = 1, nextcopies=1
    2017/02/20 15:06:51 - test - Transformation allocated new rowset [JSON Input.0 - Copy rows to result.0]
    2017/02/20 15:06:51 - test -  Allocated 1 rowsets for step 0 --> JSON Input  
    2017/02/20 15:06:51 - test -  Allocating rowsets for step 1 --> Copy rows to result
    2017/02/20 15:06:51 - test -  Allocated 1 rowsets for step 1 --> Copy rows to result  
    2017/02/20 15:06:51 - test - Allocating Steps & StepData...
    2017/02/20 15:06:51 - test -  Transformation is about to allocate step [JSON Input] of type [JsonInput]
    2017/02/20 15:06:51 - JSON Input.0 - distribution activated
    2017/02/20 15:06:51 - JSON Input.0 - Starting allocation of buffers & new threads...
    2017/02/20 15:06:51 - JSON Input.0 - Step info: nrinput=0 nroutput=1
    2017/02/20 15:06:51 - JSON Input.0 - output rel. is  1:1
    2017/02/20 15:06:51 - JSON Input.0 - Found output rowset [JSON Input.0 - Copy rows to result.0]
    2017/02/20 15:06:51 - JSON Input.0 - Finished dispatching
    2017/02/20 15:06:51 - test -  Transformation has allocated a new step: [JSON Input].0
    2017/02/20 15:06:51 - test -  Transformation is about to allocate step [Copy rows to result] of type [RowsToResult]
    2017/02/20 15:06:51 - Copy rows to result.0 - distribution activated
    2017/02/20 15:06:51 - Copy rows to result.0 - Starting allocation of buffers & new threads...
    2017/02/20 15:06:51 - Copy rows to result.0 - Step info: nrinput=1 nroutput=0
    2017/02/20 15:06:51 - Copy rows to result.0 - Got previous step from [Copy rows to result] #0 --> JSON Input
    2017/02/20 15:06:51 - Copy rows to result.0 - input rel is 1:1
    2017/02/20 15:06:51 - Copy rows to result.0 - Found input rowset [JSON Input.0 - Copy rows to result.0]
    2017/02/20 15:06:51 - Copy rows to result.0 - Finished dispatching
    2017/02/20 15:06:51 - test -  Transformation has allocated a new step: [Copy rows to result].0
    2017/02/20 15:06:51 - test - This transformation can be replayed with replay date: 2017/02/20 15:06:51
    2017/02/20 15:06:51 - test - Initialising 2 steps...
    2017/02/20 15:06:51 - JSON Input.0 - Released server socket on port 0
    2017/02/20 15:06:51 - Copy rows to result.0 - Released server socket on port 0
    2017/02/20 15:06:51 - test - Step [JSON Input.0] initialized flawlessly.
    2017/02/20 15:06:51 - test - Step [Copy rows to result.0] initialized flawlessly.
    2017/02/20 15:06:51 - Copy rows to result.0 - Starting to run...
    2017/02/20 15:06:51 - JSON Input.0 - Starting to run...
    2017/02/20 15:06:51 - JSON Input.0 - Finished processing (I=0, O=0, R=0, W=0, U=0, E=0)
    2017/02/20 15:06:51 - test - Transformation has allocated 2 threads and 1 rowsets.
    2017/02/20 15:06:51 - Copy rows to result.0 - Finished processing (I=0, O=0, R=0, W=0, U=0, E=0)
    When I call getXml() I get the wrong output for this step, while other transformations serialize fine:

    Code:
      <step>
        <name>JSON</name>
        <type>JsonInput</type>
        <description/>
        <distribute>N</distribute>
        <custom_distribution/>
        <copies>1</copies>
        <partitioning>
          <method>none</method>
          <schema_name/>
        </partitioning>
        <cluster_schema/>
        <remotesteps>
          <input>
          </input>
          <output>
          </output>
        </remotesteps>
        <GUI>
          <xloc>192</xloc>
          <yloc>144</yloc>
          <draw>Y</draw>
        </GUI>
      </step>
    I used the code for embedding PDI from the samples:

    Code:
    /*! ******************************************************************************
    *
    * Pentaho Data Integration
    *
    * Copyright (C) 2002-2013 by Pentaho : http://www.pentaho.com
    *
    *******************************************************************************
    *
    * Licensed under the Apache License, Version 2.0 (the "License");
    * you may not use this file except in compliance with
    * the License. You may obtain a copy of the License at
    *
    *    http://www.apache.org/licenses/LICENSE-2.0
    *
    * Unless required by applicable law or agreed to in writing, software
    * distributed under the License is distributed on an "AS IS" BASIS,
    * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    * See the License for the specific language governing permissions and
    * limitations under the License.
    *
    ******************************************************************************/
    
    
    import org.pentaho.di.core.KettleEnvironment;
    import org.pentaho.di.core.Result;
    import org.pentaho.di.core.exception.KettleException;
    import org.pentaho.di.core.logging.KettleLogStore;
    import org.pentaho.di.core.logging.LogLevel;
    import org.pentaho.di.core.logging.LoggingBuffer;
    import org.pentaho.di.repository.Repository;
    import org.pentaho.di.trans.Trans;
    import org.pentaho.di.trans.TransMeta;
    
    
    /**
     * This class demonstrates how to load and execute a PDI transformation.
     * It covers loading from both file system and repositories, 
     * as well as setting parameters prior to execution, and evaluating
     * the result.
     */
    public class RunningTransformations {
    
    
        
        public static RunningTransformations instance; 
        
        /**
         * @param args not used
         */
        public static void main(String[] args) {
    
    
            // Kettle Environment must always be initialized first when using PDI
            // It bootstraps the PDI engine by loading settings, appropriate plugins etc.
            try {
                KettleEnvironment.init();
            } catch (KettleException e) {
                e.printStackTrace();
                return;
            }
            
            // Create an instance of this demo class for convenience
            instance = new RunningTransformations();
            
            // run a transformation from the file system
            Trans trans = instance.runTransformationFromFileSystem("/Users/tkaszuba/SCM/closeit/ifrs9-impairment-etl/code/transform/test.ktr");
            
            // retrieve logging appender
            LoggingBuffer appender = KettleLogStore.getAppender();
            // retrieve logging lines for job
            String logText = appender.getBuffer(trans.getLogChannelId(), false).toString();
    
    
            // report on logged lines
            System.out.println("************************************************************************************************");
            System.out.println("LOG REPORT: Transformation generated the following log lines:\n");
            System.out.println(logText);
            System.out.println("END OF LOG REPORT");
            System.out.println("************************************************************************************************");
        
        }
    
    
        /**
         * This method executes a transformation defined in a ktr file
         * 
         * It demonstrates the following:
         * 
         * - Loading a transformation definition from a ktr file
         * - Setting named parameters for the transformation
         * - Setting the log level of the transformation
         * - Executing the transformation, waiting for it to finish
         * - Examining the result of the transformation
         * 
         * @param filename the file containing the transformation to execute (ktr file)
         * @return the transformation that was executed, or null if there was an error
         */
        public Trans runTransformationFromFileSystem(String filename) {
            
            try {
                System.out.println("***************************************************************************************");
                System.out.println("Attempting to run transformation "+filename+" from file system");
                System.out.println("***************************************************************************************\n");
                // Loading the transformation file from file system into the TransMeta object.
                // The TransMeta object is the programmatic representation of a transformation definition.
                TransMeta transMeta = new TransMeta(filename, (Repository) null);
                
                // Creating a transformation object which is the programmatic representation of a transformation 
                // A transformation object can be executed, report success, etc.
                Trans transformation = new Trans(transMeta);
                
                // adjust the log level
                transformation.setLogLevel(LogLevel.DETAILED);
    
    
                System.out.println("\nStarting transformation");
                
                // starting the transformation, which will execute asynchronously
                transformation.execute(new String[0]);
                
                // waiting for the transformation to finish
                transformation.waitUntilFinished();
                
                // retrieve the result object, which captures the success of the transformation
                Result result = transformation.getResult();
    
    
                // report on the outcome of the transformation
                String outcome = "\nTrans "+ filename +" executed "+(result.getNrErrors() == 0?"successfully":"with "+result.getNrErrors()+" errors");
                System.out.println(outcome);
                
                return transformation;
    
    
            } catch (Exception e) {
                
                // something went wrong, just log and return 
                e.printStackTrace();
                return null;
            } 
            
        }
        
    }
    Other transformations work, so am I forgetting something, or is this a bug? I can't imagine that nobody has ever tried to embed a JSONInput transformation.

    p.s.: I didn't set KETTLE_HOME since there are no parameters defined. It doesn't work with KETTLE_HOME properly set either.
    Last edited by tkaszuba; 02-20-2017 at 05:09 PM.

  2. #2
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    Strange.

    What if you put in a transMeta.setDoNotFailIfNoFile(false) directly after creating transMeta?
    So long, and thanks for all the fish.

  3. #3
    Join Date
    Feb 2017
    Posts
    14

    Default

    Thanks for the hint, but I figured out what is happening. It turns out that when you call
    Code:
    KettleEnvironment.init()
    it only registers "native" plugins, and JsonInput is not considered native even though it ships with the pentaho-kettle repository. Furthermore, instead of throwing an exception when the plugin is not found, PDI creates a "MissingTrans" (basically a dummy placeholder step), which explains the strange getXml() output.

    Code:
         StepMeta stepMeta = new StepMeta( stepnode, databases, metaStore );
         stepMeta.setParentTransMeta( this ); // for tracing, retain hierarchy

         if ( stepMeta.isMissing() ) {
            addMissingTrans( (MissingTrans) stepMeta.getStepMetaInterface() );
         }
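    Since the root cause is simply that the plugin never got registered, one workaround is to point PDI's plugin scanner at the plugins folder of a PDI installation before calling KettleEnvironment.init(). A minimal sketch; KETTLE_PLUGIN_BASE_FOLDERS is the system property the plugin type scanner reads for extra plugin folders, but the path below is hypothetical and must match your installation:

    ```java
    import org.pentaho.di.core.KettleEnvironment;

    public class InitWithPluginFolder {
        public static void main(String[] args) throws Exception {
            // Hypothetical path: point this at <your-pdi-install>/plugins so
            // that non-native plugins such as JsonInput get scanned and
            // registered alongside the native ones.
            System.setProperty("KETTLE_PLUGIN_BASE_FOLDERS",
                "/opt/data-integration/plugins");

            // init() now picks up the extra plugin folder as well.
            KettleEnvironment.init();
        }
    }
    ```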
    This behavior is probably documented somewhere (I assume I didn't look hard enough), but one sentence about it in the embedding-PDI sample would have saved me about a day of work. It would also be nice to have a parameter specifying whether an exception should be thrown when a plugin is not found, instead of silently creating a dummy step.
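    Until something like that exists, you can fail fast yourself right after loading the transformation. A sketch using the stepMeta.isMissing() flag shown in the source above (class and method names as quoted there; the helper name and error message are mine):

    ```java
    import org.pentaho.di.core.exception.KettleException;
    import org.pentaho.di.repository.Repository;
    import org.pentaho.di.trans.TransMeta;
    import org.pentaho.di.trans.step.StepMeta;

    public class MissingStepGuard {
        // Load a ktr and throw instead of silently running a dummy step.
        public static TransMeta loadOrFail(String filename) throws KettleException {
            TransMeta transMeta = new TransMeta(filename, (Repository) null);
            // isMissing() is set when no plugin was registered for the step,
            // in which case PDI substituted a MissingTrans placeholder.
            for (StepMeta stepMeta : transMeta.getSteps()) {
                if (stepMeta.isMissing()) {
                    throw new KettleException(
                        "No plugin registered for step: " + stepMeta.getName());
                }
            }
            return transMeta;
        }
    }
    ```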

    https://help.pentaho.com/Documentation/7.0/0R0/0V0/020

    Quote Originally Posted by marabu View Post
    Strange.

    What if you put in a transMeta.setDoNotFailIfNoFile(false) directly after creating transMeta?

  4. #4
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    Thanks for coming back to tell us.

    BTW: You can help to improve documentation by creating a Jira case.
    So long, and thanks for all the fish.
