Hitachi Vantara Pentaho Community Forums

Thread: How to unit test mapping transformations

  1. #1

    Default How to unit test mapping transformations

    Hey all,
    I have a mapping transformation (http://kettle.pentaho.org/tips/?tip=4) and would like to run unit tests on it through Java code.

    However, I'm not sure how to create the 'incoming datastream' before running the transformation, nor how to check the values of the 'outgoing datastream'.

    Any ideas/samples please?

    Thanks - the target is Kettle 2.5.0/2.5.1, but comments covering 3.0 would also be helpful.

    -D

  2. #2
    Join Date
    May 2006
    Posts
    4,882

    Default

    That's a lot of effort for a unit test. Can't you run the mapping on a known input and then use a gold-image approach or something similar?

    Even that can be implemented in Java if required, e.g. by stripping down the code of Pan and adding some extra logic.

    Regards,
    Sven

  3. #3

    Default

    I do not understand "gold image approach", so you lost me there.

    The intent is to test the transformation (which will be used in many other jobs) in some fashion, and it only works on datastreams - so, obviously, the idea is to feed different datastreams to the same transformation to get different results and validate them.

    I suppose it could make sense (just thinking out loud; I'd need to test it) to have a 'unit-data-incoming' transformation and a 'unit-data-outgoing' transformation that read/write the datastreams from/to text files, a database, etc. Something more dynamic would be better, though, so that many different types of data can be loaded and column names/data types defined.

    But then, at that point, you would need to wrap it all in a job -- or could the transformations be daisy-chained in the unit test?

    I'm just trying to find more elegant ways (even if it is a lot of work up-front) to unit-test transformations. The tip example shows the challenge; the actual transformations to be tested are much more complex, but they do have inputs/outputs that make it relatively easy to check the results.

  4. #4
    Join Date
    May 2006
    Posts
    4,882

    Default

    Gold image approach... you create a specific input and run it through whatever you want to test. You verify the output once and call that output the "golden image". The next time you run the application/transformation, you compare the new output against the golden image. If the new output and the golden image are the same, it's OK; otherwise it's not.
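    In plain Java, that check can be a simple line-by-line file comparison. Below is a minimal sketch (the `GoldenImageCheck` class and its method are invented for illustration; they are not part of Kettle):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class GoldenImageCheck {

    // Returns true if the freshly produced output matches the stored golden image.
    public static boolean matchesGoldenImage(Path actual, Path golden) throws IOException {
        List<String> actualLines = Files.readAllLines(actual);
        List<String> goldenLines = Files.readAllLines(golden);
        // Any difference in line count or content means the transformation's
        // behaviour changed and the test should fail.
        return actualLines.equals(goldenLines);
    }

    public static void main(String[] args) throws IOException {
        // Typical usage: run the transformation so it writes its output file,
        // then compare that file against the golden image kept with the test data.
        Path golden = Path.of(args[0]);
        Path actual = Path.of(args[1]);
        System.out.println(matchesGoldenImage(actual, golden) ? "PASS" : "FAIL");
    }
}
```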

    How to integrate it depends on how far you want to go. E.g., in the test directory of the Kettle sources there are examples that build complete transformations through the Java interface; you could combine that with loading your mapping and mix it into your unit test. But it's not going to be a 1-2-3 job to create these kinds of tests.

    Regards,
    Sven

  5. #5

    Default

    Gold Image Approach = OCR answer files = rosetta file = 'expected result' comparison, gotcha!

    I started with the tests in the Kettle sources, but they are targeted at specific steps, and all the ones I saw swapped database connections back and forth (and did not look like they checked the actual results, only functionality). There used to be UseCases, but I'm not sure where they went (and /trunk also has the comparison testing between 2.5 and 3.0, so I'm trying to stay aware of both).

    Hmm, OK, so this sets a new precedent: using unit testing to test solutions -- not the Kettle code itself, but a user of Kettle having a way to unit-test their solutions. If I find a good way, I will post it.

  6. #6

    Default

    Continuing this thread in case anyone else reads it in the future - an example, followed by a question.

    //===== test passing arguments (command line 1, command line 2, etc.) =====
    // transMeta is assumed to have been loaded earlier, e.g. from the mapping's .ktr file.
    String[] arguments = new String[] { "c:/ztest.csv" };

    // OK, now run this transformation.
    be.ibridge.kettle.trans.Trans trans = new be.ibridge.kettle.trans.Trans(
            be.ibridge.kettle.core.LogWriter.getInstance(), transMeta);

    if (!trans.prepareExecution(arguments)) {
        System.out.println("failed to prepare for execution");
    }

    trans.startThreads();
    trans.waitUntilFinished();
    //===== end test passing arguments =====



    Question - how do I pass VARIABLES through Java code so that they can be read by a transformation? I was looking at TransConfiguration/TransExecutionConfiguration, but 1) I'm not sure that is the correct approach, and 2) if it is, I need an example of how to actually use those when running the transformation.

    thanky,
    -D

  7. #7
    Join Date
    May 2006
    Posts
    4,882

    Default

    Look at the first 3 lines of the main method of Kitchen in 2.5 (but only valid up to 2.5.x).

    In 3.0 you can use VariableSpace to inject variables into a job/transformation.
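    For intuition about what the injected variables do: Kettle substitutes `${VAR}` placeholders into step settings at runtime. Here is a plain-Java illustration of that substitution (this is not the Kettle API; the `VariableSubstitution` class below is invented for the example):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VariableSubstitution {

    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces every ${NAME} occurrence with its value from the map,
    // leaving unknown variables untouched.
    public static String substitute(String text, Map<String, String> variables) {
        Matcher m = VAR.matcher(text);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String value = variables.get(m.group(1));
            m.appendReplacement(out, Matcher.quoteReplacement(
                    value != null ? value : m.group(0)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> vars = new HashMap<>();
        vars.put("INPUT_DIR", "c:/data");
        // A file name a transformation step might carry in its settings:
        System.out.println(substitute("${INPUT_DIR}/ztest.csv", vars)); // prints c:/data/ztest.csv
    }
}
```

    Injecting a variable before execution, as VariableSpace allows in 3.0, amounts to adding an entry to such a map before the step settings are resolved.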

    Regards,
    Sven

  8. #8

    Default

    Here's a quick TransformationRunner class to make unit testing easier (and a start toward something better).

    I couldn't figure out how to pass datastreams in, but I was able to capture the datastream outputs.

    The license is LGPL, since I forgot to note it in the code :-P

    This is for the Kettle 2.5.x series only; it will not work with 3.0.
    Attached Files


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.