Hitachi Vantara Pentaho Community Forums

Thread: Testing Pentaho Transformations

  1. #1
    Join Date
    Apr 2012
    Posts
    16

    We're starting to use Pentaho for quite a few things in our company, so we really need to set up a testing methodology for our various transformations. While I realize we can test entire transformations by creating fake endpoints with dummy data and assessing the results, we would also like a way to test the internal logic of transformations independently of the endpoints. After all, it's a lot of throwaway work to set up entire mock databases and web services to prove that five lines of JavaScript work.

    To give an example of how we'd like to test, let's say we have a transformation that:

    1. Gets some rows from a database
    2. Manipulates fields to form some XML
    3. Sends the XML to a web service
    4. Gets the response and analyses it with JavaScript
    5. Filters rows based on the content of a field formed in the JavaScript step (acts like an if statement)
    6. Executes a switch statement on rows passing the success condition of the filter
    - etc.

    We'd like to test 2, 5, and 6 without having to consider 1 and 3.

    So, is there some kind of testing framework for Pentaho that would let us either test a subset of the steps in a transformation, or let us test an entire transformation with some steps substituted with test logic?

    If not, is it possible (with some work) to load and run a single step without having to run a whole transformation (walking me through the basic overview of that would be appreciated)? For example, it might be nice to load the "Manipulates fields to form some xml" JavaScript step, give it some fake input, and assess the output. This would be similar to unit testing a single function with JUnit.
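
As a rough illustration of the "unit test a single step like a function" idea, here is a minimal sketch in plain Java. It does not use the Pentaho API at all: `XmlStepLogic`, the `id` and `name` fields, and the `<customer>` layout are all hypothetical stand-ins for whatever the real step does. It only shows the shape of the test: fake row in, row with a new field out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the "Manipulates fields to form some xml" step.
// The field names (id, name) and the <customer> layout are invented for illustration.
class XmlStepLogic {

    // Escapes the characters that are unsafe inside XML text and attribute values.
    static String escape(String value) {
        return value.replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;")
                    .replace("\"", "&quot;");
    }

    // Takes one input row and returns a copy of it with an added "xml" field.
    static Map<String, String> process(Map<String, String> row) {
        Map<String, String> out = new LinkedHashMap<>(row);
        String xml = "<customer id=\"" + escape(row.get("id")) + "\">"
                   + "<name>" + escape(row.get("name")) + "</name>"
                   + "</customer>";
        out.put("xml", xml);
        return out;
    }
}
```

A JUnit test would then build a small map by hand, call `XmlStepLogic.process`, and compare the `xml` field against the expected string, with no database or web service involved.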

    Any other ideas or input are appreciated, and thank you!

    -John Humphreys from Thomson Reuters

  2. #2
    Join Date
    Nov 2008
    Posts
    777

    As far as I know there is no testing framework. One thing you could do, though, is create parallel paths for certain areas and disable the "live" path when you want to run the "test" path. All you have to do is enable one hop and disable the other. For instance, at the beginning of your transformation you could have both a Table Input step and a Data Grid step connected to your step 2 above. In test mode, you just disable the Table Input hop and enable the Data Grid hop. I do this often on the back end of transformations by running the second-to-last step to an Excel Output step and a Table Output step in parallel. Once I'm satisfied with the rows being dumped to the spreadsheet, I turn off the Excel path and turn on the Table Output path. If at some point I have doubts about the rows being sent to the database, I can reverse course easily and go back to Excel.

    Other than that, you may need to modularize your transformation and coordinate it with a Job. That way you could kind of "plug and play" different transformations together. For instance, one that reads a database could be replaced with one that just generates test data.
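
The "plug and play" idea can be pictured with a small plain-Java sketch. This is only an analogy, not Pentaho code: `PluggablePipeline`, `testData`, and `transform` are hypothetical names, and the upper-casing stands in for whatever the downstream transformation really does. The point is that the downstream part only sees "a source of rows", so a database reader and a test-data generator are interchangeable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Supplier;

// Hypothetical model of "plug and play": the downstream logic only sees a row supplier,
// so a database reader can be swapped for a generator of test data.
class PluggablePipeline {

    // Stand-in for a transformation that generates test rows instead of reading a database.
    static Supplier<List<String>> testData() {
        return () -> {
            List<String> rows = new ArrayList<>();
            rows.add("alice");
            rows.add("bob");
            return rows;
        };
    }

    // Stand-in for the downstream transformation; here it just upper-cases each row.
    static List<String> transform(Supplier<List<String>> source) {
        List<String> out = new ArrayList<>();
        for (String row : source.get()) {
            out.add(row.toUpperCase(Locale.ROOT));
        }
        return out;
    }
}
```

In a Job, the equivalent swap would be pointing the first entry at a different transformation file while leaving the rest of the Job untouched.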
    Last edited by darrell.nelson; 08-30-2012 at 02:44 PM.
    pdi-ce-4.4.0-stable
    Java 1.7 (64 bit)
    MySQL 5.6 (64 bit)
    Windows 7 (64 bit)

  3. #3
    Join Date
    Apr 2012
    Posts
    16

    Thank you for the response.

    I like the idea regarding having test paths integrated, and we have modularized our transformations as much as possible in our jobs, so we could apply it fairly well (and we have in some areas actually).

    I was hoping we could get testing abilities similar to what you would have when writing ETL in a programming language (Java, for example), though. It's generally considered bad practice to have test code in with real/shipped code, and I think that's probably true with Pentaho too. It's too easy to accidentally click and enable a route, to have copy/distribute set wrong, or to forget to update a test path.

    I guess that's why I was wondering if it was possible to load and test individual steps in a transformation. It would be cool if I could load and test a JavaScript or Filter Rows step as if it were a function. Then I could use JUnit to verify everything as if it were real code.
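
To make "test a filter step as if it was a function" concrete, here is a minimal plain-Java sketch of steps 5 and 6 from the first post. Nothing here comes from the Pentaho API: `RoutingLogic`, the `status` and `type` fields, and the target names are all assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical model of steps 5 and 6: a Filter Rows condition followed by a switch.
// The field names ("status", "type") and target names are assumptions for illustration.
class RoutingLogic {

    // Filter Rows equivalent: true means the row follows the "success" hop.
    static final Predicate<Map<String, String>> IS_SUCCESS =
            row -> "OK".equals(row.get("status"));

    // Switch equivalent: picks a target name from the "type" field.
    static String route(Map<String, String> row) {
        String type = row.get("type");
        if (type == null) {
            return "default";
        }
        switch (type) {
            case "order":  return "orders";
            case "refund": return "refunds";
            default:       return "default";
        }
    }

    // Runs the filter, then the switch, and returns the chosen target for each passing row.
    static List<String> run(List<Map<String, String>> rows) {
        List<String> targets = new ArrayList<>();
        for (Map<String, String> row : rows) {
            if (IS_SUCCESS.test(row)) {
                targets.add(route(row));
            }
        }
        return targets;
    }
}
```

With the logic in this shape, each branch of the filter and the switch can be covered by an ordinary JUnit assertion on a hand-built row.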

  4. #4
    Join Date
    Nov 2008
    Posts
    777

    I actually don't think it would be that difficult to write a little Java code that goes in and tests individual steps, especially non-I/O steps. Perhaps one of the experts at Pentaho has some insight?

  5. #5
    Join Date
    Aug 2012
    Posts
    5

    Why not use a YAML Input step to load environment variables from a YAML text file, then use some Filter Rows (true/false) steps to drive the flow to either the prod or the testing steps?
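
The routing idea can be sketched in plain Java as follows. This is not the YAML Input step and not a real YAML parser: it hand-parses flat `key: value` lines only, and `EnvSwitch` and the `environment` key are hypothetical names chosen for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: reads flat "key: value" lines (a tiny subset of YAML, parsed by hand
// here, not with a YAML library) and uses one flag to choose the prod or test branch.
class EnvSwitch {

    // Parses "key: value" lines; blank lines, '#' comments, and lines without ':' are skipped.
    static Map<String, String> parse(String text) {
        Map<String, String> settings = new LinkedHashMap<>();
        for (String line : text.split("\n")) {
            String trimmed = line.trim();
            int colon = trimmed.indexOf(':');
            if (trimmed.isEmpty() || trimmed.startsWith("#") || colon < 0) {
                continue;
            }
            settings.put(trimmed.substring(0, colon).trim(),
                         trimmed.substring(colon + 1).trim());
        }
        return settings;
    }

    // Filter Rows equivalent: true sends rows down the test branch, false down prod.
    static boolean useTestBranch(Map<String, String> settings) {
        return "test".equalsIgnoreCase(settings.get("environment"));
    }
}
```

Note that a missing or misspelled `environment` key falls through to the prod branch in this sketch; whether that default is safe depends on what the prod branch writes to.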


    http://rhnh.net/2011/01/31/yaml-tutorial

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.