Hitachi Vantara Pentaho Community Forums

Thread: Does Kettle have a Unit Testing Framework for ETL, like JUnit for Java?

  1. #1

    Does Kettle have a Unit Testing Framework for ETL, like JUnit for Java?

    Hello everybody,

    I have been looking for a unit test framework or tool for testing transformation and job designs, but I did not find one =(. I looked everywhere!

    I would appreciate it if someone could help me.


    Thanks in advance,

    Eduardo

  2. #2
    Join Date
    Nov 1999
    Posts
    9,729

    Eduardo, we do have a number of things that we test:

    http://source.pentaho.org/svnkettler...rg/pentaho/di/

    The closest thing to a framework is the black box test set. It reads http://source.pentaho.org/svnkettler...iles/blackbox/, processes all the transformations there, and matches the results against golden data.

    Good luck,
    Matt

  3. #3

    As Matt pointed out, when you are dealing with data transformations against external data sources, testing is more a matter of the black box approach: a test input file and an expected result file.

    The only thing you can do 'easily' is take a transformation or a job, replace all the external data sources with equivalent CSV files, run it, and, for a 'unit' test approach, 'diff' the resulting CSV file against an already prepared expected CSV file.

    This is similar to the black box tests Matt linked above, and no, there is no easier way: someone has to make a test data set, someone has to make the expected outcome data set, and then a 'diff' of the two helps evaluate success or failure.
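
    For illustration, the 'diff' step could look roughly like the sketch below. It is plain Java with no Kettle dependency, it is not the code in BlackBoxTests.java, and the class and method names (CsvDiff, diff, diffFiles) are made up for this example. It assumes the transformation writes rows in a deterministic order, so a line-by-line comparison is meaningful.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: compares an expected ("golden") CSV with the CSV a run produced.
final class CsvDiff {

    /** Compares two lists of lines position by position and describes every mismatch. */
    static List<String> diff(List<String> expected, List<String> actual) {
        List<String> mismatches = new ArrayList<>();
        int max = Math.max(expected.size(), actual.size());
        for (int i = 0; i < max; i++) {
            String e = i < expected.size() ? expected.get(i) : null;
            String a = i < actual.size() ? actual.get(i) : null;
            if (e == null) {
                mismatches.add("line " + (i + 1) + ": unexpected extra row: " + a);
            } else if (a == null) {
                mismatches.add("line " + (i + 1) + ": missing row: " + e);
            } else if (!e.equals(a)) {
                mismatches.add("line " + (i + 1) + ": expected [" + e + "] but got [" + a + "]");
            }
        }
        return mismatches;
    }

    /** Reads both files fully into memory, then compares them line by line. */
    static List<String> diffFiles(String expectedPath, String actualPath) throws IOException {
        return diff(Files.readAllLines(Paths.get(expectedPath)),
                    Files.readAllLines(Paths.get(actualPath)));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java CsvDiff.java <expected.csv> <actual.csv>");
            System.exit(2);
        }
        List<String> mismatches = diffFiles(args[0], args[1]);
        mismatches.forEach(System.out::println);
        // A non-zero exit code lets a build script or a job treat a mismatch as a failure.
        System.exit(mismatches.isEmpty() ? 0 : 1);
    }
}
```

    If the row order is not guaranteed, sorting both files first (or comparing by key) would be needed before a comparison like this makes sense.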

  4. #4
    Join Date
    Nov 1999
    Posts
    9,729

    Please note that as part of the KFF project, we created a "Table Compare" step that allows you to compare two relational database tables. It can be used to compare data between the test and the production system. However, you could also compare against golden data if you set things up correctly.
    Either way, you are looking at extra work in the ETL design phase to get your unit tests set up correctly.
    This works best for reusable components like mappings and simple parameterized transformations.

  5. #5

    Thank you

    Thanks very much for your help,

    I will try to use that.


    Eduardo

  6. #6
    Join Date
    Mar 2011
    Posts
    1

    Hello,

    As Matt pointed out, we used the framework for black box testing of our transformations. The BlackBoxTests.java file can be used when the output is a CSV file, which can be compared with the expected output file (CSV) stored in the same location.

    However, some of our transformations write their output to a database. Is there any way we can connect to the database from the same Java code (BlackBoxTests.java) and compare the tables?

    Thanks,
    Maulik

  7. #7
    Join Date
    Nov 1999
    Posts
    9,729

    Comparing two database tables isn't as easy as it seems. Detecting missing and new rows is especially tricky.
    However, the Table Compare step mentioned above does that and more. Perhaps you can simply parameterize a transformation with that step and execute it after the black box test.
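
    To show why missing and new rows need separate handling, here is a rough in-memory sketch of a keyed comparison. It is not how the Table Compare step is implemented. The class KeyedRowCompare and its methods are hypothetical, and it assumes each table has a single key column and that the rows have already been loaded into maps from key to field values.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: classifies differences between a reference table and a compare table.
final class KeyedRowCompare {

    static final class Result {
        final Set<String> missing = new TreeSet<>(); // keys only in the reference table
        final Set<String> added = new TreeSet<>();   // keys only in the compare table
        final Set<String> changed = new TreeSet<>(); // keys in both, with different values

        boolean isClean() {
            return missing.isEmpty() && added.isEmpty() && changed.isEmpty();
        }
    }

    static Result compare(Map<String, List<String>> reference,
                          Map<String, List<String>> actual) {
        Result result = new Result();
        // First pass: every reference key must exist in the compare table with equal values.
        for (Map.Entry<String, List<String>> row : reference.entrySet()) {
            if (!actual.containsKey(row.getKey())) {
                result.missing.add(row.getKey());
            } else if (!row.getValue().equals(actual.get(row.getKey()))) {
                result.changed.add(row.getKey());
            }
        }
        // Second pass: keys that only the compare table has are new rows.
        for (String key : actual.keySet()) {
            if (!reference.containsKey(key)) {
                result.added.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> reference = new HashMap<>();
        reference.put("1", Arrays.asList("Alice", "10"));
        reference.put("2", Arrays.asList("Bob", "20"));
        reference.put("3", Arrays.asList("Carol", "30"));

        Map<String, List<String>> actual = new HashMap<>();
        actual.put("1", Arrays.asList("Alice", "10"));
        actual.put("2", Arrays.asList("Bob", "25"));
        actual.put("4", Arrays.asList("Dave", "40"));

        Result r = compare(reference, actual);
        // prints: missing=[3] new=[4] changed=[2]
        System.out.println("missing=" + r.missing + " new=" + r.added + " changed=" + r.changed);
    }
}
```

    A single pass over one table would catch changed and missing rows but never the new ones, which is why the second loop is there. For real tables, the comparison would normally be done in the database or in a transformation rather than by loading everything into memory.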


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.