Hitachi Vantara Pentaho Community Forums

Thread: ETL-Testing

  1. #1
    Join Date
    May 2008
    Posts
    9

    Question ETL-Testing

    Hi,

    I have two questions about testing my ETL process (validating that data is transformed correctly, completeness tests, integration testing, performance and scalability, regression testing, unit tests and so on):

    1. Is there any standard for testing an ETL-process?

    2. Or is there any other collection of techniques for checking my ETL results?

    I have been searching for a while, but I have found only short descriptions (on websites or in whitepapers) of several recommendations, not a complete concept.

    It would be great if anyone could make some helpful suggestions.

    greetings,
    tkkg

  2. #2
    Join Date
    May 2006
    Posts
    4,882

    Default

    The general answer is no: there is no standard. And if you want to build testing in automatically, it will normally take a huge amount of time.

    What usually happens is that people run the ETL on a copy of the production database (or a slice of that copy) and then have someone verify the results.

    So the trick is mostly to be able to run the ETL against different databases, e.g. by using variables in the connections.
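
    The idea of variables in the connections can be sketched roughly as follows. This is a minimal Python illustration, not Pentaho configuration: the variable names (DB_HOST, DB_PORT, DB_NAME), the host names and the JDBC-style URL are all assumptions made up for the example.

```python
from string import Template

# Hypothetical connection template; the ${...} placeholders stand in for
# the connection variables (names are assumptions, not Pentaho defaults).
CONNECTION_TEMPLATE = Template("jdbc:mysql://${DB_HOST}:${DB_PORT}/${DB_NAME}")

# Hypothetical variable sets, one per environment.
ENVIRONMENTS = {
    "test": {"DB_HOST": "test-db.example", "DB_PORT": "3306", "DB_NAME": "dwh_copy"},
    "prod": {"DB_HOST": "prod-db.example", "DB_PORT": "3306", "DB_NAME": "dwh"},
}


def connection_url(environment: str) -> str:
    """Resolve the connection URL for one environment.

    Template.substitute raises KeyError if a variable is missing,
    so an incomplete variable set fails loudly instead of silently.
    """
    return CONNECTION_TEMPLATE.substitute(ENVIRONMENTS[environment])


if __name__ == "__main__":
    # The same ETL definition can now point at the copy instead of production.
    print(connection_url("test"))
```

    The point is only that the ETL definition stays the same and the target database is chosen by the variable set.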

    Regards,
    Sven

  3. #3
    Join Date
    Oct 2008
    Posts
    22

    Default

    For sure, this is primarily a manual effort, but you can still document your test scripts and prepare companion SQL scripts that can be run repeatedly and return their results in a log format.

    Test scripts can be developed to
    - validate slowly changing data policies
    - confirm source-to-target field-level transformations for initial and incremental runs

    The main thing is to find a representative subset of data that will expose the variations in the data.
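
    A rough sketch of such companion SQL checks, using Python and an in-memory SQLite database so it runs on its own. The table names (src_customer, dim_customer), the is_current flag and the upper-casing rule are assumptions invented for the example; real checks depend on your own model.

```python
import sqlite3

# Hypothetical checks: each query returns the number of offending rows,
# so 0 means the check passed.
CHECKS = {
    # Slowly changing dimension (type 2): exactly one current row per key.
    "scd2_one_current_row_per_key": """
        SELECT COUNT(*) FROM (
            SELECT customer_id FROM dim_customer
            GROUP BY customer_id
            HAVING SUM(is_current) <> 1
        )
    """,
    # Source-to-target field rule: target name is the upper-cased source name.
    "name_is_upper_cased": """
        SELECT COUNT(*) FROM src_customer s
        JOIN dim_customer d
          ON d.customer_id = s.customer_id AND d.is_current = 1
        WHERE d.name <> UPPER(s.name)
    """,
}


def run_checks(conn):
    """Run every check and return one log line per check."""
    results = []
    for name, sql in CHECKS.items():
        violations = conn.execute(sql).fetchone()[0]
        status = "PASS" if violations == 0 else "FAIL"
        results.append(f"{status} {name} violations={violations}")
    return results


def demo_connection():
    """Build a tiny sample data set; customer 2 has two current rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE src_customer (customer_id INTEGER, name TEXT);
        CREATE TABLE dim_customer (customer_id INTEGER, name TEXT, is_current INTEGER);
        INSERT INTO src_customer VALUES (1, 'Ann'), (2, 'bob');
        INSERT INTO dim_customer VALUES
            (1, 'ANNE', 0), (1, 'ANN', 1),
            (2, 'BOB', 1), (2, 'BOB', 1);
    """)
    return conn


if __name__ == "__main__":
    for line in run_checks(demo_connection()):
        print(line)
```

    Because every check returns a violation count, the same script can be rerun after the initial load and after each incremental run.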

    A test plan is also helpful. I have one I could dig up, with some other ideas, if that interests you.

    k

  4. #4

    Default

    If by "test" you mean testing the data post-ETL: take a sample of reports from your DSS/OLAP environment and compare their results with the data your code processed.
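
    One possible way to do that comparison is to reduce both sides to totals per key and list the differences. The sketch below is only an illustration; the function name, the monthly keys, the figures and the tolerance are all made up.

```python
def compare_report(report_totals, warehouse_totals, tolerance=0.01):
    """Return (key, report_value, warehouse_value) for every mismatch.

    A key missing on one side is reported with None for that side.
    """
    mismatches = []
    for key in sorted(set(report_totals) | set(warehouse_totals)):
        reported = report_totals.get(key)
        loaded = warehouse_totals.get(key)
        if reported is None or loaded is None or abs(reported - loaded) > tolerance:
            mismatches.append((key, reported, loaded))
    return mismatches


if __name__ == "__main__":
    # Hypothetical monthly revenue: figures read off a report vs. totals
    # recomputed from the loaded data.
    report = {"2009-01": 1200.0, "2009-02": 980.5}
    warehouse = {"2009-01": 1200.0, "2009-02": 975.5, "2009-03": 10.0}
    for mismatch in compare_report(report, warehouse):
        print(mismatch)
```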

  5. #5
    Join Date
    May 2008
    Posts
    9

    Default

    Many thanks for your replies!!!

    I am trying to collect, develop and verify methods for checking and testing the results of ETL processes.

    This may include
    - Concepts of unit testing, black-/white-/grey-box testing and regression tests
    - Methods for getting access to the source data and to the transformed data after each transformation step, for comparison with the result data
    - Where and when it is possible or necessary to compare process result sets with the expected results (test entry points in pre-deployment and monitoring points in post-deployment)
    - How to support the testing team with (semi-)automatic test scripts, and how to identify what has to be tested manually
    - …
    So this project focuses on the practical application of the testing part of the DWH lifecycle.
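
    Comparing a process result set with the expected results could, for example, look like the sketch below. It treats both sides as multisets of rows, so duplicated rows are detected as well. The function name and the sample rows are assumptions for illustration only.

```python
from collections import Counter


def diff_result_sets(expected, actual):
    """Compare two result sets as multisets of row tuples.

    Returns (missing, unexpected): rows that were expected but not
    produced, and rows that were produced but not expected.
    """
    expected_counts = Counter(expected)
    actual_counts = Counter(actual)
    missing = sorted((expected_counts - actual_counts).elements())
    unexpected = sorted((actual_counts - expected_counts).elements())
    return missing, unexpected


if __name__ == "__main__":
    # Hypothetical expected rows (e.g. saved from a verified run) and the
    # rows produced by the current version of a transformation step.
    expected = [(1, "A"), (2, "B"), (2, "B")]
    actual = [(1, "A"), (2, "B"), (3, "C")]
    missing, unexpected = diff_result_sets(expected, actual)
    print("missing:", missing)
    print("unexpected:", unexpected)
```

    Run against a stored expected result set, this doubles as a simple regression test: both lists are empty as long as nothing has changed.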

    I found a couple of whitepapers with good ideas/tips and some information in common dwh-literature.

    If anyone has additional ideas for ETL testing techniques, experience of what can go wrong during these tests, or other practical tips, I would be grateful if you posted them here or sent me a message.
    @kvogelan:
    I am very interested in your test plan and your ideas on this topic.

    Greetings,
    tkkg
    Last edited by tkkg; 03-25-2009 at 04:13 AM. Reason: spelling

  6. #6
    Join Date
    Jul 2011
    Posts
    2

    Default ETL Testing

    Hello all,

    I would also like to learn more about testing BI/DWH, or should I say ETL. Does anyone have a template for writing test scripts?
    And what should a test plan include?

  7. #7
    Join Date
    Feb 2012
    Posts
    1

    Default

    Regarding the ETL testing template for test cases:
    It is much like a general test case template, except that
    - you add a separate column for the script that is used to verify the logic.

    - for data completeness test cases, you can add columns recording the row counts in source and target.
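
    As a rough sketch, one row of such a template could be represented like this. The field names, the table names (staging_orders, fact_orders) and the SQLite database are assumptions chosen so the example runs on its own.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class EtlTestCase:
    """One row of a test case template with the extra ETL columns."""
    case_id: str
    description: str
    source_count_sql: str               # script column: count in source
    target_count_sql: str               # script column: count in target
    source_rows: Optional[int] = None   # recorded when the case is run
    target_rows: Optional[int] = None   # recorded when the case is run

    def run(self, conn) -> bool:
        """Execute both scripts, record the counts, return pass/fail."""
        self.source_rows = conn.execute(self.source_count_sql).fetchone()[0]
        self.target_rows = conn.execute(self.target_count_sql).fetchone()[0]
        return self.source_rows == self.target_rows


def demo_connection():
    """Hypothetical sample data: one staged order never reached the target."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE staging_orders (order_id INTEGER);
        CREATE TABLE fact_orders (order_id INTEGER);
        INSERT INTO staging_orders VALUES (1), (2), (3);
        INSERT INTO fact_orders VALUES (1), (2);
    """)
    return conn


if __name__ == "__main__":
    case = EtlTestCase(
        case_id="TC-001",
        description="All staged orders are loaded into the fact table",
        source_count_sql="SELECT COUNT(*) FROM staging_orders",
        target_count_sql="SELECT COUNT(*) FROM fact_orders",
    )
    passed = case.run(demo_connection())
    print(case.case_id, "PASS" if passed else "FAIL",
          case.source_rows, case.target_rows)
```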

  8. #8
    Join Date
    Jul 2011
    Posts
    2

    Default Help with ETL Testing. Integration testing

    Hi all

    Our project is about software integration, and I have to test ETL processes. How should I go about it?

    How do we do batch loading?






Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.