Hitachi Vantara Pentaho Community Forums

Thread: Transformation step optimization

  1. #1

    Default Transformation step optimization

    Hi all,

    Can anyone help me out with this? My current transformation reads 400,000 rows from a Table Input step, performs database lookups against several different tables, and finally loads the normalized data into a destination table.

    My concern is that the lookups and the final load are taking a long time.

    I'd like to know how to optimize the transformation and make it a little faster.
    Last edited by mageshkumarm; 12-17-2013 at 03:14 AM.
    Thanks,
    Magesh

    pdi-ce-4.4.0-stable
    Java 1.7.0_25 (OpenJDK)

  2. #2
    Join Date: Nov 1999 | Posts: 9,729

    Default

    Instead of doing, for example, 4M lookups against a 1M-row table, sort both data sets in the database and use a "Merge Join" step. Even better, do the join(s) in the database itself.
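To make the idea concrete, here is a minimal Python sketch of a streaming merge join over two inputs that are already sorted on the join key. This is an illustration of the technique only; the function and field names are invented and are not PDI APIs.

```python
def merge_join(stream, lookup, key):
    """Inner-join two iterables of dicts, both sorted ascending on `key`.

    Assumes keys in `lookup` are unique (typical for a lookup table).
    Each input is scanned exactly once, with no per-row database
    round trip -- the property a sorted merge join exploits.
    """
    it = iter(lookup)
    cur = next(it, None)
    for row in stream:
        # Advance the lookup side until its key catches up to the stream's.
        while cur is not None and cur[key] < row[key]:
            cur = next(it, None)
        if cur is not None and cur[key] == row[key]:
            yield {**row, **cur}

# Example: join orders to customers, both pre-sorted by "cust".
orders = [{"cust": 1, "amt": 5}, {"cust": 1, "amt": 7}, {"cust": 3, "amt": 2}]
customers = [{"cust": 1, "name": "ann"}, {"cust": 2, "name": "bob"},
             {"cust": 3, "name": "cy"}]
joined = list(merge_join(orders, customers, "cust"))
# Each order row now carries the matching customer's fields.
```

Because both sides arrive sorted, the join is a single linear pass; this is why the "Merge Join" step requires sorted inputs.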
    If you're lazy and the data set is small, load the lookup data into memory instead, either by enabling the cache in the "Database Lookup" step or by using a "Stream Lookup" step.
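The in-memory alternative can be sketched the same way, assuming the lookup table fits in RAM. Again, the names below are invented for illustration and are not PDI APIs; unmatched-row handling is a simplification of what the real steps let you configure.

```python
def cached_lookup(rows, lookup_rows, key):
    """Replace one database query per row with one O(1) dict probe.

    The whole lookup table is read once into a dict keyed on `key`.
    Rows with no match pass through unchanged here; the real PDI
    steps let you choose fail/skip/default behaviour instead.
    """
    cache = {r[key]: r for r in lookup_rows}
    for row in rows:
        match = cache.get(row[key])
        yield {**row, **match} if match else dict(row)

rows = [{"cust": 1, "amt": 5}, {"cust": 9, "amt": 4}]
customers = [{"cust": 1, "name": "ann"}]
out = list(cached_lookup(rows, customers, "cust"))
# out[0] gains the customer's fields; out[1] has no match and passes through.
```

The trade-off is memory for speed: the cache costs one full read of the lookup table up front, then every probe is constant time instead of a database round trip.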

    HTH,
    Matt
