Hitachi Vantara Pentaho Community Forums

Thread: Merge join / Database join slowness

  1. #1

    Merge join / Database join slowness

    Hello all,
    In my project I have to pass incremental data from source table A, selected by a date criterion, together with data from its related tables, into a destination table.
    Suppose I fetch 7000 rows from Table A and then fetch the related rows (almost 7000) from Table B (total 9 lac records) using a Database Join step;
    the fetch takes 8 minutes.
    I tried a Merge Join step instead of the Database Join, but Merge Join takes even longer, even when passing only 100 records.

    I cannot use a SQL join in the input step here.

    Please advise on any other options.

    Thanks.

  2. #2
    Join Date
    Nov 2008
    Posts
    777

    Most likely, the problem is in your database so PDI probably won't be able to fix it all by itself. Fetching 7000 records usually goes very fast. If you provide more information perhaps someone can help.
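
    To put the numbers from the first post in perspective (7000 rows, 8 minutes), here is a quick back-of-the-envelope calculation. It assumes the time is spread evenly over the rows and that the Database Join step issues one query per incoming row; the variable names are only illustrative.

```python
# Figures reported in the first post.
rows = 7000
elapsed_seconds = 8 * 60

# Average cost of one lookup, assuming the time is spread evenly.
per_row_ms = elapsed_seconds / rows * 1000
print(f"{per_row_ms:.1f} ms per row")
```

    An average in the tens of milliseconds per single-row query is more consistent with each query scanning a large table, or with network latency, than with an indexed lookup on a local server.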

    What type of database are you using? Is the database server on the same computer or somewhere out on a network? What is the datatype of the columns you are trying to join on? Are the join columns also the primary key columns? Are there any indexes defined? How wide is each table? How many columns are you trying to fetch?

    Also, what do you mean by "total 9 lac records"?
    pdi-ce-4.4.0-stable
    Java 1.7 (64 bit)
    MySQL 5.6 (64 bit)
    Windows 7 (64 bit)

  3. #3
    Join Date
    Jun 2012
    Posts
    5,534

    10 lac = 10 lakh = 1 million

    As I understand it, Merge Join has to read the whole table (a full table scan).
    I would try a Database Lookup step instead.
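
    For context, a sort-merge join walks two inputs that are already sorted on the join key, so the large side has to be read in full even when the small side has only a handful of rows. A minimal Python sketch of the technique follows; the row data and the `merge_join` helper are hypothetical, and this is not PDI's actual implementation.

```python
def merge_join(left, right, key):
    """Inner sort-merge join of two lists of dicts, both sorted by `key`."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1          # left key has no partner yet, advance left
        elif lk > rk:
            j += 1          # right key has no partner, advance right
        else:
            # Find the run of right rows sharing this key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and left[i][key] == lk:
                for r in right[j:j_end]:
                    out.append({**left[i], **r})
                i += 1
            j = j_end
    return out


# Hypothetical sample rows, both lists sorted by "id".
table_a = [{"id": 2, "a": "x"}, {"id": 5, "a": "y"}]
table_b = [
    {"id": 1, "b": "p"},
    {"id": 2, "b": "q"},
    {"id": 2, "b": "r"},
    {"id": 7, "b": "s"},
]
joined = merge_join(table_a, table_b, "id")
print(joined)
```

    Note that the loop touches the rows of `table_b` one after another; with 9 lac rows on that side, reading and sorting them dominates the cost regardless of how few rows come from Table A.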
    So long, and thanks for all the fish.

  4. #4
    Join Date
    Nov 2008
    Posts
    777

    So a lac is 100,000? I learn something new every day...

    Obviously, indexing is a fairly important design consideration for a table with 9 lac rows.
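
    One way to check whether the join column is indexed is to look at the query plan before and after creating an index. The sketch below uses Python's built-in `sqlite3` purely as a stand-in so it runs anywhere; the table name `table_b`, the column `a_id`, and the index name are assumptions, not the original poster's schema. On MySQL, `EXPLAIN SELECT ...` gives comparable information.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_b (id INTEGER PRIMARY KEY, a_id INTEGER, payload TEXT)"
)
conn.executemany(
    "INSERT INTO table_b (a_id, payload) VALUES (?, ?)",
    ((i % 1000, "x") for i in range(10000)),
)


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


query = "SELECT payload FROM table_b WHERE a_id = 42"

plan_before = plan(query)  # no index on a_id yet: full table scan
conn.execute("CREATE INDEX idx_table_b_a_id ON table_b (a_id)")
plan_after = plan(query)   # the planner can now use the index

print(plan_before)
print(plan_after)
```

    If the plan for the per-row query still shows a scan of the whole table, every one of the 7000 lookups pays for reading all of Table B.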

    Moreover, I agree that a Merge Join step is not really a good option for this application. Either do the join on the database server or, as marabu suggested, use a Database Lookup step (with appropriate caching).
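
    To illustrate what caching buys in a lookup, here is a rough Python sketch, again using `sqlite3` as a stand-in database. The `lookup` function, the table layout, and the key distribution are all hypothetical; it only shows the idea of remembering results per key, not how the PDI step is implemented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_b (a_id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO table_b VALUES (?, ?)",
    [(i, f"row-{i}") for i in range(1000)],
)

queries_sent = 0  # counts round trips to the database
cache = {}        # key -> looked-up value


def lookup(a_id):
    """Return the payload for a_id, querying the database only on a cache miss."""
    global queries_sent
    if a_id not in cache:
        queries_sent += 1
        row = conn.execute(
            "SELECT payload FROM table_b WHERE a_id = ?", (a_id,)
        ).fetchone()
        cache[a_id] = row[0] if row else None
    return cache[a_id]


# 7000 incoming rows that share only 50 distinct keys (an assumed distribution).
incoming = [i % 50 for i in range(7000)]
results = [lookup(k) for k in incoming]

print(len(results), "rows looked up with", queries_sent, "queries")
```

    The cache only reduces database round trips when keys repeat. If all 7000 keys are distinct, each one still costs a query, so the index on the lookup column matters more than the cache.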
    Last edited by darrell.nelson; 09-19-2012 at 10:47 AM.
