Hitachi Vantara Pentaho Community Forums

Thread: Parallel execution of queries against same database

  1. #1
    Join Date
    Dec 2010
    Posts
    193

    Parallel execution of queries against same database

    Hi,

    I have several Table Input steps running in parallel. They all connect to the same SQL Server database, and each writes to a separate flat file in the same folder. The configuration details for the database and the text files come from variables. My problem is this: if two flows (Table Input ---> Text File Output) start at the same time, two queries are fired against two different tables of, say, 1000k records each, but the time spent reading from the database and the buffer memory used to write the flat file differ between the flows. In short, both processes start at the same time, but Table Input 1 reading from DB table1 and Table Input 2 reading from DB table2 differ by approximately 2-5 minutes, and that slows down writing to the flat files.

    How can I make them run in parallel and fast?
    Sathish
    Back to Pentaho


    'Be the best Pearl in the ocean of wisdom'

  2. #2
    Join Date
    Jul 2009
    Posts
    476

    It sounds like you're trying to run multiple Table Input steps in the same transformation. If so, try putting them into separate transformations, and call the transformations in parallel from a single job.
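
    In PDI itself this is set up in the job designer rather than in code, but the idea of "one independent flow per table, all launched together" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SQLite stands in for SQL Server so the example is self-contained, and the table names, column names, and function names are all made up for the demo.

```python
import csv
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def make_demo_db(db_path, tables, rows_per_table):
    """Create a small stand-in database (hypothetical tables) for the demo."""
    conn = sqlite3.connect(db_path)
    try:
        for table in tables:
            conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, val TEXT)")
            conn.executemany(
                f"INSERT INTO {table} (id, val) VALUES (?, ?)",
                ((i, f"{table}-{i}") for i in range(rows_per_table)),
            )
        conn.commit()
    finally:
        conn.close()


def extract_table(db_path, table, out_dir):
    """One independent flow: own connection, own query, own output file.

    The table name is interpolated directly, so it must come from a trusted list.
    """
    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.execute(f"SELECT id, val FROM {table} ORDER BY id")
        out_file = Path(out_dir) / f"{table}.csv"
        rows = 0
        with open(out_file, "w", newline="") as fh:
            writer = csv.writer(fh)
            writer.writerow(["id", "val"])
            for row in cursor:  # stream rows instead of loading them all at once
                writer.writerow(row)
                rows += 1
        return table, rows
    finally:
        conn.close()


def extract_all(db_path, tables, out_dir):
    """Run one extract per table concurrently and return {table: row_count}."""
    with ThreadPoolExecutor(max_workers=len(tables)) as pool:
        results = pool.map(lambda t: extract_table(db_path, t, out_dir), tables)
        return dict(results)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        db = str(Path(tmp) / "demo.db")
        make_demo_db(db, ["table1", "table2"], 1000)
        print(extract_all(db, ["table1", "table2"], tmp))
```

    Each flow opens its own connection and writes its own file, so nothing is shared between them except the database server and the disk. If the flows still finish minutes apart, the difference is coming from one of those shared resources or from the queries themselves.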

    When you want everything to be parallel and fast, that includes your SQL Server, the PDI transformations, your file system, and potentially network connections between them, so if it's still slower than you want, then any one of them could be the reason.
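
    One way to narrow down which piece is slow is to time the read and the write separately. The following is a rough sketch, not anything PDI does internally: it uses Python with an in-memory SQLite database as a stand-in for SQL Server, and `timed_extract` and the table `t` are hypothetical names. It also fetches the whole result set into memory before writing, which is fine for a diagnostic run but not how you would move very large tables.

```python
import csv
import os
import sqlite3
import tempfile
import time


def timed_extract(conn, sql, out_path):
    """Fetch all rows, then write them, timing each stage separately."""
    t0 = time.perf_counter()
    rows = conn.execute(sql).fetchall()  # database + network time ends here
    t1 = time.perf_counter()
    with open(out_path, "w", newline="") as fh:
        csv.writer(fh).writerows(rows)   # file system time
    t2 = time.perf_counter()
    return {
        "rows": len(rows),
        "read_seconds": t1 - t0,
        "write_seconds": t2 - t1,
    }


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, val TEXT)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?)", ((i, str(i)) for i in range(100_000))
    )
    with tempfile.TemporaryDirectory() as tmp:
        print(timed_extract(conn, "SELECT id, val FROM t", os.path.join(tmp, "t.csv")))
    conn.close()
```

    If `read_seconds` dominates, look at the query, the database server, and the network. If `write_seconds` dominates, look at the disk the flat files are going to.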

  3. #3
    Join Date
    Dec 2010
    Posts
    193

    Rob,

    I have found that it is the queries sent to MS SQL Server, and the fetching in the Table Input step, that take the extra time. Is there any way to optimize that?
    Sathish

  4. #4
    Join Date
    Jul 2009
    Posts
    476

    It could be lots of different things. The SQL might be written sub-optimally, you might need to add other indexes on the table(s), the database server itself could be a slow piece of hardware, the network between you and the database server might be slow, etc.
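
    To illustrate the index point, here is a small sketch showing how a filter query changes from a full table scan to an index lookup once a suitable index exists. It assumes SQLite (via Python's built-in `sqlite3`) purely so it can run anywhere; SQL Server has its own execution plan tooling and the output will look different there. The `orders` table, its columns, and the index name are hypothetical.

```python
import sqlite3

# Hypothetical query and index used only for this demo.
DEMO_SQL = "SELECT * FROM orders WHERE customer_id = ?"
DEMO_INDEX = "idx_orders_customer"


def query_plan(conn, sql, params=()):
    """Return the planner's description lines for a query (SQLite-specific)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # last column holds the readable detail


def build_demo(conn):
    """Create and fill a small table with no index on customer_id."""
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
        ((i % 100, float(i)) for i in range(10_000)),
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_demo(conn)
    print("before:", query_plan(conn, DEMO_SQL, (42,)))
    conn.execute(f"CREATE INDEX {DEMO_INDEX} ON orders (customer_id)")
    print("after: ", query_plan(conn, DEMO_SQL, (42,)))
    conn.close()
```

    The same check is worth doing on the real database: look at the plan for the exact SQL in each Table Input step and see whether the columns in the WHERE and JOIN clauses are covered by indexes.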

    The Table Input step just takes your SQL and runs it on the database server, and the database server returns the results to your transformation. So the problem is likely to be somewhere outside of PDI.

  5. #5
    Join Date
    Dec 2010
    Posts
    193

    Agreed.
    Sathish


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.