Hitachi Vantara Pentaho Community Forums
Results 1 to 2 of 2

Thread: Improve table input step performance on Oracle

  1. #1
    Join Date
    Oct 2014
    Posts
    26

    Question Improve table input step performance on Oracle

    I have some ETLs and identified the Table input is the bottleneck causing degrading the overall performance.
    Table input step speed is 60-80 rows/second on a row with just 4 number and 1 date (I suppose no more than 40 bytes/row).

    The very low performance is probably caused by the high latency on this network. The table input is used to extract information from a remote database (other institution).

    Executing the same query using another software (Toad) on the same client machine could read 500 rows/second.
    I was figuring out how to configure the Oracle connections better on PDI to improve the table input step performance.
    For example, using JDBC-OCI instead of pure JDBC and/or specifying greater fetch size to reduce the round-trips.

    As I read on Oracle documentation, the default row-prefetch is 10:
    * https://docs.oracle.com/cd/A97335_02...4/oraperf2.htm
    "If you do not set the default row-prefetch value for a connection, 10 is the default."

    As suggested on PDI Wiki, we´ve configured the defaultRowPrefetch in Options tab:
    * https://wiki.pentaho.com/display/EAI/Oracle
    "Performance Considerations: Standard Fetch Size and Row Prefetching
    In case you have a slow connection and/or your latency is high (e.g. via a VPN), prefetching data into the client reduces the number of round trips to the server and you can specify the number of rows to fetch with each trip.
    Set the defaultRowPrefetch and fetchSize properties in the database connection dialog."

    But this configuration didn´t change the performance.
    I supposse the configuration was ignored because PDI has a Fixed Database fetchsize and oracle documentation inform to not mix fetch size and oracle prefetching API. If PDI has a 5000 fixed database fetchsize, the row-prefetch value was not used as the fetch size.

    PDI Fixed Database fetchsize:
    * http://jira.pentaho.com/browse/PDI-2458
    * http://jira.pentaho.com/browse/PDI-5491

    Oracle documentation informing to not mix fetch size and defaultRowPrefetch:
    * https://docs.oracle.com/cd/A97335_02...4/oraperf2.htm
    "Do not mix the JDBC 2.0 fetch size API and the Oracle row-prefetching API in your application."

    Row-pretech value is used as the default fetch size:
    * https://docs.oracle.com/cd/E11882_01...tm#JJDBC28621:
    "Standard JDBC also enables you to specify the number of rows fetched with each database round-trip for a query, and this number is referred to as the fetch size. In Oracle JDBC, the row-prefetch value is used as the default fetch size in a statement object. Setting the fetch size overrides the row-prefetch setting and affects subsequent queries run through that statement object."

    We´ve also changed from JDBC to JDBC-OCI but the table input step performance didn´t changed.


    Does anyone already configured the defaultRowPrefetch (EXTRA_OPTION_ORACLE.defaultRowPrefetch) succesfully?

    Do you have any suggestion on this case?

  2. #2
    Join Date
    May 2014
    Posts
    358

    Default

    Yes, we have used the defaultRowPrefetch and performance was in the 10 000s of rows/s. In Spoon, where you configure the connection properties, on the Parameters tab put oracle.jdbc.defaultRowPrefetch as the parameter name and the number of rows to prefetch as the value (we used 5000).
    But if you only get ~100 r/s, the row prefetch may not be the bottleneck.

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.