Hitachi Vantara Pentaho Community Forums

Thread: PDI is Very Slow when trying to access the repository

  1. #1

    PDI is Very Slow when trying to access the repository

    PDI is very slow when trying to access a repository located in the cloud. The engine database is Oracle 12c. The Spoon version is:
    Kettle - Spoon General Availability Release 5.2.0.0

  2. #2
    Join Date
    Sep 2013
    Posts
    235

    Yes, it is slow, especially with the JCR repository implementation (the EE repository). It is not necessarily a network issue, or even a problem with JCR itself. Keep in mind that the JCR implementation (Apache Jackrabbit) focuses on full JCR support, not on performance. It may be fixed some day, but not in 5.2.0.0. Just my personal opinion.

  3. #3

    I am using the Community Edition of Spoon, and the connection is to a cloud database.
    If the problem is as you say, do I have to live with it, or is there an alternative?

  5. #5
    Join Date
    Apr 2008
    Posts
    146

    The JCR does have issues and will get slower over time. There is a garbage collection issue too: the more you work with it, the more fragments it leaves in your cloud database.
    I would investigate using VFS and file-based repositories for ETL, if possible. There is no real gain in using Jackrabbit just to store ETL.

    If you want to vote for fixing or improving the JCR, here is a link:
    http://jira.pentaho.com/browse/BISER...0collection%22

  6. #6
    Join Date
    Sep 2013
    Posts
    235

    The reasons for using JCR were to support versioning, relations between entities in any direction, and access control. I imagine the idea was that, since the BI server already uses JCR, the DI server (the repository for the Spoon ETL engine) should use JCR too. That would be very useful: to display a report (held in JCR), we have to run ETL loaded from the same JCR (and, in the ideal case, all data integration code is deployed in the Tomcat container instead of the separate Jetty server known as Carte). Reports and ETL jobs would then be held in one place, sharing the same access rules and other settings.

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.