Hitachi Vantara Pentaho Community Forums

Thread: Caching a query?

  1. #1

Caching a query?


    I'm new to ETL and Pentaho and currently looking at using Pentaho to load data into an Oracle database.

    The data I load comes as a CSV file, and for each row I look up whether the database already stores that data. If it does, I return the existing id; otherwise I use a stored procedure to return a new id.

    I would like to be able to run a stored procedure/execute a query to retrieve a set of data which I can then reference as I transform each row of the CSV file.

    I've tried using a Table Input / Database Lookup in the transformation, but this seems to run a query each time a row is processed. I'd like to run the query once, store the dataset, and then query that stored dataset for each row of the CSV file.

    Is this possible, or am I barking up the wrong tree here? Any other suggestions to avoid a transformation that's massively chatty to the database?

    Many thanks for taking time to read/reply,

  2. #2
    Join Date
    May 2006


    Look at the lookup steps: in your case it's either Database Lookup or Stream Lookup (for small reference sets) that saves you from launching a query per row. For executing a stored procedure you can use the Call DB Procedure step.
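
    To illustrate the logic behind the Stream Lookup step mentioned above: it reads the reference data once into an in-memory map, then resolves each incoming CSV row against that map instead of hitting the database per row. A minimal Python sketch of that pattern (field names and the id-minting callback are invented for illustration):

    ```python
    # Sketch of the stream-lookup pattern: load the reference rows once,
    # then resolve each incoming CSV row against the in-memory map.

    def build_lookup(reference_rows):
        """Cache (name -> id) pairs from a one-time reference query."""
        return {row["name"]: row["id"] for row in reference_rows}

    def resolve_id(lookup, name, next_id):
        """Return the cached id, or allocate a new one (a stand-in for
        the stored procedure that mints ids for unseen values)."""
        if name not in lookup:
            lookup[name] = next_id()
        return lookup[name]

    # Usage with fabricated data: one query's worth of reference rows.
    reference = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]
    lookup = build_lookup(reference)

    counter = iter(range(3, 100))
    ids = [resolve_id(lookup, n, lambda: next(counter))
           for n in ["alice", "carol", "bob", "carol"]]
    print(ids)  # [1, 3, 2, 3] -- "carol" is minted once, then cached
    ```

    The database is touched once for the reference query (and once per genuinely new value), rather than once per CSV row.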


  3. #3


    Thank you for your reply it was most helpful.

    I've been playing around with the Database lookup but have two issues with it.

    1) I need to execute a query, not compare values against another table. I'd like to select rows from one table with a WHERE clause on two values; no other table is involved.
    2) Database Lookup requires input from a previous step, so I've tried using a Generate Rows step as the input. This seems limited, since I'd need to generate enough rows to cover the query result, and I don't know that count at this first step.

    I know I'm being really stupid here. Any pointers on how I can run a query once, cache the result, and then query that result in the following steps?

    Many thanks for your time and patience

  4. #4

    For completeness.

    I've worked out what I was doing wrong: I approached the problem by attempting a query and then referencing it as a disconnected dataset.

    By using the merge steps I've been able to build up a mapping table that works well and isn't chatty to the database.
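
    For anyone following the same path: merging in this style joins the CSV stream against the reference stream in a single pass, rather than querying per row. A rough sketch of a sorted merge (left) join, with invented field names, assuming both streams are sorted on the join key:

    ```python
    # Sketch of a sorted merge join: both row streams are sorted on the
    # join key, so one forward pass pairs CSV rows with reference ids.

    def merge_join(csv_rows, ref_rows, key="name"):
        """Left-join two key-sorted row streams; unmatched rows get id=None."""
        out, i = [], 0
        for row in csv_rows:
            # Advance the reference cursor past keys smaller than this row's.
            while i < len(ref_rows) and ref_rows[i][key] < row[key]:
                i += 1
            matched = i < len(ref_rows) and ref_rows[i][key] == row[key]
            out.append({**row, "id": ref_rows[i]["id"] if matched else None})
        return out

    csv_rows = [{"name": "alice"}, {"name": "bob"}, {"name": "dave"}]
    ref_rows = [{"name": "alice", "id": 1},
                {"name": "bob", "id": 2},
                {"name": "carol", "id": 3}]
    print(merge_join(csv_rows, ref_rows))
    # [{'name': 'alice', 'id': 1}, {'name': 'bob', 'id': 2},
    #  {'name': 'dave', 'id': None}]
    ```

    Rows with no match (here "dave") come through with a null id, which is where a downstream step could call the stored procedure to mint a new one.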

    It's a change of mindset from coding to transformations.



Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.