Hitachi Vantara Pentaho Community Forums

Thread: best way to cache data to use among different transformations

  1. #1
    Join Date
    Feb 2017
    Posts
    13

    I have a transformation that reads millions of user ids from a mega data store.

    I would like to store those ids in a list or hashmap.

    I also have about a dozen other transformations.
    Each of those transformations gets input data (user ids) from other distinct child data stores.

    In a UDJC (User Defined Java Class) step, as each user id arrives from a child data store, I would like to check whether it is already in the mega user id list.
    How can I create a list of the mega user ids and make it available to my UDJCs?

    Thanks.

  2. #2
    Join Date
    Feb 2017
    Posts
    13

    Judging by the lack of responses, I'm guessing there is no way to cache the data...

    My solution has been to serialize the mega user ids to a file in one transformation and de-serialize the file in subsequent transformations.
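
For anyone landing here later, the serialize/de-serialize approach described above could look roughly like the sketch below. It is only an illustration in plain Java, not code from the poster's actual transformations: the class name `UserIdCache`, the `save`/`load` helpers, and the choice of `Long` for the user id type are all assumptions. It also leaves out the UDJC step wiring; the idea would be to load the set once when the step initialises and then call `contains` for each incoming row.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: writes a set of user ids to a file in one
// transformation and reads it back in another.
public class UserIdCache {

    // Serialize the ids with standard Java serialization.
    public static void save(Set<Long> ids, Path file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            // Copy into a HashSet so the concrete type on disk is known.
            out.writeObject(new HashSet<>(ids));
        }
    }

    // De-serialize the ids; the file must have been written by save().
    @SuppressWarnings("unchecked")
    public static Set<Long> load(Path file)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            return (HashSet<Long>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample data standing in for the "mega" user ids.
        Set<Long> megaIds = new HashSet<>(Arrays.asList(101L, 102L, 103L));
        Path file = Files.createTempFile("mega-user-ids", ".ser");

        save(megaIds, file);               // first transformation
        Set<Long> cached = load(file);     // later transformation

        // Per-row membership check, as a UDJC would do for each child id.
        System.out.println(cached.contains(102L));
        System.out.println(cached.contains(999L));

        Files.deleteIfExists(file);
    }
}
```

A `HashSet` gives constant-time lookups on average, which matters with millions of ids, but the whole set has to fit in the memory of each transformation that loads it.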

  3. #3
    Join Date
    May 2014
    Posts
    341

    Ah, I saw your post a few days ago and wanted to reply, but for some reason my account was blocked at the time. I wanted to suggest using the built-in HSQL database as an in-memory store: http://www.nicholasgoodman.com/bt/bl...mory-database/
    Apart from that, saving the data to a DB or a file works too.

Copyright © 2005 - 2017 Pentaho Corporation. All Rights Reserved.