Hitachi Vantara Pentaho Community Forums

Thread: Parallel Process Streams

  1. #1
    Join Date
    May 2007
    Posts
    22

    Parallel Process Streams

    Hi,
    I am reading data from my staging file. The file contains data for a fact. I am looking up dimensions to get the technical keys, and I have to look up about 10 dimensions. Is there a difference in performance between doing the lookups one after another and doing them in parallel (by copying the same row to different lines)? Are the different lines multi-threaded?

    Thanks,
    Gavin

  2. #2
    Join Date
    May 2006
    Posts
    4,882

    If you only do lookups, it should work in parallel (pick the right copy/distribute mode for the hops), but keep in mind that the order of the rows afterwards is undetermined.
    In a transformation, every step runs concurrently in its own thread.
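
To make the copy/distribute choice concrete, here is a rough sketch in Python (not Kettle code; the function names `distribute_rows` and `copy_rows` are made up for illustration). It assumes the usual behaviour of the two hop modes: distribute hands each row to exactly one target in round-robin fashion, while copy sends every row to every target.

```python
from itertools import cycle


def distribute_rows(rows, n_targets):
    """Round-robin: each row goes to exactly one of the target streams."""
    targets = [[] for _ in range(n_targets)]
    for row, target in zip(rows, cycle(targets)):
        target.append(row)
    return targets


def copy_rows(rows, n_targets):
    """Copy: every target stream receives every row."""
    return [list(rows) for _ in range(n_targets)]


if __name__ == "__main__":
    rows = [1, 2, 3, 4]
    print(distribute_rows(rows, 2))  # [[1, 3], [2, 4]]
    print(copy_rows(rows, 2))        # [[1, 2, 3, 4], [1, 2, 3, 4]]
```

If each parallel branch looks up a different dimension for the same row, every branch needs to see every row, which is what copy does in this model; distribute splits the rows between the branches instead.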

    My personal experience is that it's better to pipeline them, i.e. use them in sequence. I only start parallelizing once I expect more than 400,000 to 500,000 rows.
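
As a rough illustration of why lookups in sequence still run concurrently, here is a minimal Python sketch of a pipeline where each lookup step has its own thread and passes rows on through a queue. This is only a toy model, not Kettle's implementation; `lookup_step`, `run_pipeline`, the `_tk` suffix and the sample dimensions are all invented for the example.

```python
import queue
import threading

DONE = object()  # sentinel that marks the end of the row stream


def lookup_step(key_field, dimension, inbox, outbox):
    """One step: add the technical key for one dimension, then pass the row on."""
    while True:
        row = inbox.get()
        if row is DONE:
            outbox.put(DONE)  # tell the next step that the stream has ended
            return
        out = dict(row)
        out[key_field + "_tk"] = dimension.get(row[key_field])
        outbox.put(out)


def run_pipeline(rows, lookups):
    """Chain the lookups in sequence; each one runs in its own thread."""
    queues = [queue.Queue(maxsize=100) for _ in range(len(lookups) + 1)]
    steps = [
        threading.Thread(target=lookup_step,
                         args=(field, dim, queues[i], queues[i + 1]))
        for i, (field, dim) in enumerate(lookups)
    ]

    def feed():
        # Plays the role of the input step reading the staging file.
        for row in rows:
            queues[0].put(row)
        queues[0].put(DONE)

    feeder = threading.Thread(target=feed)
    for t in steps + [feeder]:
        t.start()

    result = []
    while True:
        row = queues[-1].get()
        if row is DONE:
            break
        result.append(row)

    for t in steps + [feeder]:
        t.join()
    return result


if __name__ == "__main__":
    staging = [{"customer": "acme", "product": "bolt"},
               {"customer": "zeta", "product": "nut"}]
    lookups = [("customer", {"acme": 1, "zeta": 2}),
               ("product", {"bolt": 10, "nut": 20})]
    for row in run_pipeline(staging, lookups):
        print(row)
```

While the second step is working on one row, the first step is already handling the next one, so the lookups overlap even though they sit one after another. With a single copy of each step and first-in-first-out queues, the row order is also kept in this sketch.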

    Regards,
    Sven

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.