Hitachi Vantara Pentaho Community Forums

Thread: Any way to speed up Serialize to file?

  1. #1

    Any way to speed up Serialize to file?

    Are there any known ways to speed up the Serialize to file step? Or is there a good way to write all of your rows to a temp file so that another step can read them back in? I am writing about 1 million rows to a file (about 10 MB total) and it's taking around 10 minutes. I noticed that the Serialize to file step only has a queue of 200 rows, which may be causing most of the slowdown. Is there a way to set a step's queue size to speed it up?

  2. #2

    Serialize to file is actually the fastest way to serialize data. Perhaps there is another reason why it's slow.
    Perhaps the target disk is on a slow network drive or perhaps another step is the slowest.

  3. #3

    You were right, Matt. I had two Database lookup steps upstream that were dropping the queue size from 10k to 200. I replaced them with Table input steps feeding a Stream lookup, which takes the DB traffic down to one hit and keeps the queue size at 10k. This cut the run time almost in half (515 seconds before, 290 now).

    I'm guessing that the Serialize to file step was showing 200 rows waiting, but it was actually being held up because the rows were coming in slowly.
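
For anyone wondering why swapping the lookups helped, here is a rough Python sketch of the two patterns. It is not Pentaho code: the `customer` table, the `customer_id` field, and the function names are made up for illustration, and an in-memory SQLite database stands in for the real one. The first function queries the database once per row; the second reads the reference table once and joins in memory, which is roughly the idea behind a Table input feeding a Stream lookup.

```python
import sqlite3


def build_db():
    """Create a throwaway in-memory database with a hypothetical lookup table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, region TEXT)")
    conn.executemany(
        "INSERT INTO customer VALUES (?, ?)",
        [(i, f"region-{i % 5}") for i in range(100)],
    )
    return conn


def lookup_per_row(conn, rows):
    """One query per incoming row: every row waits on a database round trip."""
    out = []
    for row in rows:
        hit = conn.execute(
            "SELECT region FROM customer WHERE id = ?", (row["customer_id"],)
        ).fetchone()
        out.append({**row, "region": hit[0] if hit else None})
    return out


def lookup_preloaded(conn, rows):
    """Read the lookup table once, then resolve every row from a dict."""
    regions = dict(conn.execute("SELECT id, region FROM customer"))
    return [{**row, "region": regions.get(row["customer_id"])} for row in rows]
```

Both functions return the same rows; the difference is only in how many times the database is hit. The preloaded version assumes the lookup table fits comfortably in memory.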

  4. #4

    It writes out a gzipped file, which in itself buffers a lot.
    For fast disks it probably would be a good idea to support non-compressed binary output as well.
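
To make the compressed versus uncompressed trade-off concrete, here is a small Python sketch that writes the same rows both ways and compares file sizes. It is only an illustration: it uses `pickle` as a stand-in row format, not the actual on-disk format of the Serialize to file step, and the function names are hypothetical.

```python
import gzip
import os
import pickle
import tempfile


def write_rows(path, rows, compress=True):
    """Serialize rows one by one, through gzip or straight to a binary file."""
    opener = gzip.open if compress else open
    with opener(path, "wb") as f:
        for row in rows:
            pickle.dump(row, f)


def read_rows(path, compress=True):
    """Read rows back until the end of the file is reached."""
    opener = gzip.open if compress else open
    rows = []
    with opener(path, "rb") as f:
        while True:
            try:
                rows.append(pickle.load(f))
            except EOFError:
                break
    return rows


def roundtrip_sizes(rows):
    """Write rows both ways, check they read back intact, return file sizes."""
    sizes = {}
    with tempfile.TemporaryDirectory() as tmp:
        for compress in (True, False):
            path = os.path.join(tmp, "rows.gz" if compress else "rows.bin")
            write_rows(path, rows, compress)
            assert read_rows(path, compress) == rows
            sizes["gzip" if compress else "raw"] = os.path.getsize(path)
    return sizes
```

Calling `roundtrip_sizes` on repetitive data should show a much smaller gzip file. Whether the CPU spent compressing is worth it depends on how fast the target disk is, which is the point made above.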


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.