Hitachi Vantara Pentaho Community Forums

Poll results: Have you successfully used the Hadoop Input/Output steps and Copy Files Job Entry?

Voters: 4
  • Yes, I have successfully used the steps: 2 (50.00%)
  • No, I have tried to use the steps but I'm running into issues: 1 (25.00%)
  • No, I have not had a chance to test the new steps yet: 1 (25.00%)
Thread: Have you successfully tried the new Input/Output steps and Copy Files job entry?

  1. #1

    Quick poll to see how everyone's testing is going...

  2. #2
    DEinspanjer Guest


    I was able to do some streaming reads of files stored in HDFS.

    I had issues with "Get Fields": it was not able to display the format of the fields.

    I got an error when trying to browse HDFS to find the part-r-###### files. It still worked, though.

    <strike>Alright performance. About 26k RPS (records per second) reading and writing through a single copy.</strike>

    Actually, my first run processed data using only one HDFS reader and one Vertica Table Output step, and the output step was the bottleneck there.
    On my dual quad-core Xeon server, I was able to bump the HDFS readers up to 3 and use six Table Output steps before saturating I/O on the server.
    The total throughput at that point was 78k RPS.
    I processed 30M records (221 HDFS files) in 6 minutes.
    Last edited by DEinspanjer; 07-20-2010 at 06:19 PM.
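
For anyone comparing their own numbers, the throughput figures in the post above can be cross-checked with a few lines of arithmetic. This is only a rough sketch: the variable names are made up for illustration, and "6 minutes" is assumed to mean exactly 360 seconds, although the post gives a rounded duration.

```python
# Figures quoted in the post above.
records = 30_000_000      # "30M records"
files = 221               # "221 HDFS files"
elapsed_s = 6 * 60        # assumption: "6 minutes" taken as exactly 360 s
reported_rps = 78_000     # "78k RPS"

# Average rate implied by the totals, and the run time implied by the
# reported rate, to see whether the two statements roughly agree.
avg_rps = records / elapsed_s
minutes_at_reported = records / reported_rps / 60
records_per_file = records / files

print(f"implied average rate: {avg_rps:,.0f} records/s")
print(f"run time at 78k RPS:  {minutes_at_reported:.1f} min")
print(f"average file size:    {records_per_file:,.0f} records")
```

The implied average (about 83k records/s) and the reported 78k RPS differ by a few percent, which fits a run time that was rounded down to 6 minutes.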

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.