Hitachi Vantara Pentaho Community Forums

Thread: Question regarding test data

  1. #1

    Question regarding test data

    Email from Shingo:
    Is there any Hive (HDFS) data available for testing the Hadoop-compatible PDI,
    or should tests be based on data from our own environment?
    Also, what volume of data (gigabyte scale or petabyte scale)
    would you suggest using during testing?

  2. #2


    If you look under the samples\transformations\files directory of your PDI installation, there are a couple of small CSV files.

    For the most part, any delimited file should work fine for the Hadoop Text file input step.

    The Hadoop Copy Files Job Entry should work with any file type.

    Regarding data set sizes, we would love to know how it performs for you using the biggest data sets you have.
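
    If the bundled CSV files are too small for a meaningful run, a larger delimited file can be generated locally. The following is only a rough sketch in Python, not something shipped with PDI: the function name write_delimited_sample, the column names, and the value ranges are all made up for illustration, and the row count is whatever suits your environment.

```python
import csv
import random


def write_delimited_sample(path, rows, delimiter=",", seed=0):
    """Write a header line plus `rows` data rows; return the number of lines written.

    Hypothetical helper for producing test input; the columns are arbitrary.
    """
    rng = random.Random(seed)  # fixed seed so the same file can be regenerated
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter=delimiter)
        writer.writerow(["id", "name", "amount"])
        for i in range(rows):
            writer.writerow([
                i,
                f"customer_{rng.randint(1, 1000)}",
                f"{rng.uniform(0, 500):.2f}",
            ])
    return rows + 1
```

    Calling write_delimited_sample("sample.csv", 1000000) would produce a file of about a million rows; raising the row count scales the size up. The file could then be moved into HDFS with the Hadoop Copy Files Job Entry and read with the Hadoop Text file input step.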



Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.