Hitachi Vantara Pentaho Community Forums

Thread: CSV encoding issue

  1. #1
    Join Date
    Sep 2008
    Posts
    25

    CSV encoding issue

    Hello, I am working with some very large files (40 million rows per file) and am trying to read the input in parallel. I first tried the Text file input step, but the number of records was multiplied by the number of step copies I started.

    I then switched to the CSV file input step and ran it in parallel, but it does not use the encoding I select. This looks similar to JIRA issue PDI-1306, but I am on a build that should already have this resolved (Spoon version 3.1.0, build version 826).

    Does anyone have any suggestions as to why it would not work correctly, or any suggestions for a workaround?

    Thanks,
    Pete

  2. #2
    Join Date
    Sep 2008
    Posts
    25

    As a test I updated to Kettle 3.2 (build version 3.2.0-GA), and the CSV file input step still appears to have the problem. It uses the platform default encoding (ISO 8859-1 on Windows, UTF-8 on Linux); regardless of which encoding I select, it still uses the default.
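
    For anyone unsure what this symptom looks like in the data, here is a small sketch. It is plain Python, not Kettle code, and the sample value is made up; it only shows what happens when bytes written as UTF-8 are decoded with ISO 8859-1 instead of the encoding that was asked for.

```python
# Hypothetical sample value; any text with non-ASCII characters shows the effect.
original = "café,Zürich"

# Bytes as they would be stored in a UTF-8 encoded CSV file.
utf8_bytes = original.encode("utf-8")

# Decoding with the encoding the file was written in round-trips cleanly.
correct = utf8_bytes.decode("utf-8")

# Decoding the same bytes as ISO 8859-1 does not raise an error;
# every byte maps to some character, so the text is silently garbled.
garbled = utf8_bytes.decode("iso-8859-1")

print(correct)
print(garbled)
```

    Because the wrong decoding never fails outright, the problem only shows up as corrupted characters in the output rows, which fits the behavior described here.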

  3. #3
    Join Date
    Nov 1999
    Posts
    9,729

    Please be so kind as to file a bug report with a test case.
    Thank you!

    Matt
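
    One possible way to prepare a small input file for such a test case is sketched below. This is only an illustration in Python: the function name `write_test_csv`, the file name in the usage note, and the sample rows are all hypothetical, and nothing here comes from the Kettle code base. The idea is to write the same few rows in a known encoding so that the file can be attached to a bug report together with the transformation.

```python
import csv
from pathlib import Path


def write_test_csv(path, encoding):
    """Write a tiny CSV with non-ASCII values in the given encoding.

    Returns the raw bytes that ended up on disk so they can be inspected.
    """
    # Hypothetical sample rows; the accented characters are what make
    # an encoding mismatch visible when the file is read back.
    rows = [["id", "name"], ["1", "café"], ["2", "Zürich"]]
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding=encoding, newline="") as f:
        csv.writer(f).writerows(rows)
    return Path(path).read_bytes()
```

    Calling `write_test_csv("encoding_test.csv", "utf-8")` on a machine whose default encoding is ISO 8859-1 (or the reverse) gives a file whose encoding differs from the platform default, which is the situation described in this thread.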

