Hitachi Vantara Pentaho Community Forums

Thread: Involuntary trimming with Text file input?

  1. #1
    Join Date
    May 2015
    Posts
    5

    Involuntary trimming with Text file input?

    Hi, I'm a student using Kettle to try to preprocess some txt files and separate them.

    The process works, and it separates the categories just fine. However, Kettle seems to be trimming the longer lines on its own initiative.

    For example, a line that the "Get Fields" dialog reports as 270 characters long is displayed in full there. I've set the field length to 350, and there is still no trimming at that stage. However, both the actual output file and the "Preview rows" dialog show the lines trimmed.

    Anyone have an idea?

    Thanks!

    Regards

  2. #2
    Join Date
    Apr 2008
    Posts
    1,771

    I would not trust the "Preview".
    Create your output file and check that.
    -- Mick --

  3. #3
    Join Date
    May 2015
    Posts
    5

    Quote: Originally posted by Mick_data
    I would not trust the "Preview".
    Create your output file and check that.

    Hi, I did indeed do that, and the output file matches the preview: the strings are getting trimmed for some reason I cannot work out. Would it help if I uploaded the files?

  4. #4
    Join Date
    Apr 2008
    Posts
    4,690

    One of the prompts in the "Get Fields" process asks how many lines to sample.
    If the longest value for the field among those sampled lines is only 270 characters, then PDI will assume that the field is only 270 characters long.

    The "Get Fields" button of the Text File Input is there to HELP you, not do all the work for you.
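
    To illustrate the sampling effect described above, here is a minimal Python sketch. It is not PDI's actual code: the function name `infer_max_length` and the sample data are hypothetical, and it only shows why a length inferred from the first N lines can miss a longer value further down the file.

```python
def infer_max_length(lines, sample_size):
    """Return the longest line length seen in the first `sample_size` lines."""
    sample = lines[:sample_size]
    return max((len(line) for line in sample), default=0)


# Hypothetical data: 100 lines of 270 characters followed by one longer line.
lines = ["x" * 270] * 100 + ["y" * 340]

sampled_length = infer_max_length(lines, sample_size=100)
full_length = infer_max_length(lines, sample_size=len(lines))

print(sampled_length)  # 270: the long line was never sampled
print(full_length)     # 340: scanning every line finds it
```

    Sampling more lines, or setting the length by hand as the original poster did, avoids relying on the guess.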

  5. #5
    Join Date
    May 2015
    Posts
    5

    Hi,

    I have indeed run the process, and the preview is accurate: the strings get cut in the output file too. Would uploading a file help?

    @Gutlez

    As I said, I manually set the field length to 350 instead of 270 to allow for longer strings, and it had no effect. Did I do it wrong somehow?

  6. #6
    Join Date
    Apr 2008
    Posts
    4,690

    Then I think we'd have to see the files to understand what is going on.

    There may be something else going on, like enclosure characters within the string, but having the files will make it easiest to figure out.

  7. #7
    Join Date
    May 2015
    Posts
    5

    Hi again,

    I managed to speak with the professor of the course and showed him the example. It seems there was a \n (a line break) in the middle of a few lines instead of the expected escape character: the lines were too long, and he had hit Enter by mistake. I fixed it with a regex in Notepad++, and now it works.
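
    For anyone hitting the same symptom, stray line breaks like these can be spotted by counting fields per line. The following is a rough Python sketch under stated assumptions: the semicolon delimiter, the three-field layout, the sample text, and the helper name `find_suspect_lines` are all hypothetical, and it assumes the delimiter never appears inside a field.

```python
def find_suspect_lines(lines, delimiter=";", expected_fields=3):
    """Return (line_number, field_count) for lines with an unexpected field count."""
    suspects = []
    for number, line in enumerate(lines, start=1):
        field_count = line.count(delimiter) + 1
        if field_count != expected_fields:
            suspects.append((number, field_count))
    return suspects


# Hypothetical input: the second record was split by an accidental Enter.
text = (
    "1;short text;A\n"
    "2;a long text that was bro\n"
    "ken by a stray Enter;B\n"
    "3;another text;A\n"
)

suspects = find_suspect_lines(text.splitlines())
print(suspects)  # [(2, 2), (3, 2)]
```

    Both halves of the broken record are reported, which shows where to rejoin the lines by hand or with a regex. A break inside the last field would leave the first half looking complete, so only the leftover fragment would be flagged in that case.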

    Sorry for the mistake, and thanks for your help!

  8. #8
    Join Date
    Apr 2008
    Posts
    4,690

    Quote: Originally posted by PIDK
    I managed to speak with the professor of the course

    If PDI is being assigned in a course, the instructor should be one of the first places to ask for support.
    I'm not saying that the forum shouldn't be a support channel... just that details like this \n should already be known by your instructor.

  9. #9
    Join Date
    May 2015
    Posts
    5

    I'm sorry. He is only available for 1 hour a day, and since I could find nothing on Google and that hour was still 4 hours off, I thought I could ask here. I apologize for wasting your time.

  10. #10
    Join Date
    Apr 2008
    Posts
    4,690

    Don't apologise!
    Stick around and show us what you've been able to do - the fact that you found us, and were willing to ask for help is actually a good thing!

    I was just commenting that the instructor should have been one of the first places to ask, and here should likely have been #2. Most of the users here expect that this is inter-user support for professional work, and provide support expecting that the end goal is getting something into production. Course-work will likely never be put into a production system, so it might get a little less priority, but it will still be supported here.


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.