Thread: Exclude first lines of a CSV file

  1. #1
    Join Date
    May 2012
    Posts
    14

    Exclude first lines of a CSV file

    Hi

    I'm working with some "almost CSV" files as input. The only difference between these files and strict CSV is that the first 6 lines contain comments, dates and so on (not structured at all). The real delimited structure only starts after the 6th line.

    Is there a way to ignore the first lines before parsing the CSV structure, or at least a best practice or trick?

    Thanks
    F.

  2. #2
    Join Date
    Apr 2007
    Posts
    1,938

    Is that absolutely always constant?
    I would just use tail on the file before reading it in PDI. If that's absolutely not possible, you could always read it with a CSV input step as one massive string, ignore the first 6 rows, write to another file and then load that file.
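
For reference, on a Unix-like system `tail -n +7 input.txt > clean.csv` prints everything from line 7 onward, which drops the 6 comment lines. Where tail is not available, the same pre-processing can be sketched in Python. This is only an illustration: the function name and the default of 6 skipped lines are assumptions taken from the description above, not anything built into PDI.

```python
from itertools import islice


def strip_leading_lines(src, dst, skip=6):
    """Copy src to dst, dropping the first `skip` lines (like `tail -n +7`)."""
    # newline="" keeps the original line endings untouched.
    with open(src, "r", encoding="utf-8", newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        # islice(fin, skip, None) yields every line after the first `skip`.
        fout.writelines(islice(fin, skip, None))
```

The cleaned file can then be read by a normal CSV input step. This only works if the number of leading lines really is constant.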

  3. #3
    Join Date
    Nov 2008
    Posts
    199

    Good tip by codek.
    Another possible solution, provided the files are consistent:

    - Text file input, with a single field; in the "Content" tab check "Rownum in output" and provide a name
    - Filter rows: rownumber > 6
    - Split fields step: provide a delimiter ("|" in your case) and give the fields a name and the rest of the metadata
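
The logic of these three steps can be sketched outside PDI as follows. It is a rough Python analogy, not what the steps do internally; the function name and the column names (`col_a`, `col_b`, `col_c`) are hypothetical placeholders.

```python
def parse_almost_csv(lines, skip=6, delimiter="|",
                     names=("col_a", "col_b", "col_c")):
    """Number the rows, keep rows with rownumber > skip, split each on the delimiter."""
    rows = []
    # enumerate(..., start=1) plays the role of "Rownum in output".
    for rownum, line in enumerate(lines, start=1):
        if rownum <= skip:
            # Same condition as the filter: only rownumber > 6 passes.
            continue
        # Plain split, as in a split-fields step; quoted delimiters are not handled.
        values = line.rstrip("\r\n").split(delimiter)
        rows.append(dict(zip(names, values)))
    return rows
```

As in the split step, every field name has to be listed by hand.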

    Would it be useful?
    Andrea Torre
    twitter: @andtorg

    join the community on ##pentaho - a freenode irc channel

  4. #4
    Join Date
    May 2012
    Posts
    14

    Hi

    I tried Ato's solution and it's working. It's a bit annoying to have to define each field manually in the "split" step, but it's working great.

    Codek: I like your proposal, but is there a way to do this tail directly in Kettle, or do you mean to "tail" the file before passing it to Kettle?

    Thanks to both
    F.

  5. #5
    Join Date
    Apr 2008
    Posts
    1,758

    When this has come up before, I have suggested making a template file:
    Take a sample file and strip out the lines of offending data, leaving the good headers in place.
    Build your Text File Input step, using these good headers and the sample data.

    Adjust the skip lines and point the Text File Input to your good file.

    It works quite cleanly.
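
As a rough analogy outside PDI, skipping a fixed number of lines and then letting the header row name the fields might look like the Python sketch below. It assumes the first line after the 6 skipped lines is a header row and that "|" is the delimiter; the function name is hypothetical.

```python
import csv
from itertools import islice


def read_skipping_preamble(path, skip=6, delimiter="|"):
    """Skip the unstructured lines, then read the rest as delimited data with a header."""
    with open(path, newline="", encoding="utf-8") as f:
        # Drop the first `skip` lines; the next line is treated as the header.
        data_lines = islice(f, skip, None)
        return list(csv.DictReader(data_lines, delimiter=delimiter))
```

Here the field names come from the file itself, so they do not need to be typed in by hand.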
