Hitachi Vantara Pentaho Community Forums

Thread: non fixed-length file parsing

  1. #1
    Join Date
    Aug 2007
    Posts
    1

    Default non fixed-length file parsing

    Please correct me if I am wrong:
    to my understanding, this ETL tool can process flat files that are of fixed length.
    I have a file which is fixed-length but has repetitive headers and footers, along with some strings in between. It needs to be cleaned up before being processed by the ETL tool. How would I handle such text files? Please guide me if this tool provides a solution for that.

    Thanks & Regards

    - Ashis

  2. #2
    Join Date
    May 2006
    Posts
    4,882

    Default

    If they are all fixed-width lines, pre-process the file using PDI: read each whole line into a single field and filter out the lines you don't want.

    Then process the resulting files again with PDI to do the actual split into fields.

    PDI can handle a header at the top and a footer at the bottom, but not more complex layouts where headers appear somewhere in between the real data lines.
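
    The two passes described above can be sketched outside PDI as follows. This is only an illustration in Python, not PDI configuration: the marker prefixes, the field widths, and the sample lines are all assumptions made up for the example, and in PDI the same logic would live in separate steps or transformations.

```python
from typing import Iterable, List, Sequence

# Hypothetical prefixes of the repeated header/footer lines; adjust to the real file.
NOISE_PREFIXES = ("REPORT", "PAGE")


def keep_data_lines(lines: Iterable[str]) -> List[str]:
    """Pass 1: treat each line as one field and drop headers, footers, and blanks."""
    kept = []
    for line in lines:
        text = line.rstrip("\r\n")
        if not text.strip():
            continue  # blank line between blocks
        if text.startswith(NOISE_PREFIXES):
            continue  # repeated header or footer
        kept.append(text)
    return kept


def split_fixed_width(line: str, widths: Sequence[int]) -> List[str]:
    """Pass 2: cut a cleaned line into fields of the given widths."""
    fields = []
    start = 0
    for width in widths:
        fields.append(line[start:start + width].strip())
        start += width
    return fields


# Made-up sample: id (4 chars), name (10 chars), amount (4 chars).
raw = [
    "REPORT  CUSTOMER LIST",
    "0001" + "John".ljust(10) + "0042",
    "PAGE 1",
    "REPORT  CUSTOMER LIST",
    "0002" + "Mary".ljust(10) + "0107",
    "PAGE 2",
]
rows = [split_fixed_width(line, (4, 10, 4)) for line in keep_data_lines(raw)]
print(rows)
```

    Keeping the filter and the split as separate passes mirrors the suggestion above: the second pass only ever sees clean, uniform lines.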

    Regards,
    Sven
    Last edited by sboden; 08-21-2007 at 09:09 AM.

  3. #3
    Join Date
    Jun 2007
    Posts
    138

    Default More information

    Can you elaborate more on this?

    How can I put a fixed header into the file output, if that is possible?

    The same query is posted here...

    http://forums.pentaho.org/showthread...ghlight=HEADER
    Regards,
    kedar.mehta@tcs.com ,
    Tata consultancies Ltd

  4. #4
    Join Date
    May 2006
    Posts
    4,882

    Default

    Read the file with a text file input step, using 1 field which is as long as your input record. Use e.g. a JavaScript step to filter out any lines you don't want (e.g. lines containing a specific indicator, or lines having spaces in the wrong places).

    Possibly split the 1 file up into multiple files, each holding all the data of the same type.

    Reread the files and then parse them properly.
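
    As a rough sketch of the filtering and splitting described above, written in Python rather than as a PDI JavaScript step: the record length, the type codes in column 1, the "column 2 must not be blank" rule, and the sample lines are all hypothetical and would need to be replaced with the real layout.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

RECORD_LENGTH = 12        # assumed fixed record length
TYPE_CODES = {"H", "D"}   # assumed record-type codes found in column 1


def is_wanted(line: str) -> bool:
    """Keep only lines that look like real records."""
    if len(line) != RECORD_LENGTH:
        return False  # wrong length, so not a record
    if line[0] not in TYPE_CODES:
        return False  # no known type indicator
    # Assumed rule: column 2 is never blank in a real record.
    return line[1] != " "


def group_by_type(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Collect wanted lines per record type (one list stands in for one output file)."""
    groups = defaultdict(list)
    for raw in lines:
        line = raw.rstrip("\r\n")
        if is_wanted(line):
            groups[line[0]].append(line)
    return dict(groups)


# Made-up sample input.
sample = [
    "H20070821ABC",   # kept: type H
    "D00010001050",   # kept: type D
    "== PAGE 1 ==",   # dropped: unknown type indicator
    "D 0010001050",   # dropped: space in the wrong place
    "D00020000200",   # kept: type D
    "short",          # dropped: wrong length
]
groups = group_by_type(sample)
print(groups)
```

    Each list could then be written to its own file and reread with the field layout that fits that record type.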

    Regards,
    Sven

  5. #5
    Join Date
    Jun 2007
    Posts
    138

    Default Fix Header in the file output

    But what if we want to have a header in the File Output step?

    Is it possible?
    Regards,
    kedar.mehta@tcs.com ,
    Tata consultancies Ltd


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.