Hitachi Vantara Pentaho Community Forums

Thread: File split in PDI

  1. #1
    Join Date
    Jan 2014
    Posts
    2

    File split in PDI

    Hi All,

    I'm new to the Pentaho ETL tool. I have an input file in the format below:

    >>> 100 date
    abcdef
    cdefrg
    ktegdt
    >>> 200 date
    tgstsnhsj
    tshsnsjs
    tshsnsgs
    >>> 300 date
    hdhdhd
    hdhdhd
    dhhdhd

    I would like to split the above file into 3 files, starting a new file at each ">>>" line:
    file 1:
    >>> 100 date
    abcdef
    cdefrg
    ktegdt

    file 2:
    >>> 200 date
    tgstsnhsj
    tshsnsjs
    tshsnsgs

    file 3:
    >>> 300 date
    hdhdhd
    hdhdhd
    dhhdhd

    Could you please help me with a solution?

  2. #2
    Join Date
    Apr 2008
    Posts
    1,771

    I think there are 2 options:
    1. Use a Java step to parse the file and add the same ID to each group of rows, then use a Filter rows step to route each group to its own output.
    2. Use a combination of Filter rows, Group by, and Add sequence steps.
    Sorry, I don't have time to describe this in detail, but at least you've got something to think about :-)
    -- Mick --
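
The tagging logic behind option 1 can be sketched in plain Java. This is only an illustration under stated assumptions, not the code of an actual PDI step: the class `GroupTagger`, the nested `TaggedRow` type, and the rule that a group starts at every line beginning with ">>>" are all hypothetical names and choices made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the PDI API: it only illustrates
// how each row could be tagged with the ID of its group.
class GroupTagger {
    // Assumption: a new group starts at every line beginning with ">>>".
    static final String HEADER_PREFIX = ">>>";

    // A row paired with the ID of the group it belongs to.
    static final class TaggedRow {
        final int groupId;
        final String line;

        TaggedRow(int groupId, String line) {
            this.groupId = groupId;
            this.line = line;
        }
    }

    // Walks the rows in order and bumps the group ID on every header line.
    // Rows that appear before the first header keep group ID 0.
    static List<TaggedRow> tag(List<String> lines) {
        List<TaggedRow> out = new ArrayList<>();
        int groupId = 0;
        for (String line : lines) {
            if (line.startsWith(HEADER_PREFIX)) {
                groupId++;
            }
            out.add(new TaggedRow(groupId, line));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = List.of(
                ">>> 100 date", "abcdef", "cdefrg",
                ">>> 200 date", "tgstsnhsj");
        for (TaggedRow row : tag(input)) {
            System.out.println(row.groupId + "\t" + row.line);
        }
    }
}
```

Once every row carries a group ID like this, a downstream step can filter or route on that field, which is what the Filter rows step in option 1 would be doing.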

  3. #3
    Join Date
    Jun 2012
    Posts
    5,534

    Better yet, let the Text file output (TFO) step do the "filtering" ...
    Attached Files
    So long, and thanks for all the fish.
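
The attached transformation is not reproduced in this thread, so the following is only a guess at the idea: give every row a target file name and let the writer send each row to the file named in that field, so no separate filtering is needed. The sketch below mimics that in plain Java. The class `FileNameFieldSplit`, the methods `route` and `writeAll`, and the `file_N.txt` naming scheme are assumptions made for illustration, not PDI names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for "derive a file name per row, then write by file name".
class FileNameFieldSplit {
    // Assigns every row to a target file name. A line starting with ">>>"
    // opens the next file; the file_N.txt naming is an assumption.
    // Rows before the first header end up in file_0.txt.
    static Map<String, List<String>> route(List<String> lines) {
        Map<String, List<String>> byFile = new LinkedHashMap<>();
        int fileNo = 0;
        for (String line : lines) {
            if (line.startsWith(">>>")) {
                fileNo++;
            }
            String fileName = "file_" + fileNo + ".txt";
            byFile.computeIfAbsent(fileName, k -> new ArrayList<>()).add(line);
        }
        return byFile;
    }

    // Writes each group of rows to its own file inside the given directory.
    static void writeAll(Path dir, Map<String, List<String>> byFile) throws IOException {
        Files.createDirectories(dir);
        for (Map.Entry<String, List<String>> entry : byFile.entrySet()) {
            Files.write(dir.resolve(entry.getKey()), entry.getValue());
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> input = List.of(
                ">>> 100 date", "abcdef", "cdefrg", "ktegdt",
                ">>> 200 date", "tgstsnhsj", "tshsnsjs", "tshsnsgs",
                ">>> 300 date", "hdhdhd", "hdhdhd", "dhhdhd");
        Path outDir = Files.createTempDirectory("pdi_split");
        Map<String, List<String>> routed = route(input);
        writeAll(outDir, routed);
        System.out.println("Wrote " + routed.size() + " files to " + outDir);
    }
}
```

In a transformation, the equivalent would be a field holding the file name on every row, with the output step configured to take its file name from that field.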

  4. #4
    Join Date
    Jan 2014
    Posts
    2

    Thanks!

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.