Hitachi Vantara Pentaho Community Forums

Thread: How to union parts of data from multiple files

  1. #1
    Join Date: Dec 2015 | Posts: 5

    How to union parts of data from multiple files

    Hi,
    Please help me solve the following issue.

    For example, I have 2 (or more) CSV files with the following structure:

    file1.txt
    #abcd0001
    12345,1,1
    12345,1,1
    23456,1,1
    12345,1,1
    5678,1,1
    2345,1,1
    #abcd0003
    02345,1,1
    02345,1,1
    34567,1,1
    02345,1,1
    56789,1,1
    23456,1,1

    etc...

    file2.txt
    #abcd0002
    123456,1,1
    123456,1,1
    234567,1,1
    123456,1,1
    56789,1,1
    23456,1,1
    #abcd0003
    02345,1,1
    02345,1,1
    34567,1,1
    02345,1,1
    56789,1,1
    23456,1,1

    etc...

    I need to union these two files into one, something like this:
    result.txt
    #abcd0001
    12345,1,1
    12345,1,1
    23456,1,1
    12345,1,1
    5678,1,1
    2345,1,1
    #abcd0003
    02345,1,1
    02345,1,1
    34567,1,1
    02345,1,1
    56789,1,1
    23456,1,1
    02345,1,1
    02345,1,1
    34567,1,1
    02345,1,1
    56789,1,1
    23456,1,1
    #abcd0002
    123456,1,1
    123456,1,1
    234567,1,1
    123456,1,1
    56789,1,1
    23456,1,1


    #letters_digits
    - this is the name of a section. I need to union the sections that have the same name in different files and write their rows into a single section, line by line.

    Is it possible to do this in PDI?
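
    To pin down the expected behaviour independently of any PDI transformation, the merge described above can be sketched in a few lines of Python. This is only an illustration of the logic, not a PDI solution: the function name `merge_sections` is hypothetical, and it assumes every section starts with a line beginning with `#` and that sections should appear in the order they are first seen.

```python
def merge_sections(texts):
    """Union rows of same-named sections across several file contents.

    Assumption: a line starting with '#' opens a section; all following
    lines belong to it until the next '#' line.
    """
    sections = {}  # header -> rows; dict keeps first-seen order (Python 3.7+)
    for text in texts:
        current = None
        for line in text.splitlines():
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            if line.startswith("#"):
                current = line
                sections.setdefault(current, [])
            elif current is not None:
                sections[current].append(line)
            # rows that appear before any header are dropped

    merged = []
    for header, rows in sections.items():
        merged.append(header)  # each header is written exactly once
        merged.extend(rows)
    return merged
```

    Fed the contents of file1.txt and file2.txt in that order, this yields #abcd0001, then #abcd0003 with the rows of both files, then #abcd0002, which is the section order shown in result.txt.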

  2. #2
    Join Date: Jun 2012 | Posts: 5,534

    Absolutely yes, it's possible.

    The attached demo should get you going.
    Removal of an unwanted group header (#abcd0003) is left as an exercise.
    Shouldn't be too hard if you understand the demo.
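
    As a hint for the exercise, here is the idea in plain Python rather than in PDI steps. It is a rough sketch under an assumption: the rows have already been grouped by section, and a shared section carries its header once per source file. The helper name `drop_repeated_headers` is made up for this illustration.

```python
def drop_repeated_headers(lines):
    """Keep the first occurrence of each '#...' header, drop later repeats.

    Data rows are passed through untouched, so repeated data rows survive.
    Assumption: rows of the same section are already adjacent.
    """
    seen = set()
    kept = []
    for line in lines:
        if line.startswith("#"):
            if line in seen:
                continue  # header already written for this section
            seen.add(line)
        kept.append(line)
    return kept
```

    In a transformation the same effect needs some way to tell header rows from data rows, because only the headers should be de-duplicated.
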
    So long, and thanks for all the fish.

  3. #3
    Join Date: Dec 2015 | Posts: 5

    Oh! Thank you very much, marabu!

    I'm absolutely new to ETL and PDI.
    It's great that you gave me a direction for my training.
    I will try to delete the duplicated rows, probably with the "Unique rows" step.
    I didn't expect to get an answer so quickly. It's cool, thank you!
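
    One caveat worth checking before relying on "Unique rows": as far as I know, that step only compares each row with the one directly before it, so it expects sorted input, whereas "Unique rows (HashSet)" remembers every row it has seen. The Python sketch below only illustrates that difference; the function names are hypothetical and this is not how PDI implements either step.

```python
from itertools import groupby


def unique_adjacent(rows):
    """Collapse runs of identical neighbouring rows (adjacent comparison only)."""
    return [key for key, _group in groupby(rows)]


def unique_hashed(rows):
    """Drop every repeat, wherever it occurs, keeping first-seen order."""
    seen = set()
    kept = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            kept.append(row)
    return kept
```

    Note also that result.txt above keeps repeated data rows such as 12345,1,1, so de-duplicating all rows would change the expected output; only the repeated section header should go.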
