Hitachi Vantara Pentaho Community Forums

Thread: comparing content of two files

  1. #1
    Join Date: Nov 2011
    Posts: 22

    Comparing content of two files

    Hi,

    I am relatively new to Pentaho and would really appreciate any help with this. I am working in Spoon and was wondering whether there is a way to compare the contents of two CSV files containing fields such as firstname, middlename, and lastname, so that the output lists the names that appear in both files. I know there is File Compare, which compares two files, and Merge Rows (diff). Is there any way to get the names that are present in both files written to an output file?

    Also, is there a way to compare more than two files at once?

    Thanks in advance.

  2. #2
    Join Date: Jun 2011
    Posts: 102

    Hi, you can use Merge Rows (diff) to compare your files. You must sort both files on the same fields, and be aware that the two files must have the same number of columns (you can use a "Select values" step to keep only the fields you need).
    Merge Rows (diff) adds a flag field at the end of the data stream (it says whether a record is identical, new, or deleted): add a "Filter rows" step and keep only the "identical" records.
    Sorry for my bad English, hope that helps,
    Andrea
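The sort, diff, and filter sequence described above can be sketched outside Spoon to show the logic. This is only an illustration in plain Python, not Pentaho code: the sample CSV text, the `read_sorted` and `merge_diff` helpers, and the choice of all three name fields as the comparison key are assumptions made for the example.

```python
import csv
import io

# Assumed key: all three name fields, in the sort order suggested in the thread.
KEY_FIELDS = ("lastname", "middlename", "firstname")

def read_sorted(text):
    """Parse CSV text and return one key tuple per row, sorted on KEY_FIELDS."""
    rows = csv.DictReader(io.StringIO(text))
    return sorted(tuple(row[f].strip() for f in KEY_FIELDS) for row in rows)

def merge_diff(reference, compare):
    """Walk two sorted lists in step and flag each key (a sorted-merge diff)."""
    i = j = 0
    out = []
    while i < len(reference) and j < len(compare):
        if reference[i] == compare[j]:
            out.append((reference[i], "identical"))
            i += 1
            j += 1
        elif reference[i] < compare[j]:
            out.append((reference[i], "deleted"))  # only in the reference file
            i += 1
        else:
            out.append((compare[j], "new"))  # only in the compare file
            j += 1
    out.extend((key, "deleted") for key in reference[i:])
    out.extend((key, "new") for key in compare[j:])
    return out

# Hypothetical sample data standing in for the two CSV files.
file_a = "firstname,middlename,lastname\nAnn,B,Clark\nJohn,Q,Smith\nMary,,Jones\n"
file_b = "firstname,middlename,lastname\nJohn,Q,Smith\nZoe,K,Adams\nAnn,B,Clark\n"

flags = merge_diff(read_sorted(file_a), read_sorted(file_b))

# Equivalent of the "Filter rows" step: keep only the identical records.
matches = [key for key, flag in flags if flag == "identical"]
print(matches)
```

The in-step walk only works because both inputs are sorted on the same key, which is why the sort has to come before the diff.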

  3. #3
    Join Date: Nov 2011
    Posts: 22

    Thank you for the reply. How would I sort the files when the information in them is names? Also, is there any way to compare the contents of more than two files? For example, if I have three files, A, B, and C, I would like to compare the contents of file A with both B and C and get the same kind of output I described earlier. Can we do something like this with Pentaho?
    Last edited by malavika; 11-16-2011 at 05:10 PM.

  4. #4
    Join Date: Jun 2011
    Posts: 102

    Hi,
    Comparing more than two files at once is not currently possible; however, in the same transformation you can compare A with B and A with C, and then compare the results of the two streams.
    To use Merge Rows (diff) you need to sort both streams, so order by lastname, middlename, firstname.
    If you can provide some examples, I'll have a look.
    Bye,
    Andrea
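The three-file approach above (A with B, A with C, then the two results with each other) can be sketched as follows. This is an illustration in plain Python, not a Pentaho transformation: the sample rows and the `names_in_both` helper are hypothetical, and set intersection stands in for the "identical" rows of a diff, so no sorting is needed before comparing.

```python
def names_in_both(left, right):
    """Names present in both inputs, i.e. the rows a diff would flag "identical"."""
    return sorted(set(left) & set(right))

# Hypothetical (lastname, middlename, firstname) rows for files A, B and C.
file_a = [("Clark", "B", "Ann"), ("Jones", "", "Mary"), ("Smith", "Q", "John")]
file_b = [("Adams", "K", "Zoe"), ("Clark", "B", "Ann"), ("Smith", "Q", "John")]
file_c = [("Clark", "B", "Ann"), ("Jones", "", "Mary")]

a_and_b = names_in_both(file_a, file_b)  # first comparison: A with B
a_and_c = names_in_both(file_a, file_c)  # second comparison: A with C

# Comparing the two result streams gives the names found in all three files.
in_all_three = names_in_both(a_and_b, a_and_c)

# If the goal is instead "names in A that appear in B or C", combine the results.
in_b_or_c = sorted(set(a_and_b) | set(a_and_c))

print(in_all_three)
print(in_b_or_c)
```

Which of the two final results is wanted depends on whether a name has to appear in every file or only in A plus at least one of the others; the thread does not say.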

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.