Hitachi Vantara Pentaho Community Forums

Thread: Join Data from Different Sources

  1. #1
    Join Date
    Nov 2013
    Posts
    15

    Default Join Data from Different Sources

    I am trying to join transformed data from different sources (mostly Excel files and the "Get-Date" function) to create a final file with a custom layout of fields and data. The Excel files have different numbers of columns and no fields in common. I tried "Merge Join" after sorting the rows, but the columns in the final file are misaligned (example: the first column has 10 rows of data, the second column has 5 rows of data that start where the last row of the first column ends, etc.). Any idea how I can solve the problem?

    [Attached image: fields.jpg]


    Thank you in advance

    Best Regards
    Alexios

    (pdi-ce-4.4.0-stable)
    Last edited by Ksaderfos; 11-26-2013 at 10:14 AM.

  2. #2
    Join Date
    Nov 2008
    Posts
    777

    Default

    What is the join key? Is there a common field in each source? Is it just the row number?
    pdi-ce-4.4.0-stable
    Java 1.7 (64 bit)
    MySQL 5.6 (64 bit)
    Windows 7 (64 bit)

  3. #3
    Join Date
    Nov 2013
    Posts
    15

    Default

    Thank you for your reply. I don't have a common field. I tried adding an "Add sequence" step to every file's stream so I could match on it, but the result was the same (maybe I am doing something wrong in that step).
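
    For readers following along, here is a minimal sketch in plain Python (not PDI) of what numbering the rows of each stream and then joining on that number is meant to produce. The names `add_sequence`, `full_outer_join_on_seq`, and the sample values are hypothetical and exist only for illustration; nothing here is taken from the poster's transformation.

```python
def add_sequence(rows, start=1):
    """Attach a running row number to each row, similar in spirit to a sequence field."""
    return [(i, row) for i, row in enumerate(rows, start)]


def full_outer_join_on_seq(left, right):
    """Full outer join of two (seq, value) streams on the seq key."""
    left_by_seq = dict(left)
    right_by_seq = dict(right)
    keys = sorted(set(left_by_seq) | set(right_by_seq))
    # A missing side is filled with None, like a null field in the output row.
    return [(k, left_by_seq.get(k), right_by_seq.get(k)) for k in keys]


names = ["Ann", "Bob", "Cy"]  # hypothetical column from one Excel file
amounts = [10, 20]            # hypothetical column from another Excel file

joined = full_outer_join_on_seq(add_sequence(names), add_sequence(amounts))
for row in joined:
    print(row)
```

    Because both streams carry the same key values (1, 2, 3, ...), rows with the same number end up on the same output line, and the shorter stream is padded with nulls. If the result still looks staggered, one thing worth checking is whether the join is really keyed on the sequence field on both sides.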

  4. #4
    Join Date
    Nov 2008
    Posts
    777

    Default

    Can you post a simplified transformation and supporting test data files?

  5. #5
    Join Date
    Nov 2013
    Posts
    15

    Default

    I am attaching the transformation file. It needs to be extracted to the C:\ directory to work.

    ELIXIR.rar

  6. #6
    Join Date
    Apr 2008
    Posts
    4,696

    Default

    That's REALLY not a simplified example...

    Mock up the data that you will expect at Sort Rows 3, Sort Rows 4 and Sort Rows 6 and then do your Merge Joins.
    When those work, then build it back.
    **THIS IS A SIGNATURE - IT GETS POSTED ON (ALMOST) EVERY POST**
    I'm no expert.
    Take my comments at your own risk.

    PDI user since PDI 3.1
    PDI on Windows 7 & Linux

    Please keep in mind (and this may not apply to this thread):
    No forum member is going to do your work for you. We will help you sort out how to do a specific part of the work, as best we can, in the timelines that our work will allow us.
    Signature Updated: 2014-06-30

  7. #7
    Join Date
    Nov 2008
    Posts
    777

    Default

    Your example is indeed NOT simple. And in no way does it depict the 3-column table and the problem you wanted to solve in your post #1.

    In addition, I think you are misunderstanding what a Merge Join step actually does. You have specified a FULL OUTER join but have no common keys, so the two row streams are essentially appended, more or less. Perhaps the Cartesian Join step is what you are looking for? Simplify your example files and maybe we can still help.
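
    To illustrate the point about keys, here is a rough sketch in plain Python (not PDI) contrasting a full outer join whose keys never match with a Cartesian product. The function `full_outer_join` and the sample values are hypothetical, and the row order of a real sorted merge would depend on the key values, so treat this as a picture of the shape of the output only.

```python
from itertools import product


def full_outer_join(left, right):
    """Naive full outer join of two lists, using each value as its own key."""
    out = []
    matched_right = set()
    for l in left:
        hits = [i for i, r in enumerate(right) if r == l]
        if hits:
            for i in hits:
                matched_right.add(i)
                out.append((l, right[i]))
        else:
            out.append((l, None))  # left row with no partner
    # Right rows that never matched come out with a null left side.
    out += [(None, r) for i, r in enumerate(right) if i not in matched_right]
    return out


left = ["a1", "a2", "a3"]  # hypothetical stream with no field in common...
right = ["b1", "b2"]       # ...with this one

staircase = full_outer_join(left, right)
cartesian = list(product(left, right))

for row in staircase:
    print(row)
print(len(cartesian), "rows in the Cartesian product")
```

    With no matching keys, every left row appears with an empty right side and every right row with an empty left side, which is the staggered layout described in post #1. The Cartesian product instead pairs every left row with every right row, so it only gives a side-by-side layout when one of the streams has a single row.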

  8. #8
    Join Date
    Nov 2013
    Posts
    15

    Default

    I had no intention of making things more complex, but I thought that if you saw the whole picture you might understand the transformation better. I will try to keep only one part to keep it simple, and I will re-read my Pentaho book about the Merge Join and Cartesian Join steps. I am just getting started with Pentaho, so please forgive my poor explanation of some of the functions I use.

  9. #9
    Join Date
    Apr 2008
    Posts
    4,696

    Default

    Nope - no apologies needed!
    You're doing well; you just need to examine a few specific parts.

    Keep with it, and you'll likely have Darrell, Marabu, Joao, Mick, and myself all here helping you on things. There's a LOT of smarts on this board, and you can be part of those smarts!
    Last edited by gutlez; 11-26-2013 at 08:06 PM.


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.