Hitachi Vantara Pentaho Community Forums
Thread: Modifications to 'Merge Join' Transform

  1. #1
    Join Date
    Sep 2007
    Posts
    3

    Modifications to 'Merge Join' Transform

    Currently this transform has 2 inputs, as follows:

    'First Step' and 'Second Step'. My assumption is that this transform allows only 2 input steps wherever it is used. However, in some scenarios one would like to feed in more than 2 Table Inputs, or any other inputs, from different database connections.

    Understandably, this would be a bit of an overhead for the engine, because the join would be performed by the Pentaho server. I am not sure if I am making my point clearly.

    Chat later. Cheers.

  2. #2
    Join Date
    May 2006
    Posts
    4,882

    Request denied ... if you want to use that, use multiple merge joins in sequence. You can also enter feature requests at http://jira.pentaho.org

    For your request, I have no clue what the semantics would be for a merge join with more than 2 inputs. I think your request is not doable.

    Regards,
    Sven
    Last edited by sboden; 09-18-2007 at 05:45 PM.

  3. #3
    Join Date
    Sep 2007
    Posts
    3

    It's cool, sboden, I was not fighting. I understand. In any case, the aim of the request was to try to reduce the number of sequential Merge Joins that one has to use.

  4. #4
    Join Date
    May 2006
    Posts
    4,882

    It may sound good... but for merge join it wouldn't even work. The trick of merge join is that you have 2 sorted inputs and you advance through either one or the other until you reach the end. With, e.g., 3 inputs, what's a changed record on output?
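
    The two-input advance described above can be sketched roughly as follows. This is only an illustration in Python, not the Pentaho Merge Join source: the function name `merge_join`, the tuple row layout, and the inner-join behaviour are all assumptions made for the example.

```python
def merge_join(left, right, key=lambda row: row[0]):
    """Inner join of two inputs that are ALREADY sorted by key.

    Hypothetical sketch: rows are tuples whose first field is the join key.
    """
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = key(left[i]), key(right[j])
        if lk < rk:
            i += 1          # left is behind: advance the left input
        elif lk > rk:
            j += 1          # right is behind: advance the right input
        else:
            # Keys match: find the run of equal keys on the right...
            j_end = j
            while j_end < len(right) and key(right[j_end]) == lk:
                j_end += 1
            # ...and pair it with every left row carrying the same key.
            while i < len(left) and key(left[i]) == lk:
                for r in right[j:j_end]:
                    out.append(left[i] + r[1:])
                i += 1
            j = j_end
    return out


first_step = [(1, "a"), (2, "b"), (4, "d")]
second_step = [(2, "x"), (2, "y"), (3, "z"), (4, "w")]
print(merge_join(first_step, second_step))
```

    Each comparison decides which of exactly two cursors moves forward, which is why both inputs have to arrive sorted on the join key.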

    Regards,
    Sven

  5. #5
    Join Date
    Sep 2007
    Posts
    3

    My thinking was the following.
    The main driver would be the following definition.

    1. Define your main Input source to drive the Condition or Where Clause
    2. Define all other Inputs. This is where my theory comes into play: the 2nd input can be exploded into many inputs.
    3. Define your join conditions

    For Example
    Main Input : Source1
    Other Inputs : Source2, Source3, Source4
    Condition : Source1.ID = Source2.ID
    and Source1.ID = Source3.ID
    and Source1.ID = Source4.ID

    Resolution
    Advance through 'Main Input' for all Inputs in 'Other Inputs' using join specified in condition statement. In this case the advance would apply only to 'Main Input'.
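
    For comparison, the Source1..Source4 example above can also be expressed with ordinary two-input joins applied in sequence, which is the workaround suggested earlier in the thread. The sketch below is an assumption-laden illustration in Python, not Pentaho code: `join_two`, `join_in_sequence`, and the tuple rows with a unique ID in the first field are all hypothetical.

```python
from functools import reduce


def join_two(left, right):
    """Inner merge join of two inputs sorted by ID (first tuple field).

    Simplifying assumption for this sketch: IDs are unique within each input.
    """
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i][0] < right[j][0]:
            i += 1
        elif left[i][0] > right[j][0]:
            j += 1
        else:
            out.append(left[i] + right[j][1:])
            i += 1
            j += 1
    return out


def join_in_sequence(main_input, other_inputs):
    """Chain two-input joins: ((main JOIN s2) JOIN s3) JOIN s4 ..."""
    return reduce(join_two, other_inputs, main_input)


source1 = [(1, "a"), (2, "b"), (3, "c")]
source2 = [(1, "p"), (3, "r")]
source3 = [(1, "x"), (2, "y"), (3, "z")]
source4 = [(3, "k")]
print(join_in_sequence(source1, [source2, source3, source4]))
```

    Because every intermediate result stays sorted by ID, the output of one join can be fed straight into the next, so a chain of joins gives the same rows as the condition Source1.ID = Source2.ID and Source1.ID = Source3.ID and Source1.ID = Source4.ID.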

    Granted, I have not yet looked at the Merge Join source code to make a better case for my point. I will download it today and have a look, and maybe I will then see things your way.

    Nevertheless, thanks for taking the time to evaluate the possibility.

  6. #6
    Join Date
    May 2006
    Posts
    4,882

    So for now, use merge joins in sequence. Personally, I would also only allow 2 hops (as it is now) to keep it as simple as possible, both from a GUI perspective and source-code wise... you will see when you look at the code ;-)

    Regards,
    Sven


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.