Hitachi Vantara Pentaho Community Forums

Thread: Merge Rows

  1. #1
    Join Date
    Oct 2012
    Posts
    108

    Post Merge Rows

    Hi Team,

    I am using the transformation attached below.

    Is there any limit on the amount of data that the 'Merge Rows' component can process?

    If 'Source' and 'Target' hold a large amount of data, will the transformation take longer?

    Could you please provide some details on this? My 'Source' and 'Target' tables currently have around 1 lakh rows each.

    Thank you in advance.

    Name:  Transformation_Used.jpg

    Regards,
    Naseer.

  2. #2
    Join Date
    Apr 2008
    Posts
    1,771

    Is there any limit on the amount of data that the 'Merge Rows' component can process?
    As far as I know, there is no limit. It all depends on your hardware.

    If 'Source' and 'Target' hold a large amount of data, will the transformation take longer?
    Yes.

    My 'Source' and 'Target' tables currently have around 1 lakh rows each.
    If 1 lakh is 100,000, then it's not a problem at all (unless you're running PDI on a 5-year-old laptop).
    -- Mick --

  3. #3
    Join Date
    Oct 2012
    Posts
    108

    Hi Mick,

    Thank you very much for the detailed reply. It was very helpful.

    I also have one small request for clarification:

    In the 'Merge rows' component, is a complete dump of both the 'Source' and 'Target' tables taken on each run and then merged?
    Is this why the transformation takes more time (e.g. comparing 100,000 rows in 'Source' against 100,000 rows in 'Target')? Is there an alternative that would perform better?

    Thank you in advance.

    Naseer.

  4. #4
    Join Date
    Apr 2008
    Posts
    1,771

    In the 'Merge rows' component, is a complete dump of both the 'Source' and 'Target' tables taken on each run and then merged?
    I don't really understand your question, but I'll give it a try.

    If your two tables are in the same database, you can do the comparison with a SQL query in your Table Input step, which would be quicker.

    Keep in mind that when you work with a database, the speed of your transformations is affected by many different factors: network speed, database configuration, server hardware and so on.
    I'm sure that other users have better suggestions.
    -- Mick --
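
For readers following along, the SQL-side comparison suggested in post #4 might look roughly like the sketch below. It is only an illustration, not the poster's actual query: the table names `source_t` and `target_t`, the columns `id` and `val`, and the flag names are all assumptions, and SQLite is used purely so the example runs anywhere. It treats the target table as the old data and the source table as the new data.

```python
import sqlite3


def diff_tables(conn):
    """Classify rows of the hypothetical tables source_t / target_t by key `id`.

    Assumes `val` is NOT NULL in both tables; NULLs would need extra handling
    because `<>` never evaluates to true when one side is NULL.
    """
    sql = """
        SELECT s.id,
               CASE WHEN t.id IS NULL THEN 'new'
                    WHEN s.val <> t.val THEN 'changed'
                    ELSE 'identical' END AS flag
        FROM source_t s LEFT JOIN target_t t ON t.id = s.id
        UNION ALL
        SELECT t.id, 'deleted'
        FROM target_t t LEFT JOIN source_t s ON s.id = t.id
        WHERE s.id IS NULL
        ORDER BY 1
    """
    return conn.execute(sql).fetchall()


def demo():
    # Tiny in-memory data set standing in for the real tables.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE source_t (id INTEGER PRIMARY KEY, val TEXT NOT NULL);
        CREATE TABLE target_t (id INTEGER PRIMARY KEY, val TEXT NOT NULL);
        INSERT INTO source_t VALUES (1, 'a'), (2, 'b2'), (4, 'd');
        INSERT INTO target_t VALUES (1, 'a'), (2, 'b'), (3, 'c');
    """)
    return diff_tables(conn)


if __name__ == "__main__":
    for row in demo():
        print(row)
```

In PDI itself, only the SELECT statement would go into the Table Input step; the Python wrapper is there just to make the sketch self-contained.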

  5. #5
    Join Date
    Oct 2012
    Posts
    108

    Hi Mick,

    Thank you very much for your reply.

    What I mean is this:

    Whenever we run the transformation, does the 'Merge rows' component
    take a complete dump of both the 'Source' and 'Target' tables and then merge them,
    or
    does it process only the differing rows? The component is called 'Merge rows (diff)', and I am not clear what the 'diff' refers to.
    I just wanted to know what actually happens in 'Merge Rows' and why it is labelled 'diff'.

    Thank you very much in advance for all the replies and help.

    Naseer.

  6. #6
    Join Date
    Apr 2008
    Posts
    1,771

    Hi Naseer,
    From reading http://wiki.pentaho.com/display/EAI/Merge+rows
    I assume that all rows from both input steps are passed to Merge Rows.
    You can easily check this in the log tabs (I think it's Step Metrics), where you can see how many rows each step reads and writes.

    Note that the records must be sorted before they reach the Merge Rows step.
    Sorting inside the "Table Input" step can cause some problems, so be careful!
    -- Mick --
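
To make the 'diff' in the step name more concrete, here is a minimal sketch of how a merge-diff over two sorted inputs can work. It is not PDI's actual implementation; the function name, the dict-based rows and the field names `id` and `val` are assumptions made for illustration. It walks both inputs once, in key order, and labels each key as identical, changed, new or deleted, which is why every row from both sides has to be read and why both sides must be sorted the same way on the key.

```python
def merge_rows_diff(reference, compare, key, values):
    """Yield (key_value, flag) pairs from two row lists sorted ascending by `key`.

    `reference` holds the old rows, `compare` the new rows (illustrative roles).
    """
    ref_it, cmp_it = iter(reference), iter(compare)
    ref, new = next(ref_it, None), next(cmp_it, None)
    while ref is not None or new is not None:
        if new is None or (ref is not None and ref[key] < new[key]):
            # Key exists only in the reference input.
            yield ref[key], "deleted"
            ref = next(ref_it, None)
        elif ref is None or new[key] < ref[key]:
            # Key exists only in the compare input.
            yield new[key], "new"
            new = next(cmp_it, None)
        else:
            # Same key on both sides: compare the value fields.
            same = all(ref[v] == new[v] for v in values)
            yield ref[key], "identical" if same else "changed"
            ref, new = next(ref_it, None), next(cmp_it, None)


def demo():
    reference = [{"id": 1, "val": "a"}, {"id": 2, "val": "b"}, {"id": 3, "val": "c"}]
    compare = [{"id": 1, "val": "a"}, {"id": 2, "val": "b2"}, {"id": 4, "val": "d"}]
    return list(merge_rows_diff(reference, compare, "id", ["val"]))


if __name__ == "__main__":
    for item in demo():
        print(item)
```

If either input is not sorted on the key in the order the comparison expects, a walk like this will mislabel rows as new or deleted. That is one plausible reason for the caution about sorting in the database: its ordering rules (collation, case handling, NULL placement) may not match the comparison used downstream.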


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.