Hitachi Vantara Pentaho Community Forums

Thread: Modified Java Step and Merge Join

  1. #1
    Join Date
    Mar 2006
    Posts
    8


    New to Pentaho Data Integration (I - probably like many others - came over from SSIS).

    I have two questions I can't find answers for and could use some guidance with.

    First - I am using a Modified Java Script Value step to take some JSON (coming from a MongoDB Input step), grab two values from each document, and turn them into rows so that I can compare them with some data that lives in an Oracle database. I've got the row output working, but I can't seem to suppress the original string in the output, so I end up with:

    <orig string>, val1, val2

    when all I really need is val1, val2.

    I'm sure there is just some setting/property I am missing, but I can't figure out what it is.
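
    To make the goal concrete, here is a minimal sketch in Python (not PDI itself) of the intended result: parse each JSON document, pull out two values, and keep only those two. The field names `val1` and `val2` and the sample documents are hypothetical placeholders, not taken from the actual MongoDB collection.

```python
import json

def extract_pair(doc_json, key1="val1", key2="val2"):
    """Parse one JSON document and return only the two fields of interest.

    key1/key2 are hypothetical field names used for illustration.
    """
    doc = json.loads(doc_json)
    # The original JSON string is deliberately not part of the returned row.
    return (doc.get(key1), doc.get(key2))

# Hypothetical sample documents standing in for the MongoDB Input rows.
docs = [
    '{"_id": 1, "val1": "A", "val2": 10, "other": "x"}',
    '{"_id": 2, "val1": "B", "val2": 20, "other": "y"}',
]

rows = [extract_pair(d) for d in docs]
print(rows)
```

    Each output row carries only the two extracted values; the source string is dropped at the projection step rather than inside the parsing logic.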

    Second - I am then feeding those rows into a Merge Join step that also gets data from a Table Input step running a query against an Oracle system. I am using a full outer join. The data set is controlled right now, so I should be getting full matches, but I am getting a mixture of matches and non-matches, which I think has to do with how the data reaches the merge join.

    Right now this is all in one transformation. Should I be pulling the data and rows in separate transformations and then feeding them to a third where the join is done, or do I need blocking steps to make sure that the data hits the join all at the same time?

    Thanks in advance for any help and advice.

    Matt

  2. #2
    Join Date
    Mar 2011
    Posts
    257


    Hey there. To get rid of fields you no longer need, the best option is a Select Values step: select only the fields you need.
    For the merge join to work properly, all data needs to be sorted! This could explain your mix of matches and non-matches.
    So both the data coming from the database and the data from your Modified Java Script Value step have to be sorted the same way, on the join key. Then the merge join will work like a charm.

    Greetz,
    Hans
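
    Why the sort order matters can be sketched in a few lines of Python. This is a generic single-pass merge join, not PDI's actual implementation; it assumes unique keys and ascending order, and the sample rows are made up for illustration.

```python
def merge_join_full_outer(left, right):
    """Full outer join of two lists of (key, value) pairs in one forward pass.

    Assumes both inputs are sorted ascending by key and keys are unique.
    The algorithm never looks backwards, so unsorted input yields false misses.
    """
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, lv = left[i]
        rk, rv = right[j]
        if lk == rk:
            out.append((lk, lv, rv))    # keys match: emit joined row
            i += 1
            j += 1
        elif lk < rk:
            out.append((lk, lv, None))  # left key has no partner
            i += 1
        else:
            out.append((rk, None, rv))  # right key has no partner
            j += 1
    # Whatever remains on either side is unmatched.
    out.extend((k, v, None) for k, v in left[i:])
    out.extend((k, None, v) for k, v in right[j:])
    return out

sorted_left = [(1, "a"), (2, "b"), (3, "c")]
sorted_right = [(1, "x"), (2, "y"), (3, "z")]
unsorted_right = [(3, "z"), (1, "x"), (2, "y")]  # same rows, wrong order

good = merge_join_full_outer(sorted_left, sorted_right)
bad = merge_join_full_outer(sorted_left, unsorted_right)
print(good)
print(bad)
```

    With both sides sorted, all three keys match. With the same right-hand rows out of order, only key 3 matches and keys 1 and 2 each show up twice as unmatched rows, which is the "mixture of matches and no matches" symptom.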

  3. #3
    Join Date
    Mar 2006
    Posts
    8


    I'm sorting as part of the queries (using the same field and direction :-) ), so that part should be covered. I'm still worried that the data is getting to the merge join at two different speeds, so that when it looks for a match it can't find it. Am I misunderstanding how this works? When I saw the mismatches, I thought "well, maybe the data from Oracle is getting there faster than the Mongo data, which needs to be transformed using the MJS", but I'm not sure if that is correct or just an assumption (and misunderstanding) on my part.

    I can use the Select Values step to get rid of the original string; I hadn't thought of that.
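
    A rough way to test the "different speeds" theory outside PDI is the sketch below. It rests on two assumptions that are not confirmed anywhere in this thread: that a merge join simply waits for the next row from whichever stream it needs, and that two sources can both be "sorted" yet disagree on order (for example case-insensitive versus code-point ordering). The keys and the `slow` helper are hypothetical.

```python
import time

def slow(rows, delay=0.01):
    """Simulate a stream whose rows arrive late (hypothetical helper)."""
    for row in rows:
        time.sleep(delay)
        yield row

def merge_join(left, right):
    """Full outer merge join over two iterators of keys sorted ascending."""
    left, right = iter(left), iter(right)
    l, r = next(left, None), next(right, None)  # blocks until a row arrives
    out = []
    while l is not None or r is not None:
        if r is None or (l is not None and l < r):
            out.append((l, None))
            l = next(left, None)
        elif l is None or r < l:
            out.append((None, r))
            r = next(right, None)
        else:
            out.append((l, r))
            l, r = next(left, None), next(right, None)
    return out

keys = ["a", "b", "c"]
fast_result = merge_join(keys, keys)
slow_result = merge_join(keys, slow(keys))  # one side arrives late

# Same values, two different notions of "sorted".
values = ["apple", "Banana", "cherry"]
case_insensitive_order = sorted(values, key=str.lower)
code_point_order = sorted(values)
mixed = merge_join(case_insensitive_order, code_point_order)
print(fast_result == slow_result)
print(mixed)
```

    In this sketch the slow stream changes nothing, because the join pulls rows only when it needs them. The mismatch appears only when the two sides use different orderings: "Banana" ends up unmatched on both sides even though each input was sorted by its own rules. If that is what is happening here, a Sort Rows step on both streams just before the join would be one way to rule it out.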

    Thanks,

    Matt
