Hitachi Vantara Pentaho Community Forums
Results 1 to 8 of 8

Thread: Insertion and updation problem how can i acheive this?

  1. #1

    Default Insertion and updation problem how can i acheive this?

    parse out patient information from Record_id segment to populate the Record_table
    1. If the Record_id is new, a new entry should be added
    2. Else the Record_d entry should be updated.

  2. #2
    Join Date
    May 2014
    Posts
    358

    Default

    There's an Insert/update step that does exactly what you want (for small data sets at least).

  3. #3

    Default

    ok any another method in pentaho for this..

  4. #4
    Join Date
    Apr 2008
    Posts
    1,771

    Default

    Filter records in 2 streams: 1. New records 2. Existing records.
    Then send stream 1 to a Table Output step and stream 2 to an Update step
    -- Mick --

  5. #5
    Join Date
    Aug 2015
    Posts
    313

    Default

    and you can use synchronization after merge with merge rows(diff), i hope it is also good one.

  6. #6
    Join Date
    Aug 2011
    Posts
    360

    Default

    Yep, or load data in a staging table with table output or best a bulk loader (should truncate the stagging table first)
    Then execute two SQL scripts:
    One which update the target table with existing record (inmer join)
    One wich then insert the records that are new (left outer join keeping only non natching records)

    You can even detect deleted records with a right outer join if you have the full data.

    If you have lot lot of records and an index on the key fields, this should be
    the fastest way to do it

  7. #7
    Join Date
    Mar 2015
    Posts
    190

    Default

    Quote Originally Posted by Mick_data View Post
    Filter records in 2 streams: 1. New records 2. Existing records.
    Then send stream 1 to a Table Output step and stream 2 to an Update step

    Mike, I am trying to understand your logic : in this way are you suggesting ? if yes i am facing problem here, in my incremental load / full load data processing i must and should use ID comparison , if my ID matches in source and target table then it should go to update step and if not matches then it goes to Table output why because it is new record.
    but my problem is when i use ID(data type is INTEGER) comparison then it is throwing error, what is the integer return value when they compared ? Please suggest me. Could you please post sample transformation

    Name:  filter.png
Views: 41
Size:  9.8 KB ---------------------------------------- Name:  filter --.png
Views: 42
Size:  11.5 KB

    i am using another approach Table input --> Update (error handling) --> Table output . ... loading properly.

    Name:  dm.png
Views: 40
Size:  6.4 KB

  8. #8
    Join Date
    Apr 2008
    Posts
    1,771

    Default

    Hi,

    the transformation that I would create (note that there could be better alternatives, as suggested in other posts) is like:
    Table Input for new data T1.
    Table Input for existing data T2.
    Both of them linked to a Stream Lookup step. Use T2 as the lookup and retrive a flag or a field which is always populated.
    From the Stream Lookup you can use the field from T2 to split records from T1 that matched and records that did not match.
    You now have 2 streams and you can direct one to update step and the other to table output.
    -- Mick --

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.