Hitachi Vantara Pentaho Community Forums
Results 1 to 4 of 4

Thread: How shall upload always modified and new inserted records from source to destination

  1. #1
    Join Date
    May 2007
    Posts
    9

    Question How shall upload always modified and new inserted records from source to destination

    I have read all the Kettle related doc but not found any content to look into my issue.

    I have two small databases MySouceDB and MyDestDB and have a table called EMPLOYEE. I want to transfer always only modified and new inserted records from source to the destination database table.

    I have created a transformation but that have only 2 options

    option1: Whether I have to truncate my destination table always and then transform data

    Option2: OR always i have to insert all the records in destination without truncating so in this case i will have duplicat records.

    I have limitation that i can not truncate data always for transformation because the data is huge and it takes lot of time if i truncate and then pump the data.

    Is there any option where i can only transfer the modified and new inserted records to the destination table?

  2. #2

    Default

    Hi Kumar,

    I do incremental update (only get new & modified records).
    For example, for a table CUSTOMER i added 2 more columns (Create_date & Modified_Date)
    Each time record is changed, a trigger update Modified_Date.

    When i extract i simply filteron that column.

    SELECT .... FROM CUSTOMER WHERE Modified_Date >

    At this first extract stored_date=01/01/1900 and after each update of the destination table, i update stored_date= getdate() (or sysdate).

    Another way is to extract all data and use Insert/update or Dimension Update....But it fit small table :-)

    Samatar

  3. #3
    Join Date
    May 2007
    Posts
    9

    Default

    Thanks for your kind response. I am agree with havinf modified column in source table.

    But i have ine concern here lets we use a select statment like

    Select customer_code, customer_name,....From Customer Where modified_date >= 01/01/2005.

    at the time of transformation how will I check that the particulare records needed to be changed and particulare record is needed to Insert intot he table.

    please suggest.

  4. #4

    Default

    When i run the full update process i save the stored_date (in fact actual date).
    Next time, i extract all record that have MODIFIED_DATE > stored_date (real modified records + new records) and after the process, i update stored_date = sysdate, etc.


    In the destination Db, i use Insert/Update step . From 2.5, you can use Table output + Update step (by using error handling)

    Rgds

    Samatar

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.