Hitachi Vantara Pentaho Community Forums
Results 1 to 2 of 2

Thread: Transformation Settings

Hybrid View

Previous Post Previous Post   Next Post Next Post
  1. #1
    Join Date
    Sep 2005
    Posts
    1,403

    Default Transformation Settings

    Hi,


    I want to make a Transformation that pulls data from a Relational DB using a Table Input and delivers the rows to an ODS db using a Table Output.



    I am trying to figure out the Spoon.pdf (p. 23) which deals with "Maxdate" options.



    The overall description for the Transformation Settings states this is how you can "define the Date Range for the source table".



    My source table has two audit columns in use a "audit_create_dtm" & "audit_update_dtm", both are timestamps. The create column is Not nullable and has a value for every row. The update column Is nullable and may have a value.



    I usually use an NVL(audit_update_dtm, audit_create_dtm) to figure out which audit column to use in order to see if that row meets the ETL Load Window (data that needs to be pulled and loaded into ODS).



    I also have an "ETL_Load_Window" table that contains the past ETL Start and End Dates. I have used this table to figure out my Load window in the past by getting the last runs End Date and making that the Start Date for the next ETL run.



    The sql which is run for that window then get's adjusted, the "where" clause ends up looking something like:



    (NVL(audit_update_dtm, audit_create_dtm) > TO_DATE('2006/06/06 00:00:01', 'yyyy/MM/dd HH:mm:ss'))



    I am trying to figure out how to achieve a similar concept using the transformation settings "Maxdate" options?



    I am not sure if the Maxdate table would end up being the actual source table or if It would be my "ETL_Load_Window" table?



    Also if it is the the source table how can I use Null Value check as I have two audit columns?



    Any ideas much appreciated.



    Thanks in advance!

  2. #2
    Join Date
    Sep 2005
    Posts
    1,403

    Default RE: Transformation Settings

    This is very close to what I need:


    On the date-range, I advice you to:



    1) use >= and <
    2) use a kettle date range (start & end of date range from get system info)
    You need to use a logging table to set actually use this.



    This way you never lose data, even if the transformation could not be run for several days.



    I f I have a transformation that uses the get system info can I then set the Start & End for the ETL window into variables that are "root job" visible and use the variables in my Where Clause?



    I think so ...



    Thanks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.