Hitachi Vantara Pentaho Community Forums
Results 1 to 9 of 9

Thread: The Weekly Kettle Tip: Quick Kettle Pointers

  1. #1
    Join Date
    Nov 1999
    Posts
    9,729

    Default The Weekly Kettle Tip: Quick Kettle Pointers

    This weeks Kettle tip was not written by me or one of the Kettle developers. It comes from people in Daytona Beach (FL) that have been using Kettle for a while now. I'm reposting their experiences and pointers unchanged (with permission) in the hope that it will help others.
    ------------------------------------------------------
    Pointers using Kettle.
    Setting
    We were new to using Kettle and want to use kettle as an ETL tool to load a
    datawarehouse from 50 different sources.

    Mapping
    1st misconception - Assumed mapping meant mapping one field to another (i.e.
    pulling data from a source table and mapping it to a destination table).

    To peform this type of Mapping use the Select Value Step (this allows you to
    extract data from source and pass it on to the successor step (i.e. map
    input field name to output field name)).
    Constant values
    Use of Constant Step - allows you to map a fixed value to an output step
    (i.e. say you have a field STATE and you want to put FL in each record in
    the destination table, you would use a constant step).

    In the constant step you can map many constants/fixed values with one step.
    Table Output
    Not all field in the destination table have to have a value put into them.

    The tool may give you warnings (in the validation step), but its not
    necessary to have every field in the destination have a value being passed
    it.
    Commit size
    To improve performance of transformations, add the value to the commit size
    in the table output step.


    Example:



    If commit size is specified an appropriate size in the "Table Output" step,
    this will cause major boost to performance of a transformation.

    We were testing with 5000 records, using the default size of 0, it took 160 seconds.

    Increasing the size to 1000 made the process run in 3.5 seconds.



    ------------------------------------------------------


    I hope you found these tips useful and if you have others, feel free to post them here, or send them over to me.



    Kind regards,

    Matt

  2. #2
    Join Date
    Sep 2005
    Posts
    1,403

    Default RE: The Weekly Kettle Tip: Quick Kettle Pointers

    Hi Matt, We had a requirement of merging multiple columns into a single column in the output. For instance in the table input, the hour, min and time are stored in separate columns. We need to merge all three together with another column that contains only date and put the result in a single output column. Could you let me know how to go about this? I tried with the Execute SQL Script step with a to_date command, but it didn't work out.
    Thanks, Pradeep

  3. #3
    Join Date
    Sep 2005
    Posts
    1,403

    Default RE: The Weekly Kettle Tip: Quick Kettle Pointers

    Hi Matt,
    Thanks for tips.Questions did align with what we have requirements ,but the answers werent understandable.
    when there is a need to pull data from muliple datasources(tables present in different kind of databases),what is the
    approach a developer is expected to take ?

  4. #4
    Join Date
    Nov 1999
    Posts
    9,729

    Default RE: The Weekly Kettle Tip: Quick Kettle Pointers

    Just make sure that before mixing the rows from the different sources, the layout of the rows are exactly the same. Use "Select Values" steps to do so.
    When in doubt, run in safe mode.

    Matt

  5. #5
    Join Date
    Mar 2007
    Posts
    16

    Default

    Hi, I am really interested to know specifically how Constant Values is done. I am using Kettle 2.50 and I don't see any step called Constant Values. There's an Add constants, but I'm not sure how it works.

    Basically I have a stream of data, when finished processing I would like to update all the columns which are blank or null to be of a fixed value that I would predefine (or hardcode, it doesn't really matter). Right now it's trivial to implement with a single SQL statement, but our end users wouldn't be able to accomplish this. It would be best if they could accomplish the same result using the graphical tools provided in Kettle.

    Thanks in advance for any help you could provide!

  6. #6
    Join Date
    May 2006
    Posts
    4,882

    Default

    Use calculator step with the NVL functionality e.g.

    Regards,
    Sven

  7. #7
    Join Date
    Mar 2007
    Posts
    16

    Default

    Hi,

    Do you mind elaborating? NVL requires two fields. I'm looking to insert a constant. Do I hardcode a value into Field B?

  8. #8
    Join Date
    May 2006
    Posts
    4,882

    Default

    Naa... thinking too much we follow Oracle in functionality

    Attached a transformation which I think does what you want using Value mapper. You can preview each of the steps to see how it works.

    Regards,
    Sven
    Attached Files Attached Files

  9. #9
    Join Date
    Mar 2007
    Posts
    216

    Arrow

    Quote Originally Posted by superdx View Post
    Hi, I am really interested to know specifically how Constant Values is done. I am using Kettle 2.50 and I don't see any step called Constant Values. There's an Add constants, but I'm not sure how it works.

    Basically I have a stream of data, when finished processing I would like to update all the columns which are blank or null to be of a fixed value that I would predefine (or hardcode, it doesn't really matter). Right now it's trivial to implement with a single SQL statement, but our end users wouldn't be able to accomplish this. It would be best if they could accomplish the same result using the graphical tools provided in Kettle.

    Thanks in advance for any help you could provide!
    Hi,

    How "add constants" step works :
    Starting with,
    -"add constants" step input : 2 columns
    And using,
    -"add constants" step configuration : name="myNewColumn" type="String" value="Hello"
    Will give you,
    -"add constants" step output : 3 columns, column "myNewColumn" have all lines with "Hello".

    a+, =)
    -=clément=-

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.