Hitachi Vantara Pentaho Community Forums

Thread: JOBS: change a parameter in the call by terminal

  1. #1
    Join Date
    Nov 2015
    Posts
    15

    Default JOBS: change a parameter in the call by terminal

    Hi everyone!
    I'm currently using Pentaho Kettle to load data from Hive to MySQL.
    The problem is that I make these changes through the Spoon GUI, but now I need to do it from the terminal.
    I was looking at Kitchen and how it loads jobs, but I need to change some parameters from the terminal before running the job (not through the interface as usual), and I don't see anything about that in Kitchen.
    So, does anyone know something, or have a guide/tutorial, about changing a parameter of a JOB from the terminal?
    Thank you so much.

  2. #2
    Join Date
    Nov 2008
    Posts
    271

    Default

    Hi,
    a couple of options:
    1) use a variable in a bash (or batch, on Windows) script that calls the kitchen launcher
    2) write a wrapper Java class that runs the job via the API

    Here is a trivial example with a wrapping shell script:

    Code:
    #!/bin/sh
    
    
    for PARAM_VALUE in 1 2 3
    do
        /path/to/pdi/kitchen.sh -file=/path/to/job.kjb -param:my_parameter=$PARAM_VALUE
    done
    where my_parameter is defined in the job.

    HTH
    Last edited by Ato; 11-16-2015 at 12:58 PM. Reason: add bash example
    Andrea Torre
    twitter: @andtorg

    join the community on ##pentaho - a freenode irc channel

  3. #3
    Join Date
    Nov 2015
    Posts
    15

    Default

    Quote Originally Posted by Ato View Post
    Hi,

    2) write a wrapper Java class that runs the job via the API

    Here is a trivial example with a wrapping shell script:

    Code:
    #!/bin/sh
    
    
    for PARAM_VALUE in 1 2 3
    do
        /path/to/pdi/kitchen.sh -file=/path/to/job.kjb -param:my_parameter=$PARAM_VALUE
    done
    where my_parameter is defined in the job.

    HTH
    Thank you so much, Andrea Torre! That seems to work, but I have one more problem (I hope the last one): I need to connect to a remote Hive host.
    Is it possible to make that connection in the same call by adding some parameter, or do I need to open the job in the Spoon GUI and apply a default connection? (I'm not sure that second option would work.)
    Thank you for your time!
    Last edited by Alsrroum; 11-16-2015 at 06:29 PM. Reason: Thank you for your time!

  4. #4
    Join Date
    Apr 2008
    Posts
    4,696

    Default

    If the steps support variables in the configurable places that you want (eg. hostname, username, password), then you can use parameters, and in your shell script, you can just keep adding -param: items to configure them.
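
    For the remote Hive case, a minimal sketch, assuming hypothetical parameter names (HIVE_HOST, HIVE_PORT, HIVE_USER); they would have to be declared as named parameters on the job and referenced as ${HIVE_HOST} etc. in the Hive connection fields:

    Code:
    #!/bin/sh
    # Sketch only: HIVE_HOST, HIVE_PORT and HIVE_USER are made-up names;
    # declare them on the job and reference them in the connection dialog.
    /path/to/pdi/kitchen.sh -file=/path/to/job.kjb \
        -param:HIVE_HOST=remote-hive.example.com \
        -param:HIVE_PORT=10000 \
        -param:HIVE_USER=etl_user \
        -param:my_parameter=1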

    If you know that some items will not be changing between runs, and you want to make it clearer, you can also specify some values (eg. SourceDBHostIP) in the .kettle/kettle.properties file. These also present as variables within your transformation.

  5. #5
    Join Date
    Nov 2015
    Posts
    15

    Default

    Quote Originally Posted by gutlez View Post
    If you know that some items will not be changing between runs, and you want to make it clearer, you can also specify some values (eg. SourceDBHostIP) in the .kettle/kettle.properties file. These also present as variables within your transformation.
    Hi gutlez, in kettle.properties I only have these 3 values predefined.

    Code:
    # PRODUCTION_SERVER = hercules
    # TEST_SERVER = zeus
    # DEVELOPMENT_SERVER = thor
    I'm trying to find an API or something that lists all the possible config values, without success so far. Anyway, if I put SourceDBHostIP = IP in kettle.properties.. don't I need to pass that property as a parameter in the call?
    Thank you.

  6. #6
    Join Date
    Apr 2008
    Posts
    4,696

    Default

    In the simplest case, you have one kettle.properties per system.

    On your Production ETL system, it would have:
    DB_SERVER=hercules

    On your Test ETL system, it would have:
    DB_SERVER=zeus

    On your Development ETL system, it would have:
    DB_SERVER=thor

    When you create your PDI transformations, you would reference ${DB_SERVER} as the host to pull the data from. When you move that transformation from Dev to Test, the different kettle.properties file would cause the transform to automatically change from pulling data from thor to pulling data from zeus.

    If you have values defined in your kettle.properties, you do not need to also declare them as parameters to the transformation. If they exist in your kettle.properties before you start Spoon, then when you press Ctrl-Space in variable-supporting fields (for example in the host field of the DB configuration), you will see your properties listed.

    Using properties to set up your DB configurations is considered to be a "Good Practice" in the PDI world, as you will likely have multiple different transformations that all use the same properties. If something changes (say for example the password to the ETL DB user...), you only need to change it in the properties file, and it changes for all your transformations.
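
    As a rough sketch (the ETL_DB_* names are invented for illustration), the shared DB settings can be appended to the kettle.properties of the user that runs Kitchen, and then referenced as ${ETL_DB_HOST}, ${ETL_DB_USER}, etc. in the connection dialog:

    Code:
    #!/bin/sh
    # Sketch only: ETL_DB_* are made-up names. Once present in kettle.properties
    # they show up as variables in any variable-supporting field (Ctrl-Space).
    KETTLE_PROPS="$HOME/.kettle/kettle.properties"
    {
        echo "ETL_DB_HOST=hercules"
        echo "ETL_DB_PORT=3306"
        echo "ETL_DB_USER=etl_user"
        echo "ETL_DB_PASSWORD=change_me"
    } >> "$KETTLE_PROPS"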

    It can get trickier (more complex) if you want it to... It is possible to have different configurations (if you can't arrange distinct ETL systems for Dev, Test, Prod) on the same system, but they need to have different KETTLE_HOME directories. If you want to go down that road, I would suggest looking at the Kettle Franchising Factory (even though it has been archived by Google Code), as it presents some clean ways of laying out your structure.
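
    A minimal sketch of that multi-config setup (the /etc/pdi/* paths are invented; each KETTLE_HOME directory must contain its own .kettle/kettle.properties):

    Code:
    #!/bin/sh
    # Sketch only: run the same job against two configurations on one machine
    # by pointing KETTLE_HOME at different directories before calling kitchen.
    KETTLE_HOME=/etc/pdi/test /path/to/pdi/kitchen.sh -file=/path/to/job.kjb
    KETTLE_HOME=/etc/pdi/prod /path/to/pdi/kitchen.sh -file=/path/to/job.kjb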

    I hope this helps!
    Keep coming back, and let us know how you're doing.

  7. #7
    Join Date
    Nov 2015
    Posts
    15

    Default

    Hi!
    I was testing on an old computer (Windows XP) with spoon.bat, and the connection to Hive worked correctly there (that computer has the driver installed); I copied and pasted the preconfigured KTR.
    But on the computer where I want to run Kettle (CentOS 7) it throws an error for the .ktr (I exported the jobs and each KTR has the correct configuration inside; I checked by opening the KTR with the "vi" command).

    The error is this:
    Code:
    2015/11/18 00:02:53 - ERROR (version 6.0.0.0-353, build 1 from 2015-10-07 13.27.43 by buildguy) : A serious error occurred during job execution: 
    2015/11/18 00:02:53 - Couldn't find starting point in this job.
    2015/11/18 00:02:53 - ERROR (version 6.0.0.0-353, build 1 from 2015-10-07 13.27.43 by buildguy) : org.pentaho.di.core.exception.KettleJobException: 
    2015/11/18 00:02:53 - Couldn't find starting point in this job.
    I think the problem may be this driver that I can't install: http://hortonworks.com/hdp/addons/ because it is for CentOS 6, and CentOS 7 doesn't support one of the three libraries..
    Thank you, I will keep banging my head against these problems.

  8. #8
    Join Date
    Apr 2008
    Posts
    4,696

    Default

    Every KJB file must have a "Start" in it. The error you are seeing says that there isn't one.

  9. #9
    Join Date
    Nov 2015
    Posts
    15

    Default

    Hi!
    The problem was:
    Quote Originally Posted by gutlez View Post
    Every KJB file must have a "Start" in it. The error you are seeing says that there isn't one.
    Now I'm trying to make a KJB from the KTRs, but the parameters I need to change belong to the KTRs inside the KJB. When I fix the problem I will tell you the solution.
    See you.

  10. #10
    Join Date
    Apr 2008
    Posts
    4,696

    Default

    Jobs can have parameters also.
    When the Job calls the Transformation, it can push its parameters down to the Transformation.

    If you have 2 or 3 Transformations that all need to run after each other, with similar parameters, you only need to set them once on the job.
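
    A rough sketch of that pattern (SOURCE_TABLE is a hypothetical job-level parameter; each Transformation entry in the job must be set to pass its parameter values down to the transformation it calls):

    Code:
    #!/bin/sh
    # Sketch only: SOURCE_TABLE is a made-up parameter declared on the job;
    # the job hands it to every KTR it runs, so it is set once here.
    /path/to/pdi/kitchen.sh -file=/path/to/wrapper_job.kjb -param:SOURCE_TABLE=my_hive_table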
