Hitachi Vantara Pentaho Community Forums

Thread: Pentaho job/transformation deployment on different environment

  1. #1
    Join Date
    Apr 2013
    Posts
    7

    Hi all,

    I am developing Pentaho jobs and transformations with Pentaho Community Edition 5.3, using a file-based repository in my current dev environment.
    However, I have never moved Pentaho development to other environments such as UAT and PROD, so I cannot foresee the challenges involved.
    Can anyone please help me identify best practices and approaches for making Pentaho development portable across environments, i.e. DEV, UAT and PROD?

    Thanks in advance

  2. #2

    What I usually do is externalize whatever differs between the environments. My first step is to read those values and set them via "Set Variables". The later steps then refer to them with the ${variable} syntax.

    Regards, Hugo

  3. #3
    Join Date
    Jul 2009
    Posts
    476

    Define the parameters that need to be different in your kettle.properties files in each environment, and use those parameters with the ${parameter} syntax in your jobs and transformations.
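
    As a minimal sketch of that approach, each environment could carry its own kettle.properties (by default in the .kettle directory under the user's home, or under KETTLE_HOME if that is set). The variable names and values below are hypothetical placeholders, not taken from this thread:

    ```properties
    # kettle.properties on the DEV machine (all names and values are illustrative)
    INPUT_DIR=/home/dev/pentaho/input
    STAGING_DB_HOST=localhost
    STAGING_DB_PORT=5432

    # The UAT and PROD copies define the same keys with their own values, e.g.
    # INPUT_DIR=/mnt/prod/input
    # STAGING_DB_HOST=prod-db.example.internal
    ```

    Jobs and transformations would then use ${INPUT_DIR} or ${STAGING_DB_HOST} in their fields, so the same .kjb/.ktr files can move between environments unchanged.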

  4. #4
    Join Date
    Aug 2011
    Posts
    360

    As everyone is saying, the best advice is: don't hardcode anything.

    Use kettle.properties for variables that depend on the environment, BUT ONLY for those. Don't put project-related variables in it.

    - For file paths: define variables in kettle.properties for all your root network storage locations. At work we have a single ${DATA} variable that is the root of all our filesystem reads and writes; you may need several.
    Then, in jobs, use input parameters for the file paths. We always keep them relative to ${DATA}, and the main job is responsible for building the full path, so we can be sure data ends up somewhere we know.
    Or use a configFile parameter on your job that points to a config.properties file holding the job's configuration.

    - For DB connections: I recommend JNDI, so that you don't have to set host, port, user and password variables for each connection, only a JNDI name variable. Also, once on the UAT / PROD server you may be running the DI server on Tomcat, so the best approach is to let your administrator configure the JNDI DB pools and then just configure your JNDI name in kettle.properties.
    So on DEV you use the jdbc.properties file in the simple-jndi directory. On the server you use the Tomcat context.xml to configure the JNDI pool or, if you're not using the DI server, a jdbc.properties file with simple-jndi there as well.

    - For email, FTP, web services etc., you'll have to set variables in kettle.properties or in each project's config.properties. I recommend dot-notation for variable names in the properties file, for example org.mycompany.ftp.targetName.host, org.mycompany.ftp.targetName.user, org.mycompany.ftp.targetName.port,
    org.mycompany.ws.myWebService.myEndPoint=http://org.myCompany/myWebService/myEndPoint, org.mycompany.ws.myWebService.myEndPoint.user, etc.
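
    For the DB connections point in the list above, a simple-jndi setup might look like the following sketch. The connection name staging_dwh, the PostgreSQL driver and URL, the credentials, and the variable name STAGING_JNDI_NAME are all assumptions chosen for illustration:

    ```properties
    # data-integration/simple-jndi/jdbc.properties (DEV, or a server without Tomcat pools)
    staging_dwh/type=javax.sql.DataSource
    staging_dwh/driver=org.postgresql.Driver
    staging_dwh/url=jdbc:postgresql://localhost:5432/staging
    staging_dwh/user=etl_user
    staging_dwh/password=change_me

    # kettle.properties: only the JNDI name is exposed as a variable
    STAGING_JNDI_NAME=staging_dwh
    ```

    The database connection would then use the JNDI access type with ${STAGING_JNDI_NAME} as its JNDI name, so only the pool definition differs between environments.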

    If you use a config.properties file for each project, you may want to merge them into the kettle.properties file on your DEV laptop so that they are available in Spoon; otherwise you have to set them as environment variables each time you start Spoon.
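
    A per-project config.properties following the dot-notation suggested above might look like this sketch. The host, port and user values are made-up placeholders; the web service URL is the one given earlier in this post:

    ```properties
    # config.properties for one project (values are illustrative)
    org.mycompany.ftp.targetName.host=ftp.example.com
    org.mycompany.ftp.targetName.port=21
    org.mycompany.ftp.targetName.user=etl_user

    org.mycompany.ws.myWebService.myEndPoint=http://org.myCompany/myWebService/myEndPoint
    org.mycompany.ws.myWebService.myEndPoint.user=ws_user
    ```

    Once these values are loaded as variables, steps can refer to them as ${org.mycompany.ftp.targetName.host} and so on.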

    I will add:
    Use a repository in every environment (even a file repository)! If you use only plain files (not a file repository) and at some point you say "Hey, what about putting a DI server on UAT / Production with a database repository!", you'll have to rewrite all your references to sub-jobs / sub-transformations. Moreover, you can easily export / import all or part of the repository to deploy things, and be sure you have everything in one XML file.

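    Then, always reference sub-jobs / sub-transformations with "Specify by name and directory" and set the directory with the ${Internal.Job/Transformation.Repository.Directory} variable (that is, ${Internal.Job.Repository.Directory} in a job or ${Internal.Transformation.Repository.Directory} in a transformation). That way you can move or rename your main project directory without breaking anything.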
    Last edited by Mathias.CH; 06-27-2016 at 11:42 AM.

  5. #5
    Join Date
    Apr 2013
    Posts
    7

    Thanks Mathias. Your reply is really helpful.
    Pardon me for a few more questions.

    "Then, always reference sub jobs / sub trans with "Specify by name and directory" and set directory with the ${Internal.Job/Transformation.Repository.Directory} variable. With that you can then move/rename your main project directory without breaking stuff."
    ---Can I use ${Internal.Job/Transformation.Repository.Directory} in a file-based repository? From what I have read, it can be used when we don't use any repository and store our jobs/transformations anywhere in the file system.

    Also, can you please help me with the deployment steps for jobs/transformations in a file-based repository on a prod environment (AWS)?

    Thanks in advance
