Hitachi Vantara Pentaho Community Forums

Thread: Versioning Kettle projects with Git (on GitHub) - advice?

  1. #1

    Hi!

    I am currently working on a project that imports the Eurostat database (provided as SDMX files) into a local Postgres database using Kettle. When I am done with this step I plan to set up MDX access to the data via Mondrian. The ultimate goal is to examine the data using Weka and other non-Pentaho tools.

    Naturally I would like to version and organize this project and my experience tells me that Git with a GitHub repo will be a good choice. I have only mild experience in project organization with Git and I am also rather new to Kettle.

    So I would appreciate advice on what a good setup might look like. Maybe this is worth a discussion. Advice and personal experience are welcome!

    Thanks

    Raffael

  2. #2
    Join Date
    Dec 2009
    Posts
    609

    Hi Raffael,

    Since you can use a "file-system repository" to store your KJB/KTR files in, these directories could easily be put into any kind of version-control system.
    So as a first approach I would recommend trying that.
    In other projects we have put our KJB/KTR files into SVN, and that has worked nicely so far.

    Cheers,

    Tom

  3. #3
    Join Date
    Mar 2013
    Posts
    24

    Raffael,

    This might be a useful thread for you to read. http://forums.pentaho.com/showthread...munity-Edition

  4. #4

    Hi Tom,

    thanks for your input.

    Is there a difference between using a "file system repository" (FSR) and simply saving jobs and transformations on your disk?
    And if so, what is the difference, and why is using an FSR more useful for versioning purposes than just organizing the files in folders?

    Thanks in advance!

    Raffael

  5. #5
    Join Date
    Jun 2012
    Posts
    5,534

    Without a repository you can store your workflows anywhere you want; with a repository you are confined to the configured folder subtree.
    Without a repository you can only refer to a transformation in a job by its absolute pathname; with a repository you can use relative pathnames.
    With a repository you can easily transfer all your workflows to another filesystem without breaking references.
    With a repository, version control only has to deal with a single folder subtree, too.
    So long, and thanks for all the fish.

  6. #6
    Join Date
    Jan 2016
    Posts
    7

    Originally posted by marabu:
        Without a repository you can store your workflows anywhere you want; with a repository you are confined to the configured folder subtree.
        Without a repository you can only refer to a transformation in a job by its absolute pathname; with a repository you can use relative pathnames.
        With a repository you can easily transfer all your workflows to another filesystem without breaking references.
        With a repository, version control only has to deal with a single folder subtree, too.

    Excellent and concise summary--thank you!

  7. #7
    Join Date
    Aug 2011
    Posts
    360

    Hi,

    I have a complementary question on project versioning:
    How do you deal with database model changes and static data changes?
    In general, when developing ETL workflows, you deal with a database at some point.
    So how do you guys work as a team with that? Do you all have a database server on your local PC, and push
    SQL scripts to a test server to change the model?

    I mean, one usually needs some real data to develop anything, and in an agile way, so how do you organize the work so that
    model changes don't fuck up the work of others?
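
Post #5 above points out that absolute pathnames break when workflows move to another filesystem, which matters as soon as several people clone the same Git repository to different locations. The following is a minimal Python sketch of a check that could be run before committing. It assumes that KJB/KTR files are XML and that references to other files are stored in `<filename>` elements; the function name `find_absolute_references` is hypothetical, and the element name should be checked against your own files.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# Matches POSIX absolute paths (/...), Windows drive paths (C:\ or C:/)
# and file: URLs. Values that start with a variable such as
# ${Internal.Job.Filename.Directory} do not match.
ABSOLUTE = re.compile(r"^(/|[A-Za-z]:[\\/]|file:)")


def find_absolute_references(root_dir):
    """Return (file, referenced_path) pairs for absolute <filename> values
    found in the .kjb/.ktr files below root_dir.

    Assumption: referenced files are stored in <filename> elements.
    """
    root = Path(root_dir)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in {".kjb", ".ktr"}:
            continue
        try:
            tree = ET.parse(path)
        except ET.ParseError:
            continue  # skip files that are not well-formed XML
        for elem in tree.iter("filename"):
            value = (elem.text or "").strip()
            if ABSOLUTE.match(value):
                hits.append((path.relative_to(root).as_posix(), value))
    return hits
```

Calling `find_absolute_references("path/to/project")` on the folder that is under version control returns an empty list when no absolute references were found. Each reported pair names a file and the path inside it that would probably break on a colleague's machine.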
