Hitachi Vantara Pentaho Community Forums
Results 1 to 6 of 6

Thread: Replace database connections on import with the new 4.2 import.sh

  1. #1
    Join Date
    Oct 2011
    Posts
    6

    Default Replace database connections on import with the new 4.2 import.sh

    Hi

    I have transformations and jobs which I like to move from one file repository to another, and I want them to use the database settings of the destination repository. In Spoon this is possible by unchecking "replace existing connections", copying the files to the new repo and then opening all of them. Is there a way to do this with the import utility without manual intervention?

    I'd think that this is a very common use case to transform files between Dev/Test/Production/etc. environments.

    Thanks
    Jo

  2. #2
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    You should obviously test this but the file repository creates .kdb files that overwrite whatever is stored in the transformations and jobs.
    So simply copy the transformations and jobs over I would say.

  3. #3
    Join Date
    Oct 2011
    Posts
    6

    Default

    Quote Originally Posted by MattCasters View Post
    You should obviously test this but the file repository creates .kdb files that overwrite whatever is stored in the transformations and jobs.
    So simply copy the transformations and jobs over I would say.
    Sorry but I don't think that works - or I'm doing it wrong:

    Assuming two repositories test1 and test2:

    Code:
    rc@ds9000:/tmp $ cat ~/.kettle/repositories.xml | egrep 'test[12]'
        <name>test1</name>
        <description>test1</description>
        <base_directory>/tmp/test1</base_directory>
        <name>test2</name>
        <description>test2</description>
        <base_directory>/tmp/test2</base_directory>
    Both define a connection of the same name, "mysql_db"
    Code:
    rc@ds9000:/tmp $ cat test1/mysql_db.kdb | head -9
      <connection>
        <name>mysql_db</name>
        <server>la2dbhost</server>
        <type>MYSQL</type>
        <access>Native</access>
        <database>test1</database>
        <port>3306</port>
        <username>root</username>
        <password>Encrypted 2be98afc86aa7f2e4cb17ab64c993bddf</password>
    rc@ds9000:/tmp $
    rc@ds9000:/tmp $ cat test2/mysql_db.kdb | head -9
      <connection>
        <name>mysql_db</name>
        <server>la2dbhost</server>
        <type>MYSQL</type>
        <access>Native</access>
        <database>test2</database>
        <port>3306</port>
        <username>root</username>
        <password>Encrypted 2be98afc86aa7f2e4cb17ab64c993bddf</password>
    Now I do a simple transformation which dumps schema() to /tmp/connection.txt:
    Code:
    rc@ds9000:/tmp $ cat test1/trans1.ktr | egrep '(SELECT|mysql_db|database)'
        <name>mysql_db</name>
        <database>test1</database>
        <connection>mysql_db</connection>
        <sql>SELECT user() as username, schema() as dbname</sql>
    As you propose, I just copy it to test2:

    Code:
    rc@ds9000:/tmp $ cp test1/trans1.ktr test2
    But the expected result is only obtained in repository test1, in test2 the connection stored in the .ktr is taken
    and mysql_db.kdb is not considered:

    Code:
    rc@ds9000:/tmp $ /Applications/Kettle/pan.sh -rep test1 -trans trans1
    INFO  04-11 10:32:32,360 - Pan - Start of run.
    INFO  04-11 10:32:32,870 - trans1 - Step Table input.0 ended successfully, processed 1 lines. ( - lines/s)
    ...
    INFO  04-11 10:32:32,870 - trans1 - Step Text file output.0 ended successfully, processed 1 lines. ( - lines/s)
    rc@ds9000:/tmp $ cat /tmp/connection.txt
    username;dbname
    root@192.168.236.1;test1
    rc@ds9000:/tmp $ /Applications/Kettle/pan.sh -rep test2 -trans trans1
    INFO  04-11 10:32:44,462 - Pan - Start of run.
    INFO  04-11 10:32:44,657 - RepositoriesMeta - Reading repositories XML file: /Users/rc/.kettle/repositories.xml
    ...
    INFO  04-11 10:32:44,960 - trans1 - Step Text file output.0 ended successfully, processed 1 lines. ( - lines/s)
    rc@ds9000:/tmp $ cat /tmp/connection.txt
    username;dbname
    root@192.168.236.1;test1
    Is this how you meant it should work? It would really make sense, but it doesn't seem to be the case.
    This is with 4.2.

    Thanks for your time.
    Last edited by xx2700; 11-04-2011 at 06:01 AM.

  4. #4
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    I only heard it from someone that was using the file repository. Personally I would never hard code values in any database connection so I haven't tried myself.
    Why don't you simply set the connection properties as variables. That way you don't care about the values in the connections since there are none.

  5. #5
    Join Date
    Oct 2011
    Posts
    6

    Default

    Ok variables might be an option. Or maybe I could use the database repository, is the handling there different? I.e., are the connections from the repository preferred when you import a job/transformation into it?

  6. #6
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    In the database repository the database connections are references. This has the added advantage that you can rename connections for example.
    But even then I would use variables.

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.