
Encrypted Variables in PDI



ngoodman
08-19-2009, 06:30 AM
Every once in a while, I get to sound like a royal arse in front of a customer by saying something “I know” to be true about Pentaho that isn’t. Usually this is a REALLY good thing, because it’s some limitation or gotcha that used to exist in the product and has magically disappeared with the latest release. The danger of open source is that these things can change underneath you quickly, without any official fanfare, and leave you looking like a total dolt at a customer site. Bad for consultants like me, who constantly have to keep up with extraordinarily fast product development. Good for customers, because they get extraordinarily fast product development.

One of these experiences, which I was absolutely THRILLED to look like a dolt about, was my claim that:

“If you use variables for database connection information, the password will be clear text in kettle.properties.”

This was a huge issue for many security-conscious institutions. Customers were faced with a trade-off: use variables, which centrally manage the connection information for a database (good thing), but accept that the password sits in clear text (bad thing). No longer!

Our good friend quietly committed this little gem (http://jira.pentaho.com/browse/PDI-665) nearly 18 months ago. It’s been in the product since 3.0.2! It allows encrypted variables to be decrypted in the password field for database connections.

Let’s test it out… Our goal is to take a string such as “Encrypted jasiodfjasodifjaosdifjaodfj”, which is a simple encrypted version of the password, set it as a regular ole variable, and then use that variable as the “password” of a database connection.

We have one transformation that sets the variable, and a second transformation that then uses it.

http://www.nicholasgoodman.com/bt/blog/wp-content/uploads/2009/08/moz-screenshot-6.png

The first one sets the variable ${ENCRYPTED_PASSWORD} from a text file. The string in that file is the encrypted password, “lifted” from a saved .ktr in which the same password had been entered on a database connection.

http://www.nicholasgoodman.com/bt/blog/wp-content/uploads/2009/08/moz-screenshot-7.png

The next transformation uses the variable as the connection password, selects from the database, and writes the list of tables in the database to a text file.
http://www.nicholasgoodman.com/bt/blog/wp-content/uploads/2009/08/moz-screenshot-8.png

The output: works like a charm!
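For readers who want to see the idea in miniature, here is a rough Python sketch of the resolution order described above: substitute the variable first, then decode only if the result carries the “Encrypted ” prefix. This is not PDI code. The function names are made up for illustration, and toy_decode (it just reverses the string) is a stand-in, not the algorithm PDI actually uses.

```python
import re

# Prefix taken from the example string in the post.
ENCRYPTED_PREFIX = "Encrypted "


def substitute(text, variables):
    """Replace each ${NAME} with its value; unknown names are left untouched."""
    return re.sub(
        r"\$\{([^}]+)\}",
        lambda m: variables.get(m.group(1), m.group(0)),
        text,
    )


def toy_decode(payload):
    """Hypothetical stand-in for the real decryption: reverse the string."""
    return payload[::-1]


def resolve_password(field, variables, decode=toy_decode):
    """Substitute variables, then decode only if the result has the prefix."""
    value = substitute(field, variables)
    if value.startswith(ENCRYPTED_PREFIX):
        return decode(value[len(ENCRYPTED_PREFIX):])
    return value


variables = {"ENCRYPTED_PASSWORD": "Encrypted terces"}
print(resolve_password("${ENCRYPTED_PASSWORD}", variables))  # prints: secret
print(resolve_password("plaintext", variables))              # prints: plaintext
```

The point of the ordering is that the password field only ever contains ${ENCRYPTED_PASSWORD}; the encrypted string lives wherever the variable is defined, and a plain password still passes through unchanged.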

Customers can now have the best of both worlds: centralize host/user/password settings using variables (including those in kettle.properties) and keep those passwords away from casual hackers. I say casual because PDI is open source, so in order to decrypt a password someone only needs to know Java and know where to find the PDI SVN repository. :)
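To make the “casual hackers” caveat concrete, here is a toy Python sketch of fixed-key obfuscation. It is an assumption-laden illustration, not PDI’s algorithm: the key, the XOR scheme, and the function names are all invented here. It only shows why any scheme whose key ships with publicly readable source can be reversed by anyone who reads that source.

```python
# Hypothetical fixed key. In an open source tool, the equivalent constant
# is visible to anyone who checks out the code.
KEY = b"not-a-real-key"
PREFIX = "Encrypted "


def obfuscate(password: str) -> str:
    """XOR the password bytes with the repeating key and hex-encode the result."""
    data = password.encode("utf-8")
    mixed = bytes(b ^ KEY[i % len(KEY)] for i, b in enumerate(data))
    return PREFIX + mixed.hex()


def deobfuscate(stored: str) -> str:
    """Reverse obfuscate(); XOR with the same key restores the original bytes."""
    mixed = bytes.fromhex(stored[len(PREFIX):])
    data = bytes(b ^ KEY[i % len(KEY)] for i, b in enumerate(mixed))
    return data.decode("utf-8")


stored = obfuscate("s3cret")
print(stored.startswith(PREFIX))        # the stored form no longer shows the password
print(deobfuscate(stored) == "s3cret")  # but anyone holding KEY can recover it
```

So treat the encrypted form as protection against shoulder surfing and accidental disclosure, and still lock down file permissions on wherever the variable is stored.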

As always, example attached: encrypted_variables.zip (http://forums.pentaho.org/entry_images/encrypted_variables.zip)




More... (http://www.nicholasgoodman.com/bt/blog/2009/08/18/encrypted-variables-in-pdi/)