Hitachi Vantara Pentaho Community Forums
Results 1 to 8 of 8

Thread: Generate Random Values - UUID conversion error

  1. #1

    Default Generate Random Values - UUID conversion error

    I am using Generate Random values step to generate UUID. Every time a pass this generated value to any other step downstream it converts into a string (I have tried with 'Insert/Update' and Text File output string). Running into conversion errors, how do I retain the data type of UUID? Or is this a known defect in PDI?
    Last edited by smahankali; 05-21-2014 at 01:58 PM.

  2. #2
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    It doesn't convert to a string, a UUID is a string.

  3. #3

    Default

    Thanks Matt. But the column im performing a lookup on (Postgres) is of type uuid. Pentaho does not like and keeps throwing the following error:

    ERROR: column "id" is of type uuid but expression is of type character varying
    Hint: You will need to rewrite or cast the expression.

  4. #4

    Default

    I was able to fix this by setting stringtype = unspecified option in Database settings.

  5. #5

    Default

    I think it's a bit sad that there's no explanation or help provided here. I have the same problem and looking at this, I need to "fix" it changing the connection parameters. That seems odd.

    Isn't Pentaho capable to work with UUID's? "Generate random value" can produce UUID's so I would expect them to be accepted by Postgres. The DB itself can generate UUID's but loading them over to Pentaho and then inserting them doesn't seem to work either.

    Here's what I tried:
    1) getting the UUID via Postgres' "uuid_generate_v4()" function in a SELECT-statement ("Table input" step):
    Code:
    SELECT uuid_generate_v4() my_new_id,
           <other columns>
    FROM   my_table
    2) letting Pentaho generate the UUID in "Generate random value", type "UUID4"

    Both with the same result. Any solution to this?

  6. #6
    Join Date
    Aug 2006
    Posts
    1

    Smile PostgreSQL & UUIDs

    I stumbled across this thread and I have a fair amount of experience designing database schemas that are UUID keyed (PostgreSQL specifically).

    A couple basics. It would be more correct to say that a UUID is a 128 bit value that can be represented as a string (the standard string format of which is established in RFC-4122). As one would expect, PostgreSQL is NOT saving the string representation of the UUID value but the 128 bits of the UUID. Whether Pentaho Data Integration internally uses the string representation or the Java java.util.UUID type is something I don't know... Matt Caster's comment would seem to suggest that it's handling it as a string. Either way, to my understanding JDBC is where we hit the first problem because I'm pretty sure UUID is not a regularly supported type. So if Pentaho/the JDBC driver present this ostensibly UUID value to the database as a String (using the JDBC setString() method), you'll get a type mis-match because a UUID is NOT a string (text, character varying, etc).

    The suggestion to set the database connection options with stringtype = unspecified is the answer that will make thing like PDI Table Output to work; since you can't get in and make it an explicit cast it's the only way to do this with resorting to one of the scripting steps. The reason is found from: https://jdbc.postgresql.org/document...d/connect.html The operative part of that document is:
    ========
    stringtype = String

    Specify the type to use when binding PreparedStatement parameters set via setString(). If stringtype is set to VARCHAR (the default), such parameters will be sent to the server as varchar parameters. If stringtype is set to unspecified, parameters will be sent to the server as untyped values, and the server will attempt to infer an appropriate type. This is useful if you have an existing application that uses setString() to set parameters that are actually some other type, such as integers, and you are unable to change the application to use an appropriate method such as setInt().
    ========

    In this case it would be more appropriate for the UUID (for the PostgreSQL case) to prepare the query with the setObject call... but this allows us to bypass that and let the server make a best guess. I just tested and it got me around the same issue the original poster described.

    What I'm not sure is if I'm introducing possible issues by doing since I'm effectively weakening the type checking. I guess we'll see. :-)

  7. #7
    Join Date
    Jan 2013
    Posts
    5

    Default stringtype=unspecified

    Can someone please help where to find database setting in kettle tool.

    Quote Originally Posted by smahankali View Post
    I was able to fix this by setting stringtype = unspecified option in Database settings.

  8. #8
    Join Date
    Nov 2009
    Posts
    687

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.