Hitachi Vantara Pentaho Community Forums
Results 1 to 10 of 10

Thread: insert/update step does not see trailing spaces

  1. #1
    Join Date
    Jan 2006
    Posts
    313

    Default insert/update step does not see trailing spaces

    Setup: Kettle 2.5.1
    Database : Oracle

    I read the MAINCOLOR field from table product for all records where MAINCOLOR <> trim(MAINCOLOR) In my case tha values read are of the form "001 "
    I change the value of MAINCOLOR to "001" (without training spaces)
    I do the insert/update step. This step does not see a difference between "001" and "001 "

    I guess this is a Kettle problem!

    The error log is attached


    As i am confronted a lot with this kind of problems having the possibility to specify left/right trim in the "database value lookup component" and "Stream lookup" would be a significant functional enhancement for me.
    Attached Files Attached Files

  2. #2
    Join Date
    May 2006
    Posts
    4,882

    Default

    Make a small example with only row generators and stream lookup and show us your problem

    Regards,
    Sven

  3. #3
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    Gunther, Sven,

    There is no trimming being done in I/U and the lookup itself is being performed by the database.
    However, I think that in certain cases the database does automatically perform a trim.
    What is the Oracle data type of MAINCOLOR anyway?

    Vriendelijke groeten,

    Matt

  4. #4
    Join Date
    May 2006
    Posts
    4,882

    Default

    I'm willing to bet varchar2 ... that's why I asked for an example with only row generators and stream lookup.

    Regards.
    Sven

  5. #5
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    I guess we should just driver over to Gunther's place and look for ourselves.
    Now that drinking Hoegaarden is Politically Correct again...

  6. #6
    Join Date
    Jan 2006
    Posts
    313

    Default

    Matt, Sven,

    I have beer in the fridge, so if you guys want to solve this problem at my place, feel free to do so.


    I dont know if i understood your question for the example correctly, therefor i have created 2 examples.

    Create table product1.ktr, creates 1 record in table "product1", the field MAINCOLOR has trailing spaces
    Show problem.ktr tries tu update the MAINCOLOR field in the table, but does not see that tha value of MAINCOLOR = "001" without trailing spaces.
    I have tried this with Oracle and MySQL, both behave the same. The Oracle field type is VARCHAR2

    The streamlookup.ktr tries to show you my enhancement request .



    Regards,
    Attached Files Attached Files

  7. #7
    Join Date
    Nov 1999
    Posts
    9,729

    Default Database issue

    Code:
    mysql> create table gunther(a varchar(100));
    Query OK, 0 rows affected (0.10 sec)
    
    mysql> insert into gunther values('100 '), ('101     ');
    Query OK, 2 rows affected (0.10 sec)
    Records: 2  Duplicates: 0  Warnings: 0
    
    mysql> select concat(a, ']') from gunther;
    +----------------+
    | concat(a, ']') |
    +----------------+
    | 100 ]          |
    | 101     ]      |
    +----------------+
    2 rows in set (0.01 sec)
    
    mysql> select * from gunther where a='101'
        -> ;
    +----------+
    | a        |
    +----------+
    | 101      |
    +----------+
    1 row in set (0.00 sec)
    This presents an interesting problem. Since varchar or varchar2 compare with disregard for trailing spaces, you need to go to the CHAR data type.
    However, that data type typically pads to the specified length.

    At least in MySQL this ANSI behavior is disabled by default. (option: padCharsWithSpace=false).

    In Oracle, you might have luck with the "fixedString" option:

    If the value of this property is "true", JDBC will use FIXED CHAR semantic when setObject is called with a String argument. By default JDBC uses VARCHAR semantics. The difference is in blank padding. With the default there is no blank padding so, for example, 'a' does not equal 'a ' in a CHAR(4). If true these two will be equal.
    HTH,
    Matt

  8. #8
    Join Date
    Jan 2006
    Posts
    313

    Default

    Matt,

    For mysql setting padCharsWithSpace=true does solve the problem for Kettle, but it does not allow you to clean up your database. If kettle writes a line with "001 " it stay's in the database as "001 ". It is only the JDBC driver that pad's the extra characters. Another program with different padCharsWithSpace will suffer from the same problem.To fix this problem without an extra "trim" step could be to have the lookup step trim the result (The trim is controlled by a parameter).

  9. #9
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    To fix this problem without an extra "trim" step could be to have the lookup step trim the result (The trim is controlled by a parameter).
    No really, I'm not going to sprinkle trim options all over the database operators :-)

    Instead of messing about with these spaces, wouldn't it be far easier and cleaner to just store the number of trailing spaces in a separate column? Or you can use another flag to identify the records. This seems an easier workaround than the rather elaborate scheming we're doing above :-)

    HTH,

    Matt

  10. #10
    Join Date
    Jan 2006
    Posts
    313

    Default trim step

    Matt,

    If someone would write a trim step, would this be included in the standard distribution?

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.