Hitachi Vantara Pentaho Community Forums
Results 1 to 12 of 12

Thread: Trim Step in PDI (Kettle)

  1. #1
    Join Date
    Oct 2011
    Posts
    17

    Default Trim Step in PDI (Kettle)

    All,

    I've having some issues in trimming string fields onto the database.

    Database (SQl Server 2008) datatype is varchar(255). So when I'm filling a field with 'F', I'd like the database to contain only 'F' and not 'F ' (an F with 254 spaces).

    How to do this (I'm using PDI 4.2.0)?
    I've tried the 'Trim Strings' from the KFF project, but no success.
    This Step looked promissing, because it can perform a Trim (on both side) for all String Type fields, but unfortunately, with an incorrect result.
    The 'Trim Cut' step is not working for PDI 4.2.0.

    Any help is appreciated.

    Michiel

  2. #2
    Join Date
    Sep 2011
    Posts
    190

    Default

    Hi Michiel,

    Have you tried the 'String operations' step? It has a trim function (l,r,both) that works for me!

    gr. Michel ;-)

  3. #3
    Join Date
    Oct 2011
    Posts
    17

    Default

    This 'String Operation'with the trim functionality is not working for me, either.

    The data is still in the database with an 'rigth padding'of speces until 255, since the column is defined as varchar(255).

    I've also tested other 'String Operations', like UPPER Case (to see if the String Operation is doing anything at all), and this is working as expected. But still using the entire field-length with spaces.

    So if anybody can advise me what to do, it is appreciated....

  4. #4
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    KFF has a trim step that ought to do the trick: http://code.google.com/p/kettle-franchise/

  5. #5
    Join Date
    Oct 2011
    Posts
    17

    Default

    Unfortenately, this is not working either.

    I've attached the KTR file, inwhich nothing seems to be wrong.
    In the file input I've changed the integere fields into String fields, because the Phone for instance needs to contain the '00' or "+32" in the text.

    The behavior it shows is that all string fields are actually righpadded with spaces, instead of trimmed with spaces.

    What is wrong?
    Attached Files Attached Files

  6. #6
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    It would be nice if we could have a look at file "ExpertBE.csv", even a few lines. Otherwise it's impossible to test.
    Also: make sure the data type IS indeed varchar and not char, otherwise the database will simply put the spaces right back.

  7. #7
    Join Date
    Oct 2011
    Posts
    17

    Default

    I've attached a small set of the Input File (only in txt, becasue I was unable to upload CSV files)

    In the database, datatype is varchar.
    If I change manually a value, data is stored without the trailing spaces.
    Attached Files Attached Files

  8. #8
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    There are no trailing spaces in that file so why would you even want to trim the data?
    If you see trailing spaces in the database it's because the database has added them.
    Check the ANSI_PADDING setting in the database.

    http://msdn.microsoft.com/en-us/libr...v=sql.80).aspx

  9. #9
    Join Date
    Oct 2011
    Posts
    17

    Default

    ANSI Padding is OFF, when I manually chage a field the trailing spaces are trimmed off. Also a proof that ANSI Padding is off.

    In the entire file (csv) contain more then the small set of data. Because it is CSV the length of th ecolumns may vary. In my DWH I'd liek all striung fields to be varchar(255) and be able to select * from table where field ='F' without having to set all the spaces in the tooling. Just an example.

    When importing a DataType from a file PDI will go through all fields (I've set the parameter to 0) to determine the maximum length.
    Hence the file contains a name value = 'margot wilms', and this is in the database with 'margot wilms '.

    The KTR contains only the Steps which have identiefied the problem at hand. The actual KTR contains multiple Steps and other handling. But becasue writing to the database is having an issue, I've created a KTR with the actual cause/problem.

  10. #10
    Join Date
    Sep 2011
    Posts
    190

    Default

    Could this be an issue with the JDBC driver for MS SQL?
    Can you see what SQL statements are being generated in the MS SQL profiler?

    gr. Michel

  11. #11
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    Whatever it is, Kettle is not adding spaces just to annoy you so keep looking for an answer in the JDBC driver (you're not using the MS JDBC driver are you?) or the database.

  12. #12
    Join Date
    Oct 2011
    Posts
    17

    Default

    I've added a screen shot from the SQL Server Profiler, but also here it looks like abacadabra to me.
    Attached Images Attached Images  

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.