Hitachi Vantara Pentaho Community Forums

Thread: String Cut step on double-byte strings

  1. #1
    Join Date
    Jul 2010
    Posts
    25

    String Cut step on double-byte strings

    Hello,

    I have used the String Cut step successfully on regular Western-character strings, but it does not seem to work on double-byte strings such as Chinese or Korean text. Is my observation correct? Please help.

    Regards,
    Rasheedik

  2. #2
    Join Date
    Nov 1999
    Posts
    9,729

    I don't think that assessment is correct.
    The "String cut" step works with standard Java Strings, and those store text as 16-bit Unicode (UTF-16) characters rather than as bytes.
    For example, see the explanation for the length() operator:

    http://download.oracle.com/javase/1....g.html#length()

    The step simply works with these 16-bit Unicode characters in the string, so a Chinese or Korean character normally counts as one position, not two.
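
    To illustrate the point about Java Strings, here is a minimal sketch. It is not the step's actual source: the class name and the cut() helper are hypothetical, and the assumption that "cut from" is inclusive and "cut to" is exclusive (as in String.substring) is mine. The strings are written with Unicode escapes so the file compiles regardless of source encoding.

```java
public class StringCutSketch {

    // Hypothetical helper: cuts by char index, the way String.substring does.
    // Assumes "from" is inclusive and "to" is exclusive.
    static String cut(String value, int from, int to) {
        if (value == null) {
            return null;
        }
        int end = Math.min(to, value.length());
        if (from < 0 || from >= end) {
            return "";
        }
        return value.substring(from, end);
    }

    public static void main(String[] args) {
        String chinese = "\u4E2D\u6587\u5B57\u7B26"; // four Chinese characters
        String korean = "\uD55C\uAD6D\uC5B4";        // three Hangul syllables

        // length() counts 16-bit chars, not bytes.
        System.out.println(chinese.length()); // 4
        System.out.println(korean.length());  // 3

        // Cutting positions 0..2 keeps the first two characters intact.
        System.out.println(cut(chinese, 0, 2).equals("\u4E2D\u6587")); // true
        System.out.println(cut(korean, 1, 3).equals("\uAD6D\uC5B4"));  // true
    }
}
```

    Characters outside the Basic Multilingual Plane take two chars each, so a cut by char index could split one of those; common Chinese and Korean text is not affected.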

    That said, I did notice that the code might not deal well with lazy conversion. Make sure to turn that option off in the "* file input" steps.
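
    Why lazy conversion could matter: with it enabled, field data stays in its raw byte form until something converts it. The sketch below is only an illustration of how cutting raw bytes differs from cutting a decoded String; it is not Pentaho code, the class and method names are hypothetical, and UTF-8 input is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class LazyBytesSketch {

    // Hypothetical: cut the raw bytes first, then decode what is left.
    static String cutBytes(byte[] raw, int from, int to) {
        byte[] slice = Arrays.copyOfRange(raw, from, Math.min(to, raw.length));
        return new String(slice, StandardCharsets.UTF_8);
    }

    // Hypothetical: decode to a String first, then cut by char index.
    static String cutChars(byte[] raw, int from, int to) {
        String decoded = new String(raw, StandardCharsets.UTF_8);
        return decoded.substring(from, Math.min(to, decoded.length()));
    }

    public static void main(String[] args) {
        // Three Hangul syllables; each one takes 3 bytes in UTF-8.
        byte[] raw = "\uD55C\uAD6D\uC5B4".getBytes(StandardCharsets.UTF_8);
        System.out.println(raw.length); // 9

        // Cutting decoded chars 0..2 gives the first two syllables.
        String byChars = cutChars(raw, 0, 2);
        System.out.println(byChars.equals("\uD55C\uAD6D")); // true

        // Cutting bytes 0..2 stops in the middle of the first syllable,
        // so the result does not match the char-based cut.
        String byBytes = cutBytes(raw, 0, 2);
        System.out.println(byBytes.equals(byChars)); // false
    }
}
```

    With lazy conversion turned off, the input step hands over ordinary Strings, so a cut operates on characters as in the first case.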
