Hitachi Vantara Pentaho Community Forums

Thread: Pentaho Kettle (PDI) Split Names

  1. #1
    Join Date
    Nov 2008
    Posts
    12

    Pentaho Kettle (PDI) Split Names

    How do I split a full name into first, middle and last name?

    I tried following the Regex Evaluation example given here, but even with the exact sample data I couldn't split the string.

    Here is what I am going for:

    Assume the input field FullName with the following 4 rows:

    Code:
    John Doe
    John Doe Smith
    John Doe Smith Jackson
    John
    The split should be on the space character. For all 4 rows, FirstName should be John (obviously).

    But the LastName output should be:

    Code:
    Doe
    Smith
    Jackson
    <null>
    If a requirement for MiddleName comes up (not urgent, but good to know for information purposes), the output would be:

    Code:
    <null>
    Doe
    Not Sure - may have to clarify with client if that comes up
    <null>
    There is a Split Fields transformation step, but that only splits a field by a given delimiter. That would have worked, except that in some cases the data will have three or even four names, so there is no dynamic way of getting the last name when you don't know how many names will be supplied.
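
    For reference, the target logic above can be sketched outside PDI as a single regular expression. This is only an illustration in Python, not a Regex Evaluation step configuration: the pattern, the name `split_name`, and the choice to join everything between the first and last word into the middle name are assumptions (the 4-name middle-name case is still undecided above).

```python
import re

# Hypothetical pattern (an assumption, not taken from the PDI example):
#   group 1 = first word
#   group 2 = everything between first and last word (optional)
#   group 3 = last word (optional)
NAME_PATTERN = re.compile(r"(\S+)(?:\s+(?:(\S.*?)\s+)?(\S+))?")


def split_name(full_name):
    """Return (first, middle, last); parts that are absent come back as None."""
    match = NAME_PATTERN.fullmatch(full_name.strip())
    if match is None:
        # Empty or whitespace-only input: nothing to split.
        return (None, None, None)
    return match.groups()


for name in ["John Doe", "John Doe Smith", "John Doe Smith Jackson", "John"]:
    print(name, "->", split_name(name))
```

    Whether the same pattern behaves identically inside the Regex Evaluation step has not been checked here.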

  2. #2
    Join Date
    Nov 1999
    Posts
    9,729

    So, assuming you have an ID and the fullname (use a sequence if you don't have an ID), you can use the "Split field to rows" step. Then use a "Group By" step (with the include-all-rows option!) to add the word number and the total number of words in the full name.
    Then all you need to do is keep only the first and last word by filtering on nr=1 and nr=count. Finally, de-normalize the data again to get the values back onto one row in 2 fields (first and last name).
    HTH,
    Matt
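
    The data flow described above can be mimicked in plain Python to see what each step contributes. This is a rough sketch, not PDI configuration: the function names and the `id`, `nr`, `word` and `count` field names are made up for illustration. One assumption is added on top of the description: a single-word name fills only the first name, to match the expected output in post #1.

```python
from collections import defaultdict


def split_field_to_rows(records):
    """One output row per word, numbered from 1 within each id."""
    rows = []
    for rec_id, full_name in records:
        for nr, word in enumerate(full_name.split(), start=1):
            rows.append({"id": rec_id, "nr": nr, "word": word})
    return rows


def add_word_count(rows):
    """Like a group-by that keeps all rows: attach the per-id word count."""
    counts = defaultdict(int)
    for row in rows:
        counts[row["id"]] += 1
    return [dict(row, count=counts[row["id"]]) for row in rows]


def denormalise(rows):
    """Keep rows with nr == 1 or nr == count and fold them back to one row per id."""
    result = {}
    for row in rows:
        names = result.setdefault(row["id"], {"first": None, "last": None})
        if row["nr"] == 1:
            names["first"] = row["word"]
        elif row["nr"] == row["count"]:
            # elif: a one-word name is treated as a first name only (assumption).
            names["last"] = row["word"]
    return result


records = [
    (1, "John Doe"),
    (2, "John Doe Smith"),
    (3, "John Doe Smith Jackson"),
    (4, "John"),
]
names_by_id = denormalise(add_word_count(split_field_to_rows(records)))
for rec_id, names in names_by_id.items():
    print(rec_id, names)
```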

  3. #3
    Join Date
    Nov 2008
    Posts
    12

    Thanks for the input.

    The Split Field to Rows step is working. I don't see the "Group By" option, though. I am using PDI CE 5.4.

    The Filter Rows step also works partially, for the first name. Since I didn't get the count as explained above, I can't get the last name.

  4. #4
    Join Date
    Jun 2012
    Posts
    5,534

    Here's one other way to go about it.
    Attached Files
    So long, and thanks for all the fish.


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.