Hitachi Vantara Pentaho Community Forums

Thread: Carriage Returns and Data Scrubbing

  1. #1
    Join Date
    Jan 2014
    Posts
    23

    I receive a data set with a date field that can hold data in several forms:
    • NULLs
    • Normal Dates
    • Dates with Carriage Returns and (sometimes) additional dates


    I am interested in using PDI to extract the field and strip the values of carriage returns. In the cases where multiple dates are in the field, I would like to keep the last one and discard the others.

    Is this transformation possible? If so, how?

    Cheers.

  2. #2
    Join Date
    Jun 2012
    Posts
    5,534

    Everything is possible - well, almost everything.

    Please provide some test cases in a Data Grid step for download.
    So long, and thanks for all the fish.

  3. #3
    Join Date
    Jan 2014
    Posts
    23

    Thanks, marabu. I keep getting IO errors (#2038) when trying to upload a simple .txt or .xls file.

    Here are some date examples that represent my issue. They can be copied and pasted as text into an Excel file.

    12/5/2014 7:55
    12/5/2014 12:52
    12/5/2014 12:52




    12/5/2014 12:52




    "12/01/2014 08:32:27 PM
    12/03/2014 05:25:57 PM"
    "12/01/2014 08:32:52 PM
    12/06/2014 03:35:05 PM"
    "12/03/2014 07:39:01 AM
    "
    "12/03/2014 05:25:58 PM
    12/06/2014 04:28:57 AM"
    "11/28/2014 12:37:37 PM
    12/04/2014 10:42:19 AM
    12/04/2014 10:42:19 AM"
    "11/26/2014 03:59:14 PM
    12/01/2014 07:25:43 PM"



    Cheers.

  4. #4
    Join Date
    Jun 2012
    Posts
    5,534

    Here's something for you to analyze.
    Attached Files
    Last edited by marabu; 12-12-2014 at 01:46 PM. Reason: improved demo
    So long, and thanks for all the fish.

  5. #5
    Join Date
    Jan 2014
    Posts
    23

    I spent some time working through the demo you provided. Thanks, by the way! It seems the process merely reproduces the original values without transforming them. I am doing some further testing with the steps you suggested. Can you briefly explain what the demo is meant to accomplish?

  6. #6
    Join Date
    Jun 2012
    Posts
    5,534

    The first three steps show how to pick the bottom value from a number of values joined by a line separator.
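
    A minimal Python sketch of that idea, outside PDI and not taken from the attached demo: split the field on line separators and keep the last non-blank line. The name `last_value` is made up for illustration; inside a transformation the same logic would live in whichever step you prefer.

```python
from typing import Optional


def last_value(field: Optional[str]) -> Optional[str]:
    """Return the last non-blank line of a multi-line field, or None.

    Hypothetical helper for illustration; not part of PDI.
    """
    if field is None:
        return None
    # splitlines() breaks on \r\n, \r and \n, so stray carriage returns
    # never end up inside the returned value.
    lines = [line.strip() for line in field.splitlines()]
    non_blank = [line for line in lines if line]
    # NULLs and fields holding only line breaks both map to None.
    return non_blank[-1] if non_blank else None


if __name__ == "__main__":
    print(last_value("12/01/2014 08:32:27 PM\r\n12/03/2014 05:25:57 PM"))
    print(last_value("12/03/2014 07:39:01 AM\n"))
    print(last_value(None))
```

    A field with a trailing line break and nothing after it simply yields its only date.
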
    The remaining steps show how to deal with the different input date formats.
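
    For the formats, here is a hedged Python sketch (again, not the demo itself): try each expected pattern in turn and take the first one that fits. The two patterns are inferred from the sample values posted earlier in this thread and assume month/day/year order; `parse_date` and `FORMATS` are illustrative names, and `%p` relies on an English-style AM/PM locale.

```python
from datetime import datetime
from typing import Optional

# Patterns inferred from the posted samples (assumption: month comes first).
FORMATS = (
    "%m/%d/%Y %I:%M:%S %p",  # e.g. 12/01/2014 08:32:27 PM
    "%m/%d/%Y %H:%M",        # e.g. 12/5/2014 7:55
)


def parse_date(text: Optional[str]) -> Optional[datetime]:
    """Parse a single date string using the first matching pattern.

    Hypothetical helper for illustration; returns None for NULL/blank input.
    """
    if text is None or not text.strip():
        return None
    cleaned = text.strip()
    for fmt in FORMATS:
        try:
            return datetime.strptime(cleaned, fmt)
        except ValueError:
            continue  # this pattern did not fit, try the next one
    raise ValueError(f"unrecognised date format: {cleaned!r}")


if __name__ == "__main__":
    print(parse_date("12/5/2014 7:55"))
    print(parse_date("12/06/2014 03:35:05 PM"))
```

    Unknown formats raise an error instead of passing through silently, so new variants in the feed get noticed.
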
    So long, and thanks for all the fish.

  7. #7
    Join Date
    Jan 2014
    Posts
    23

    The demo only had 3 steps.

  8. #8
    Join Date
    Jun 2012
    Posts
    5,534

    Initially, yes.
    I figured that you might love to see how different formats can be processed in the same transformation.
    So I updated the demo.
    So long, and thanks for all the fish.
