Hitachi Vantara Pentaho Community Forums
Thread: CSV file parsing and validation using Pentaho Kettle

  1. #1
    Join Date
    Nov 2013
    Posts
    6

    CSV file parsing and validation using Pentaho Kettle

    Hi,

    I'm using Pentaho Data Integration (Kettle) 5.0.1.

    Here is what I'm trying to do.

    I have a data file (the input), say a .csv file, with the following contents:

    21,John,FL
    23,,MI
    2p,Taylor,FL
    25,Tony,,



    I also have a text file, say config.txt, in which I define the schema of the data file:

    id,integer
    name,string,NOT NULL
    location,string



    I want to create a job/transformation in Kettle that, when it reads the data file, takes the schema from config.txt and validates the data by data type, field length, and nullability. If an invalid record is found in the data file, it should be moved to an error file, and the valid records should be written to HDFS.

    So in the above example, the expected result (good records) is:

    id,name,location
    21,John,FL
    25,Tony,,


    The error file records are:
    id,name,location
    23,,MI --> 2nd field value cannot be NULL
    2p,Taylor,FL --> 1st field value should be integer type


    I do not want to create the schema on the fly. Please suggest how to read the data file using config.txt for the schema, how to do the validation, and how to move the validated data to HDFS.

    Please let me know if the scenario is not clear.

    Thanks,
    Shree
    Last edited by shree; 04-16-2014 at 07:31 AM.

  2. #2
    Join Date
    Nov 2013
    Posts
    6

    Hi,

    Are there any updates on my post above?

    Thanks,
    Shree

  3. #3
    Join Date
    Jun 2012
    Posts
    5,534

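    ETL Metadata Injection is the Kettle feature you will need to employ. It lets one transformation read config.txt and inject the field names and types into a template transformation at run time, so the schema does not have to be hard-coded in the input step.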
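
    To make the validation rules from the question concrete, here is a minimal sketch in plain Python. It is not Kettle code and not the metadata injection setup itself; it only illustrates the logic a config-driven validation would have to implement. The function names (parse_schema, validate_row, split_records) are hypothetical. It assumes that an empty value counts as NULL, that extra trailing empty fields (as in "25,Tony,,") are tolerated, and that only the "integer" type and the "NOT NULL" flag need checking. Field length is not checked because config.txt does not define one, and writing to HDFS is left out.

```python
import csv
import io


def parse_schema(config_text):
    """Parse lines such as 'name,string,NOT NULL' into (name, type, not_null)."""
    schema = []
    for line in config_text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        name, dtype = parts[0], parts[1].lower()
        not_null = any(p.upper() == "NOT NULL" for p in parts[2:])
        schema.append((name, dtype, not_null))
    return schema


def validate_row(row, schema):
    """Return an error message for the first invalid field, or None if valid."""
    for pos, (name, dtype, not_null) in enumerate(schema, start=1):
        # Assumption: a missing or empty field is treated as NULL.
        value = row[pos - 1].strip() if pos <= len(row) else ""
        if value == "":
            if not_null:
                return f"field {pos} ({name}) cannot be NULL"
            continue
        if dtype == "integer":
            try:
                int(value)
            except ValueError:
                return f"field {pos} ({name}) should be integer type"
    return None


def split_records(data_text, schema):
    """Split CSV text into good rows and (row, error) pairs."""
    good, bad = [], []
    for row in csv.reader(io.StringIO(data_text)):
        if not row:
            continue  # skip blank lines
        error = validate_row(row, schema)
        if error is None:
            good.append(row)
        else:
            bad.append((row, error))
    return good, bad


# Sample config and data taken from the first post.
CONFIG = """id,integer
name,string,NOT NULL
location,string
"""
DATA = """21,John,FL
23,,MI
2p,Taylor,FL
25,Tony,,
"""

schema = parse_schema(CONFIG)
good, bad = split_records(DATA, schema)

for row in good:
    print("GOOD ", ",".join(row))
for row, error in bad:
    print("ERROR", ",".join(row), "-->", error)
```

    With the sample data from the first post, rows 21 and 25 end up in the good list, and rows 23 and 2p end up in the error list with a reason attached, which matches the expected result given in the question.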
    So long, and thanks for all the fish.
