Hitachi Vantara Pentaho Community Forums

Thread: Import text files with different layouts

  1. #1

    Friends,


    I'm trying to import about 3000 text files that have slightly different layouts. The differences are small: a set of basic fields that I need for processing is common to all 3000 files. The example below shows what the files look like.

    "Arq1.txt"
    field_A  field_B  field_C  field_D
    VALUE    VALUE    VALUE    VALUE
    VALUE    VALUE    VALUE    VALUE

    "Arq2.txt"
    field_A  field_B  field_E
    VALUE    VALUE    VALUE
    VALUE    VALUE    VALUE

    "Arq3.txt"
    field_A  field_B  field_G  field_H
    VALUE    VALUE    VALUE    VALUE
    VALUE    VALUE    VALUE    VALUE



    Note that the three files have the fields "field_A" and "field_B" in common, and those are the fields I actually need to complete the transformation.

    However, if I enter only the fields I need on the Fields tab of the <Text file input> step, the fields get out of sequence and PDI reads values from the wrong positions. On the other hand, if I enter all the fields, Kettle returns an error saying that a certain field could not be found in the source file.
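The position problem described above goes away when columns are looked up by header name instead of by position. Outside PDI, the idea can be sketched in Python as follows. The semicolon delimiter, the sample values, and the helper name `read_common_fields` are assumptions made for illustration; the thread does not state the real delimiter.

```python
import csv
import io


def read_common_fields(stream, wanted=("field_A", "field_B"), delimiter=";"):
    """Yield only the wanted columns from one file, looked up by header name."""
    reader = csv.DictReader(stream, delimiter=delimiter)
    # Fail early if a file lacks one of the fields every file should share.
    missing = [name for name in wanted if name not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    for row in reader:
        yield {name: row[name] for name in wanted}


# Two in-memory stand-ins for files with different layouts (hypothetical data).
arq1 = io.StringIO("field_A;field_B;field_C;field_D\na1;b1;c1;d1\na2;b2;c2;d2\n")
arq2 = io.StringIO("field_B;field_E;field_A\nb3;e3;a3\n")

rows = list(read_common_fields(arq1)) + list(read_common_fields(arq2))
for row in rows:
    print(row)
```

Because each value is fetched by name, extra columns or a different column order in a given file do not shift the values of field_A and field_B.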

    Does anyone have an idea how I could read the 3000 files while handling their different layouts?


    Thanks a lot, and greetings from Brazil.

  2. #2

    Hi.
    You should look into how to use the Metadata Injection step.
    You should be able to find an example on the wiki. Sorry, but I don't have the link at hand!
    -- Mick --

  3. #3

    Hi Mick,


    Before I get to my question, I would like to thank you for pointing me to a solution to my problem.


    Well, I studied the documentation for the ETL Metadata Injection step and I understand how it works, but I still have a question. I think someone here on the forum can help me.


    As I said in the first post, I have about 3000 files to import. From what I understand of the ETL Metadata Injection step, the metadata to inject (in my case, the names of the fields in the text files) has to be supplied or defined in advance, right?


    So how do I do that for so many files? This is the part of building the transformation that I do not understand very well.

    How do I inject the layout of each file into the CSV-reading step of the template transformation, considering that the position of the fields can vary from one text file to another?
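One way to picture the metadata in question: read only the header line of each file and record which field sits at which position. The sketch below is plain Python, not PDI, and it is only an illustration of the data shape. The semicolon delimiter, the sample contents, and the helper name `describe_layout` are assumptions. In PDI, rows like these would be the kind of input that a parent transformation passes to the ETL Metadata Injection step.

```python
import csv
import io


def describe_layout(name, stream, delimiter=";"):
    """Return one (file, position, field name) record per header column."""
    # Only the first line is read; an empty file yields an empty layout.
    header = next(csv.reader(stream, delimiter=delimiter), [])
    return [(name, pos, field.strip()) for pos, field in enumerate(header)]


# In-memory stand-ins for two of the files (hypothetical contents).
files = {
    "Arq1.txt": io.StringIO("field_A;field_B;field_C;field_D\n1;2;3;4\n"),
    "Arq2.txt": io.StringIO("field_A;field_B;field_E\n1;2;3\n"),
}

layouts = [
    record
    for name, stream in files.items()
    for record in describe_layout(name, stream)
]
for record in layouts:
    print(record)
```

With the layout discovered per file at run time, nothing has to be typed in by hand for each of the files.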


    Could you help me again, please?


    Thank you very much!

  4. #4

    Friends,


    Before anyone could reply, I was fortunate to find an article by Matt Casters <http://www.ibridge.be/?p=273> that describes exactly the solution I need.

    In keeping with the spirit of a forum, I am posting the solution to my problem here so that it can help others.



    Thank you all for your help.


    Greetings from Brazil.
    Last edited by wisleyvelasco; 04-30-2016 at 09:00 PM.

  5. #5

    Friends,


    I always get help here on the forum, and I have learned many things by searching it. So I'm coming back to this thread to leave a reference that helped me a lot with the central issue of this request for help.


    The article was written by a Brazilian colleague who works with the Pentaho suite. I hope it helps strengthen the community and helps anyone who, like me, needs help.


    Link: http://blog.oncase.com.br/using-pdi-...ort-csv-files/


    Hugs to all and thank you!

    Originally posted by Mick_data:
    Hi.
    You should check how to use Metadata Injection step.
    On the wiki you should be able to find an example - sorry but don't have link to copy!

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.