Hitachi Vantara Pentaho Community Forums
Results 1 to 11 of 11

Thread: plz help:::::transformation from text to table

  1. #1
    Join Date
    Jul 2007
    Posts
    9

    Unhappy plz help:::::transformation from text to table

    am tryin to run transformation from text file to a table


    my in put data is like this

    Rec#1
    Emp_id=1001, Emp_name=satish,
    Dept=101
    Age=30
    Sex M


    Rec#2
    Emp_id=1002, Emp_name=ram,
    Dept=102
    Age=33
    Sex M


    Rec#3
    Emp_id=1003, Emp_name=rahim,
    Dept=103
    Age=31
    Sex M


    Rec#4
    Emp_id=1004, Emp_name=kishor,
    Dept=104
    Age=52
    Sex M


    Rec#5
    Emp_id=1005, Emp_name=kiran,
    Dept=105
    Age=24
    Sex F


    theres no unformity in distribution of data......plz help me out

  2. #2
    Join Date
    May 2006
    Posts
    4,882

    Default

    Start with a text input step, possibly follow it by a select values step, and then by an insert/update step.

    Or what problem do you have?

    Regards,
    Sven

  3. #3
    Join Date
    Jul 2007
    Posts
    9

    Unhappy

    i got an example from samples on a fixed length file......
    its working fine for me....
    but in the file which i got the fields are distributed horizontally.....
    i gave the file type as fixed....
    so tht i will be able to select my own fields......
    but its allowing me to divide that fields on vertical basis only

  4. #4
    Join Date
    May 2006
    Posts
    4,882

    Default

    well... horizontally you can't select as those are the rows.

    If you want to select rows you would have e.g. to read it in, process all lines and then use a select values to select only the rows you want.

    Regards,
    Sven

  5. #5
    Join Date
    Jul 2007
    Posts
    9

    Post

    is there any way to resolve this problem......using some scripting tool....any way using javascript???????????????

  6. #6
    Join Date
    May 2006
    Posts
    4,882

    Default

    Essentially it's the same... but my solution is easier if you have fixed width rows:

    Solution 1:
    - Read in the lines as 1 big string
    - Use Javascript to select the rows you want
    - Split them up further in fields: problem here is that no step yet exists to do this (in a fixed length way)

    Solution 2 (as in the previous post):
    - Read in all lines using text input using the proper fields (asuming you have fixed width in all lines)
    - Select values to only select the rows you want.

    Then again we may be talking about something completely different, maybe it's best for you to attach a small sample of the file you want to process

    Regards,
    Sven

  7. #7
    Join Date
    Jul 2007
    Posts
    9

    Red face

    ok,thanks alot....i will try out on the methods you gave me.....am attaching the text file i got with me......please check it,and plz let me know,if u got the result.......
    Attached Files Attached Files
    • File Type: txt 1.txt (318 Bytes, 12 views)

  8. #8
    Join Date
    May 2006
    Posts
    4,882

    Default

    IC ... forget the previous posts in this thread. The only way out of it which I see now is the pre-process with something else then Kettle. In perl it's about 40 lines of code maybe.

    You have to get all data of a single employee on a single row, only keeping the values (not the keywords), defaulting any missing values (if you would have them), and using separators between all of them.

    Regards,
    Sven
    Last edited by sboden; 07-27-2007 at 06:54 AM.

  9. #9
    Join Date
    Jul 2007
    Posts
    9

    Unhappy

    am not well versed in perl...if u can...plz can u give tht code

  10. #10
    Join Date
    May 2006
    Posts
    4,882

    Default

    I can't start making pre-process job for everyone... time is not on my side . Try for it yourself.. it's not that hard.

    Regards,
    Sven

  11. #11
    Join Date
    Jul 2007
    Posts
    247

    Default

    no perl needed actually, you could do it by using kettle only

    How to start:

    The main idea is to combine those five rows that represent one record to a single one. Fortunately, there's a parameter "number of wrapped lines" in the text file input step one could use. Setting this parameter to 5 will get you one neat row for every record (one field that contains the whole string), but the first one looks somehow broken... To fix this, insert two dummy lines at the beginning of the file or use the attached transform to handle this.

    Now we got one row for each record, but how to split it into separate fields? By looking at the data you will see that almost every "field" has its own name as a king of prefix (e.g. emp_id=), but not every field though. Use javascript to replace the "#", ", " and the " " characters from the string. Doing this will get you a string like col1=value1col2=value2col3=value3...

    To split the fields we need a separator. Use JavaScript again to add it to the string by replacing "col1=" with ",col1=" and so on.

    Afterwards, use the split field step to split the string into separate field, don't forget to use the "ID" parameter.

    Altogether, you just need to use 4 different steps to solve this

    See my two transformations for a detailed approach. Note: the column names are fixed in my transformations, so whenever the layout of your data changes (column rename etc.) you'll have to change the transformations as well...

    Regards,
    Ben


    P.S. The best approach of course is not to work with such *!%$ kind of data
    Attached Files Attached Files

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.