Hitachi Vantara Pentaho Community Forums
Results 1 to 9 of 9

Thread: Organize a messy text file to export to DB

  1. #1

    Default Organize a messy text file to export to DB

    Hi, I'm starting to work with PDI Kettle and I'm new in this forum. I need some help.

    I have a huge problem with a text file exported from a ERP System, because the data from this file there isn't organization and there aren't separations.

    Here is a sample:

    Code:
    NAME OF COMPANY                                                                                                        PAG:    1
    ssw0216                                          DEMONSTRATIVO DE FRETES DE COLETA/ENTREGA REALIZADA                     12/01/16  07:56
    UNIDADE: NAME OF UNIT                                                                           PERIODO: 01/01/16 A 08/01/16
    VEICULO: XYZ1234     TIPO: TRUCK     CONTRATADO..: NAME OF UNIT
    ------------+---------+--------------------+--------------------+---+------+-----+------------+-----------+-----------+--------+---------
    CTRC         NF        REMETENTE            DESTINATARIO         SET PESO   QTVOL     VAL MERC   VLR FRETE     % FRETE ROMANEIO CTRB
    ------------+---------+--------------------+--------------------+---+------+-----+------------+-----------+-----------+--------+---------
    DIA 07/01/16  COLETA
    BMV/020520-6    238072 NAME OF CLIENT 1 03  4.155,2     8    16.000,00    1.031,79        0,00      0-0 
    SUB-TOTAL  COLETA        1 COLETAS/   1 CLIENTES                     4.155,2     8    16.000,00    1.031,79
    
    TOTAL DO DIA -     1 EVENTOS/  1 CLIENTES                            4.155,2     8    16.000,00    1.031,79
      DIARIA                                           239,73
      KM RODADOS       0 (R$    1,131/KM)                0,00
      EVENTOS/CLIENTES                                   0,00
      PESO                                               0,00
      % SOBRE FRETE                                      0,00          REMUNERACAO DE      239,73
      ICMS                                               0,00
    
    DIA 08/01/16  COLETA
    BMV/020521-4     97537 "NAME OF CLIENT 2" 016   711,9    30    16.068,22      301,42        0,00      0-0 
    BMV/020522-2     97540 "NAME OF CLIENT 2" 016    45,6     6       194,57       84,53        0,00      0-0 
    .
    .
    .
    
    TOTAL DO DIA -     7 EVENTOS/  1 CLIENTES                            4.326,5    63    60.920,13    1.687,94
      DIARIA                                           239,73
      KM RODADOS       0 (R$    1,131/KM)                0,00
      EVENTOS/CLIENTES                                   0,00
      PESO                                               0,00
      % SOBRE FRETE                                      0,00          REMUNERACAO DE      532,73
      ICMS                                             131,71
    I need the information in bold, and the transformation will stay like this:

    Code:
    VEICULO;TIPO;DIA;TIPO_OP;CTRC;NF;REMETENTE;...;KM_RODADOS;REMUNERACAO;
    XYZ1234;TRUCK;07/01/16;COLETA;BMV/020520-6;238072 ;NAME OF CLIENT 1;0; 239,73;
    XYZ1234;TRUCK;08/01/16;COLETA;BMV/020521-4;97537;NAME OF CLIENT 2;0; 532,73;
    XYZ1234;TRUCK;08/01/16;COLETA;BMV/020522-2;97540 ;NAME OF CLIENT 2;0; 532,73;
    But, first of all, i need to split the separate information someway because I just could extract the information above, missing others datas:

    Code:
    CTRC;NF;REMETENTE;...
    BMV/020520-6;238072;NAME OF CLIENT 1;...
    BMV/020521-4;97537;NAME OF CLIENT 2;...
    BMV/020522-2;97540 ;NAME OF CLIENT 2;...
    Can someone help me?

    ps: Sorry for my english, because I'm brazilian and I can't speak english very well...

  2. #2
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    This looks like a report not meant for further processing.
    You'll have to identify the parts of that report, e.g. group-header, detail-row, a.s.o.
    Later you can use step Strings-Cut to extract field values.
    Can you provide a sample report for download - not only part of the first page?
    Or is it a single page everytime?
    So long, and thanks for all the fish.

  3. #3

    Default

    Yes but just in this report I have all necessary informations.

    Are there tools in spool that I can identify or split the parts like group-header, detail-row, among others?

    Here is the document, and there are many pages. And I will work with 8 reports to affiliates differents.

    Thanks for the support.
    Attached Files Attached Files

  4. #4
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    Here's an unfinished demo for you to analyze.
    Attached Files Attached Files
    So long, and thanks for all the fish.

  5. #5

    Default

    Thanks buddy... I will analyze and study this demo.

    I couldn't answer before because I was sick.

    I'll soon be here again.

  6. #6

    Default

    Quote Originally Posted by marabu View Post
    Here's an unfinished demo for you to analyze.
    So, I analized this demo and did some changes, but I'm confused with a situation.

    The issue is, I have some rows and the _kmr field of these rows are null, and a row with filled _kmr field , and the project continues with some rows with null _kmr field and a row with filled _kmr field.

    I'm trying to fill these empty _kmr fields with the next non null _kmr value, but I can't do this any way.

    e.g.

    The project is like this:

    rownum _khdr _kmr kmi km_rodado
    1 0 <null> 0 <null>
    2 0 <null> 1 <null>
    3 0 <null> 2 <null>
    4 0 <null> 3 <null>
    5 1 114 0 <null>
    6 0 <null> 0 <null>
    7 0 <null> 1 <null>
    8 0 <null> 2 <null>
    9 0 <null> 3 <null>
    10 1 130 0 <null>


    And I expect something like this:

    rownum _khdr _kmr kmi km_rodado
    1 0 <null> 0 114
    2 0 <null> 1 114
    3 0 <null> 2 114
    4 0 <null> 3 114
    5 1 114 0 114
    6 0 <null> 0 130
    7 0 <null> 1 130
    8 0 <null> 2 130
    9 0 <null> 3 130
    10 1 130 0 130


    May you help me??
    Attached Files Attached Files
    Last edited by thiagofred; 01-26-2016 at 09:25 AM.

  7. #7

    Default Fill with the next non null value

    Hello guys.

    The issue is this, I have some rows and the _kmr field of these rows are null, and a row with filled _kmr field , and the project continues with some rows with null _kmr field and a row with filled _kmr field.

    I'm trying to fill these empty _kmr fields with the next non null _kmr value, but I can't do this any way.

    e.g.

    The project is like this:

    rownum _khdr _kmr kmi km_rodado
    1 0 <null> 0 <null>
    2 0 <null> 1 <null>
    3 0 <null> 2 <null>
    4 0 <null> 3 <null>
    5 1 114 0 <null>
    6 0 <null> 0 <null>
    7 0 <null> 1 <null>
    8 0 <null> 2 <null>
    9 0 <null> 3 <null>
    10 1 130 0 <null>


    And I expect something like this:

    rownum _khdr _kmr kmi km_rodado
    1 0 <null> 0 114
    2 0 <null> 1 114
    3 0 <null> 2 114
    4 0 <null> 3 114
    5 1 114 0 114
    6 0 <null> 0 130
    7 0 <null> 1 130
    8 0 <null> 2 130
    9 0 <null> 3 130
    10 1 130 0 130


    I could to use the Group By, if the _kmr value was on the beginning, but in my case is the opposite.

    What steps can I use to do this?

    Thanks.

  8. #8
    Join Date
    Apr 2008
    Posts
    1,771

    Default

    I could to use the Group By, if the _kmr value was on the beginning, but in my case is the opposite.
    What about sorting your data upside down and then do a group by?
    -- Mick --

  9. #9

    Default

    Quote Originally Posted by Mick_data View Post
    What about sorting your data upside down and then do a group by?
    Mick, thank you so much...

    So easy, but I couldn't think this way...

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.