Hitachi Vantara Pentaho Community Forums
Results 1 to 9 of 9

Thread: Issues with variable substrings

  1. #1

    Default Issues with variable substrings

    Hi all,

    Once again I need the communities' help regarding Kettle v4.1.

    I have in one of my tables a column which contents I want to transform into other columns.

    I shall give an example of this:

    • In table A I have a column, let's say, attributedescriptions with the following content:

      Attribute_1=Title{CR}
      Attribute_2=Company{CR}
      Attribute_3=Department{CR}


    Now, the number of attributes is variable, and these are separated with {CR}, and I want to transform these attributes to another table, into columns.

    That being said, the outcome of the transformation should be:

    • TABLE B:

      Column 1 - Column_Name = Title. Cell value: Dr.
      Column 2 - Column_Name = Company. Cell value: Pentaho
      Column 3 - Column_Name = Department. Cell value: Health


    How can I do this?

    Best regards,

    --jr, Portugla

  2. #2
    Join Date
    Aug 2008
    Posts
    563

    Default

    You can use the Split Fields step to do this.
    Best regards,
    Diethard
    ===============
    Visit my Pentaho blog which offers some tutorials mainly on Kettle, Report Designer and Mondrian
    ===============

  3. #3

    Default

    Quote Originally Posted by diddy View Post
    You can use the Split Fields step to do this.
    Hi Diddy,

    Thanks for your reply.

    I've tried that step. However, since the input is like this (see picture) I'm not able to perform the step as i wished to.

    http://img266.imageshack.us/img266/8522/21328429.png

  4. #4
    Join Date
    Aug 2008
    Posts
    563

    Default

    It should still work ... just make sure you mention the correct delimiter.
    If the delimiter is causing the problems, maybe you can replace it beforehand with something friendlier, and then mention this new delimiter in the split values step.
    What's your delimiter? {CR} for carriage return?
    Best regards,
    Diethard
    ===============
    Visit my Pentaho blog which offers some tutorials mainly on Kettle, Report Designer and Mondrian
    ===============

  5. #5
    Join Date
    Sep 2009
    Posts
    810

    Default

    Hi joaoromao,
    If you'd post a short demo file with a few records, I'd create a sample for you.

    I suppose it would be a variation on this theme: http://type-exit.org/adventures-with...lds-in-kettle/

    Cheers

    Slawo

  6. #6

    Default

    Quote Originally Posted by slawomir.chodnicki View Post
    Hi joaoromao,
    If you'd post a short demo file with a few records, I'd create a sample for you.

    I suppose it would be a variation on this theme: http://type-exit.org/adventures-with...lds-in-kettle/

    Cheers

    Slawo
    Hey there Slawo,

    I will take a look on that link.

    Nevertheless, let me post here the issue.

    In Table A, i have this column (see picture), in which I remind the number of attributes is variable.



    My output goal is to have, on another table (B) the following structure:

    Column_Name1: Título AND Column_Name2: Empresa.
    Row_Value: (this is from another query, don't worry)

    Is this complete enough for your understanding?

    If not, reply to this post please.

  7. #7
    Join Date
    Sep 2009
    Posts
    810

    Default

    Hi joaoromao,

    do I understand correctly, that the resulting row structure would have to be variable as well? It is one thing to extract the two attributes, but quite another to have a variable row structure resulting from that. At the moment it is not possible to have different columns on different rows in the row stream.

    You may generate SQL tables of different structure based on the attributes extracted from the attributes column, though.

    Regarding the sample data I was asking for a file (mysql dump of a short demo table maybe?) so I could handle the whitespace characters correctly (the newlines you're having can be any combination of \r and \n chars). The screenshot does not tell me how the whitespace is encoded, which is relevant, because we need to split on that

    Cheers

    Slawo

  8. #8

    Default

    Quote Originally Posted by slawomir.chodnicki View Post
    Hi joaoromao,

    do I understand correctly, that the resulting row structure would have to be variable as well? It is one thing to extract the two attributes, but quite another to have a variable row structure resulting from that. At the moment it is not possible to have different columns on different rows in the row stream.

    You may generate SQL tables of different structure based on the attributes extracted from the attributes column, though.

    Regarding the sample data I was asking for a file (mysql dump of a short demo table maybe?) so I could handle the whitespace characters correctly (the newlines you're having can be any combination of \r and \n chars). The screenshot does not tell me how the whitespace is encoded, which is relevant, because we need to split on that

    Cheers

    Slawo
    Hi Slawo,

    Thank you for your help, but i've managed to achieve a row with multiple attributes separated by comma (see images).

    http://img688.imageshack.us/img688/5250/att2x.png

    Thank you for your kindness anyway,

    Regards,

    --jr

  9. #9

    Default

    And yes, it was a \n and \r as you can see

    HTML Code:
    var desc = replace(attributedescriptions, "\\r","","\\n",",");

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.