Hitachi Vantara Pentaho Community Forums

Thread: Add new column based on the condition

  1. #1

    Default Add new column based on the condition

    Hello, can you please help me add a new column to the text output (.csv) based on a condition?

    For example, if the similarity of the fields is >= 0.8, then add a new column with the value 1.

    I am combining the results of 3 calculators into one formula. I need to somehow include the name of the step and the name of the field.

    Thanks!
    Last edited by marijamilojevic87@yahoo.c; 09-28-2017 at 01:19 PM.

  2. #2

    Default How to get and use the name of the step in the formula or calculator

    Hello all,

    can you please help me find a way to combine multiple steps into one formula?

    What I am trying to achieve is to calculate [JW L1 M2][FieldA]/3 + [JW M1 L1][FieldA]/3 = new field


  3. #3
    Join Date
    May 2016
    Posts
    282

    Default

    So you have something like this:
    Dataset1
    col1  colDS1
    A     0.3
    B     0.9

    Dataset2
    col1  colDS2
    C     0.2
    D     0.85

    etc.
    And you want to end up with a .csv like this?
    col1  colDS  NewCol
    A     0.3
    B     0.9    colDS1
    C     0.2
    D     0.85   colDS2
    ...
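
    To make the target layout concrete, here is a minimal Python sketch of the same logic outside of PDI. It is not a transformation, just an illustration: the helper names tag_rows and to_csv are made up for this example, and the 0.8 threshold is the one from the first post.

```python
import csv
import io

THRESHOLD = 0.8  # similarity cut-off taken from the first post


def tag_rows(rows, value_column):
    """Return (col1, colDS, NewCol) tuples.

    NewCol holds the name of the source column when the value reaches
    THRESHOLD, and is left empty otherwise.
    """
    tagged = []
    for key, value in rows:
        origin = value_column if value >= THRESHOLD else ""
        tagged.append((key, value, origin))
    return tagged


def to_csv(rows):
    """Serialise the combined rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["col1", "colDS", "NewCol"])
    writer.writerows(rows)
    return buffer.getvalue()


# The two sample datasets from this post.
dataset1 = [("A", 0.3), ("B", 0.9)]
dataset2 = [("C", 0.2), ("D", 0.85)]

# Tag each dataset with its own column name, then append one to the other.
combined = tag_rows(dataset1, "colDS1") + tag_rows(dataset2, "colDS2")
csv_text = to_csv(combined)
```

    Only rows B and D get a value in NewCol, which matches the table above.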
    OS: Ubuntu 16.04 64 bits
    Java: Openjdk 1.8.0_131
    Pentaho 6.1 CE

  4. #4

    Default

    yes, exactly!

  5. #5
    Join Date
    Sep 2011
    Posts
    152

    Default

    I think you will have to use a Cartesian product step to get all the fields, and then apply a condition to keep only the rows you need.

  6. #6
    Join Date
    Sep 2011
    Posts
    152

    Default

    We can use a Dummy step to combine the fields and then a Formula step to compute the final fields.
    Please find the attached sample.

  7. #7
    Join Date
    May 2016
    Posts
    282

    Default

    Hi @rajeshbcrec,
    How did you manage to get the Dummy step not to throw an error when combining the two datasets into one stream? If I try to replicate it in a new step, I get an error because the ds1 column has a different name than the ds2 column. If I just run your sample without modifying it, I get this:
    col1  ds1   ds2
    A     0.3   null
    B     0.9   null
    C     0.2   null
    D     0.85  null

    Anyway, that's not exactly what the OP is trying to do (I think, at least going by my example), because it doesn't produce a column telling whether the data comes from the ds1 column or the ds2 column. You need to add a constant with the name of the column: combine two fields new column origin.ktr

    Edit: I had to change the format of the number columns because in Spanish we use "," as the decimal separator; you'll have to use whatever separator works in your locale.
    OS: Ubuntu 16.04 64 bits
    Java: Openjdk 1.8.0_131
    Pentaho 6.1 CE

  8. #8
    Join Date
    Apr 2008
    Posts
    4,696

    Default

    Just pointing out that you already know the names of the steps...
    So instead of sending "JW L1 M2" directly into the Formula step, point it to its own "Add Constants" step, so that the same column name is added after each JW step.
    Call this column "DS" with a type of string, length 8.

    Now the data has the same layout across all the streams, and they can be joined with any step type you want...
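
    As a rough illustration of that idea outside of PDI, the Python sketch below tags each stream with a constant DS field, appends the streams, and then applies the formula from post #2. The "id" key and the sample values are assumptions made up for the example; only the step names and FieldA come from this thread. Note that "JW L1 M2" is exactly 8 characters, which fits the suggested length.

```python
from collections import defaultdict


def add_constant(rows, ds_name):
    """Mimic an "Add Constants" step: append the same DS field to every row."""
    return [{**row, "DS": ds_name} for row in rows]


# Hypothetical rows leaving two of the JW steps ("id" and the values are made up).
jw_l1_m2 = [{"id": 1, "FieldA": 0.9}, {"id": 2, "FieldA": 0.3}]
jw_m1_l1 = [{"id": 1, "FieldA": 0.6}, {"id": 2, "FieldA": 0.6}]

# After tagging, every stream has the same fields (id, FieldA, DS),
# so the streams can simply be appended into one.
merged = add_constant(jw_l1_m2, "JW L1 M2") + add_constant(jw_m1_l1, "JW M1 L1")

# Formula from post #2: sum FieldA / 3 over the streams, per id.
combined = defaultdict(float)
for row in merged:
    combined[row["id"]] += row["FieldA"] / 3
```

    The DS field keeps the origin of every row available after the merge, so a later condition can still refer to the step a value came from.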

  9. #9
    Join Date
    Sep 2011
    Posts
    152

    Default

    Hi Ana,
    Ignore the Dummy step error. You can also use the NVL function if you are not sure which column will have data.
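
    For anyone unfamiliar with NVL, here is a small Python sketch of what it does, assuming the usual meaning NVL(a, b) = a unless a is null, otherwise b. The nvl helper and the sample rows are made up for illustration and use the ds1/ds2 column names from earlier in the thread.

```python
def nvl(value, fallback):
    """Return value unless it is None (null), in which case return fallback."""
    return fallback if value is None else value


# Hypothetical merged rows where only one of ds1 / ds2 is filled per row.
rows = [
    {"col1": "A", "ds1": 0.3, "ds2": None},
    {"col1": "D", "ds1": None, "ds2": 0.85},
]

# Collapse the two columns into a single colDS value per row.
result = [{"col1": row["col1"], "colDS": nvl(row["ds1"], row["ds2"])} for row in rows]
```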
