Hitachi Vantara Pentaho Community Forums
Results 1 to 12 of 12

Thread: Database lookup/Filter rows

  1. #1
    Join Date
    Nov 2013
    Posts
    24

    Default COlumnar lookup

    I have a table with some variable IDs and labels corresponding to it.

    My .csv input comes in with some customer names and labels corresponding to it. I need to look up the label in the first table, find the corresponding variable_id and then populate the second table accordingly. Is there an efficient way to do this without writing custom SQL scripts?
    Last edited by odomaeb; 12-18-2013 at 04:28 PM.

  2. #2
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    Matt sums it up nicely: Transformation step optimization
    So long, and thanks for all the fish.

  3. #3
    Join Date
    Nov 2013
    Posts
    24

    Default

    Table 1:
    variable_id
    label

    Table 2:
    person_id
    variable_id
    var_value

    Input:
    customer name label 1 label 2... label n
    Andy .....
    Bob....

    When I get the input record, I need to find the variable_id corresponding to label 1, and populate the value in table 2 accordingly.

    A bit different than the row sorting. How do I do this efficiently?

  4. #4
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    Quote Originally Posted by odomaeb View Post
    A bit different than the row sorting.
    The answer is still in Matt's post.
    Sorted rows are a requirement, not a solution.
    Attached Files Attached Files
    So long, and thanks for all the fish.

  5. #5
    Join Date
    Apr 2008
    Posts
    4,696

    Default

    Based on this thread and http://forums.pentaho.com/showthread...ultiple-tables

    I get the feeling you aren't telling us the whole story...

    The best way to build transformations is to think about what steps you would take with each row of data.
    1) Read it from the Text File
    2) Look up the Customer_ID based on their name
    3) Break the row into additional rows for each label
    4) Look up the Label_ID based on the Label Name
    5) Output the Customer_ID, and the Label_ID, and the value into a table
    **THIS IS A SIGNATURE - IT GETS POSTED ON (ALMOST) EVERY POST**
    I'm no expert.
    Take my comments at your own risk.

    PDI user since PDI 3.1
    PDI on Windows 7 & Linux

    Please keep in mind (and this may not apply to this thread):
    No forum member is going to do your work for you. We will help you sort out how to do a specific part of the work, as best we can, in the timelines that our work will allow us.
    Signature Updated: 2014-06-30

  6. #6
    Join Date
    Nov 2013
    Posts
    24

    Default

    Sorry I didnt explain it well.

    Right now, I have a .csv input-> database table 1 lookup->filter rows-> table 3 output for populating person_id.
    .csv input-> database table 2 lookup->filter rows-> table 3 output for populating var_id.

    But my table 3 looks like this at the end of the above transformations.

    person_id var_id value
    1 0
    2 0
    3 0
    0 1
    0 2
    0 3
    But I want it to look like this:
    person_id var_id value
    1 1
    2 2
    3 3

  7. #7
    Join Date
    Apr 2008
    Posts
    4,696

    Default

    And as mentioned on the other thread, you need to go:
    csv input -> database table 1 lookup ->database table 2 lookup -> table 3 output
    **THIS IS A SIGNATURE - IT GETS POSTED ON (ALMOST) EVERY POST**
    I'm no expert.
    Take my comments at your own risk.

    PDI user since PDI 3.1
    PDI on Windows 7 & Linux

    Please keep in mind (and this may not apply to this thread):
    No forum member is going to do your work for you. We will help you sort out how to do a specific part of the work, as best we can, in the timelines that our work will allow us.
    Signature Updated: 2014-06-30

  8. #8
    Join Date
    Nov 2013
    Posts
    24

    Default

    Thanks so much!

    That works if I run my transformation once. But if I run it a second time, it gives me an error " Duplicate entry XXX for PRIMARY KEY"

  9. #9
    Join Date
    Apr 2008
    Posts
    4,696

    Default

    Ok, so what's your key?
    **THIS IS A SIGNATURE - IT GETS POSTED ON (ALMOST) EVERY POST**
    I'm no expert.
    Take my comments at your own risk.

    PDI user since PDI 3.1
    PDI on Windows 7 & Linux

    Please keep in mind (and this may not apply to this thread):
    No forum member is going to do your work for you. We will help you sort out how to do a specific part of the work, as best we can, in the timelines that our work will allow us.
    Signature Updated: 2014-06-30

  10. #10
    Join Date
    Nov 2013
    Posts
    24

    Default

    Both person_id and var_id are primary keys.

  11. #11
    Join Date
    Apr 2008
    Posts
    4,696

    Default

    Ok,

    So if you are sending the same data through your transform (ie. same person name and same label name ) then you should expect the get key collisions.

    Assume for a moment that you have:
    Person Table:
    person_id | person name
    1 | odomaeb
    2 | gutlez
    3 | marabu

    Label Table:
    var_id | label
    1 | label1
    2 | label2
    3 | label3

    Incoming source:
    "odomaeb","label1", 1,"label3",3
    "gutlez","label2",2,"label3",6

    Then the first time you will get data in the table:
    person_id | var_id | value
    1 | 1 | 1
    1 | 3 | 3
    2 | 2 | 2
    2 | 3 | 6

    But when you try to run it again, you'll try to insert 1 | 1 | 1 again, resulting in a key violation.

    Did you want to use an Insert / Update instead?
    **THIS IS A SIGNATURE - IT GETS POSTED ON (ALMOST) EVERY POST**
    I'm no expert.
    Take my comments at your own risk.

    PDI user since PDI 3.1
    PDI on Windows 7 & Linux

    Please keep in mind (and this may not apply to this thread):
    No forum member is going to do your work for you. We will help you sort out how to do a specific part of the work, as best we can, in the timelines that our work will allow us.
    Signature Updated: 2014-06-30

  12. #12
    Join Date
    Nov 2013
    Posts
    24

    Default

    Thanks gutlez! The insert/update proved to be the solution.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.