Hitachi Vantara Pentaho Community Forums

Thread: Combination lookup/update

  1. #1

    Combination lookup/update

    I am confused. I read the documentation for this step and I _assume_ it is the right one for what I want to do:
    1) Look up the key in the target table (in my case customer_id, value_date)
    2) If it does not exist, insert it with all dimensional information (about 50 columns or so) and create a surrogate key
    3) If it exists, update the dimension columns if anything has changed
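
The three steps above can be sketched outside PDI as a plain lookup, insert, or update against the target table. This is only a minimal sketch using Python's built-in `sqlite3` module; the table `customer_snapshot`, the columns `segment` and `city` (standing in for the 50 or so dimensional columns), and the function names are assumptions for illustration, not anything PDI generates.

```python
import sqlite3


def create_table(conn):
    # Hypothetical target table: surrogate key plus the natural key
    # (customer_id, value_date) and two sample attribute columns.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS customer_snapshot (
            snapshot_sk INTEGER PRIMARY KEY AUTOINCREMENT,
            customer_id INTEGER NOT NULL,
            value_date  TEXT NOT NULL,
            segment     TEXT,
            city        TEXT,
            UNIQUE (customer_id, value_date)
        )
        """
    )
    conn.commit()


def upsert_snapshot(conn, customer_id, value_date, segment, city):
    """Return the surrogate key, inserting or updating the row as needed."""
    # 1) Look up the natural key in the target table.
    row = conn.execute(
        "SELECT snapshot_sk, segment, city FROM customer_snapshot "
        "WHERE customer_id = ? AND value_date = ?",
        (customer_id, value_date),
    ).fetchone()

    # 2) Not found: insert the row; the database assigns the surrogate key.
    if row is None:
        cur = conn.execute(
            "INSERT INTO customer_snapshot "
            "(customer_id, value_date, segment, city) VALUES (?, ?, ?, ?)",
            (customer_id, value_date, segment, city),
        )
        conn.commit()
        return cur.lastrowid

    # 3) Found: update the attribute columns only if something changed.
    snapshot_sk, old_segment, old_city = row
    if (old_segment, old_city) != (segment, city):
        conn.execute(
            "UPDATE customer_snapshot SET segment = ?, city = ? "
            "WHERE snapshot_sk = ?",
            (segment, city, snapshot_sk),
        )
        conn.commit()
    return snapshot_sk
```

Calling `upsert_snapshot` twice with the same `customer_id` and `value_date` returns the same surrogate key; the second call only rewrites the attribute columns if they differ.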

    OK, I missed that I was supposed to add an Update step after the Combination lookup/update step.

    I can see that the surrogate key is generated and the fields are updated in the subsequent Update step, BUT I was forced to put a Blocking step before the Update because otherwise I got a lookup failure error.

    Is this a bug?
    Last edited by JoyDivision; 06-28-2013 at 08:49 AM. Reason: more info

  2. #2

    Combination Lookup/Update is for *UNVERSIONED* data.
    It allows you to convert an unchanging combination of column values into a single value (think foreign key).
    http://wiki.pentaho.com/display/EAI/...+lookup-update

    Dimension Lookup / Update is for CHANGING data.
    It allows you to convert data that varies over time into a lookup value.
    http://wiki.pentaho.com/display/EAI/...+Lookup-Update

    It's not clear from your description which one you need, though I think you actually want Dimension Lookup / Update.
    I have used Dimension Lookup / Update in the past to track status values on Customer Cases across time.
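
To make the idea of changing data concrete, here is a minimal in-memory sketch of a versioned lookup, loosely modeled on the case-status example. It is not PDI's implementation; `VersionedDimension`, its method names, and the far-future end date are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical "open-ended" end date marking the current version of a row.
FAR_FUTURE = date(9999, 12, 31)


@dataclass
class Version:
    sk: int             # surrogate key of this version
    natural_key: str    # business key, e.g. a case number
    status: str         # the attribute that changes over time
    valid_from: date
    valid_to: date


class VersionedDimension:
    def __init__(self):
        self.rows = []

    def _current(self, natural_key):
        for row in self.rows:
            if row.natural_key == natural_key and row.valid_to == FAR_FUTURE:
                return row
        return None

    def lookup_update(self, natural_key, status, as_of):
        """Return the surrogate key valid from as_of, adding a version on change."""
        current = self._current(natural_key)
        if current is not None and current.status == status:
            return current.sk           # nothing changed: reuse the version
        if current is not None:
            current.valid_to = as_of    # close the old version
        new = Version(len(self.rows) + 1, natural_key, status, as_of, FAR_FUTURE)
        self.rows.append(new)
        return new.sk

    def lookup(self, natural_key, on):
        """Return the surrogate key of the version valid on the given date."""
        for row in self.rows:
            if row.natural_key == natural_key and row.valid_from <= on < row.valid_to:
                return row.sk
        return None
```

A status change produces a new surrogate key, while a lookup for an earlier date still resolves to the older version. An unversioned combination lookup, by contrast, has no date range at all.
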
    **THIS IS A SIGNATURE - IT GETS POSTED ON (ALMOST) EVERY POST**
    I'm no expert.
    Take my comments at your own risk.

    PDI user since PDI 3.1
    PDI on Windows 7 & Linux

    Please keep in mind (and this may not apply to this thread):
    No forum member is going to do your work for you. We will help you sort out how to do a specific part of the work, as best we can, in the timelines that our work will allow us.
    Signature Updated: 2014-06-30

  3. #3

    I am pretty sure I need Combination Lookup/Update, since I intend to create snapshots of customer data depending on whether anything relevant happened on that date for a particular customer. So I don't need any versions.

    My problem is why I am forced to use a Blocking step: if I do not, I end up getting a lookup failure error.

    I took this pattern from the PDI 4 Cookbook by Pulvirenti and Roldan, btw.

  4. #4

    Quote Originally Posted by JoyDivision
    I am pretty sure I need Combination Lookup/Update, since I intend to create snapshots of customer data depending on whether anything relevant happened on that date for a particular customer. So I don't need any versions.

    My problem is why I am forced to use a Blocking step: if I do not, I end up getting a lookup failure error.

    I'm pretty sure you need Dimension Lookup/Update if you have 50 or so columns of dimensional information. Note that using versions is optional with that step. Combination Lookup/Update is typically used to create a "junk" dimension, e.g., a collection of flags and/or indicators that do not individually belong in their own "normal" dimension. See this post http://queforum.com/data-warehouse/3...e-example.html for an explanation and example.
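
As a rough illustration of the "junk" dimension idea, the sketch below maps each distinct combination of flag values to one surrogate key and reuses that key whenever the same combination appears again. It is an in-memory stand-in, not how PDI stores the data; the class and method names are hypothetical.

```python
class CombinationLookup:
    """Maps each distinct combination of values to a single surrogate key."""

    def __init__(self):
        self._keys = {}

    def lookup_or_insert(self, *values):
        combo = tuple(values)
        sk = self._keys.get(combo)
        if sk is None:
            # First time this combination is seen: assign the next key.
            sk = len(self._keys) + 1
            self._keys[combo] = sk
        return sk
```

Each fact row then carries only the one surrogate key instead of every flag column. There is nothing to update here: a different combination is simply a different row in the junk dimension.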

    Also, I've done a lot of dimensional modeling, both with slowly changing "normal" dimensions and "junk" dimensions, and have never needed a Blocking step. However, if you are using an Update step after your Combination Lookup/Update, you probably will need one. One option might be to set the "Commit size" to 1 in Combination Lookup/Update. That *may* flush the row to the database before the Update step tries to find it. Using Dimension Lookup/Update avoids this problem.
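
The commit-timing effect can be reproduced outside PDI. The sketch below is an analogy only: it uses Python's `sqlite3` module with two separate connections, on the assumption that the inserting step and the updating step do not share a transaction. The table `dim`, the function name, and the value 42 are made up for the example.

```python
import os
import sqlite3
import tempfile


def visible_before_and_after_commit():
    """Return (found_before_commit, found_after_commit) for a row that one
    connection inserts and a second connection tries to look up."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        writer = sqlite3.connect(path)  # stands in for the step that inserts
        reader = sqlite3.connect(path)  # stands in for the step that looks up
        try:
            writer.execute(
                "CREATE TABLE dim (sk INTEGER PRIMARY KEY, customer_id INTEGER)"
            )
            writer.commit()

            # The INSERT opens a transaction on the writer; nothing is
            # committed yet, so other connections cannot see the row.
            writer.execute("INSERT INTO dim (customer_id) VALUES (42)")

            query = "SELECT sk FROM dim WHERE customer_id = 42"
            before = reader.execute(query).fetchall()

            writer.commit()  # comparable to a commit size of 1
            after = reader.execute(query).fetchall()
            return bool(before), bool(after)
        finally:
            reader.close()
            writer.close()
```

Here the lookup on the second connection finds nothing until the first connection commits. A Blocking step works around the same problem differently, by holding rows back until the upstream step has finished.
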
    Last edited by darrell.nelson; 07-02-2013 at 12:41 PM.
    pdi-ce-4.4.0-stable
    Java 1.7 (64 bit)
    MySQL 5.6 (64 bit)
    Windows 7 (64 bit)

  5. #5

    Quote Originally Posted by JoyDivision
    since I intend to create snapshots of customer data depending on whether anything relevant happened on that date for a particular customer.

    This piece alone tells me that you should be using the Dimension Lookup/Update step rather than Combination Lookup/Update.
