Hitachi Vantara Pentaho Community Forums
Results 1 to 10 of 10

Thread: Problem to use database lookup step before an insert/update step

  1. #1

    Default Problem to use database lookup step before an insert/update step

    i've been experiencing a problem to use a lookup step before an insert/update step.
    I have rows with the following fields: id; name; log. and I use the field name to search for it in the database, and i use the database lookup for it. If it exists, the lookup step returns to the record's id. Otherwise, i change the flow to an add sequence step to generate an id and then i insert the row in database. But imagine this situation: a row doesn't exist in the database, then the lookup step doesn't find it and the row must to go to the insert/update step. But before the step inserts the row (maybe by a thread problem), other row (equal to the first one, with the same name) comes to the database lookup step and it does that just like in the first time (change the flow to an add sequence step to generate an id and later insert the row in the database), but the insert/update step doesn't insert the same row twice, 'cause i use the field name as key.
    Now i need to insert the row into the child table, but i have a big problem: it exists two rows in the flow with diferent ids and in the moment i attempt to insert the row in the child table i experience a problem of parent key not found, cause tthe lookup step shouldn't deal with the second row as if it doesn't exist in the database, 'cause it is equal to first, and it should return to the first row's id generated.

    Matt advised me to me to use combination lookup insert/update step, 'cause i always insert into the master table. he is right, i always insert, but it doesn't mean i will insert all rows. Then i cannot use this step 'cause i generate a field log and it cannot be used as a field.

    Does anybody know how to help me?

    regards

  2. #2
    Join Date
    May 2006
    Posts
    4,882

    Default

    2 possible solutions I see:
    - Use unique connection ... this will make it work but it will be a bit slower
    - split your transformation in 2: 1 to make the id's, another one to fill the information.

    Regards,
    Sven

  3. #3

    Default

    2 possible solutions I see:
    - Use unique connection ... this will make it work but it will be a bit slower

    I'm using kettle 2.5.2 and i don't know it. is there this option in this kettle's version? I don't know how it works too.

    - split your transformation in 2: 1 to make the id's, another one to fill the information.

    well, i'm reading the data from an excel sheet. i think i could fill the parent table first with one transformation and fill the child table with other, but it will be very slow.

    Regards,
    Diego

  4. #4
    Join Date
    May 2006
    Posts
    4,882

    Default

    Well 2.5.2 is almost in palliative care (read: "please upgrade") ... unique connections started in v3 I think. Unique connections make all steps use 1 connection, in most database this has the side effect that all steps can see each other's changes.

    For the split of transformation: choose ... working and slow, or not possible... If you're reading directly from an excel sheet in PDI you can only have 65K rows input... peanuts... what's the difference between 30 seconds of processing or 60 seconds of processing.

    Regards,
    Sven

  5. #5

    Default

    I cannot upgrade 'cause our client uses java 1.4

    i think i would need to read the sheet twice, 'cause i do more things in this transformation, i insert into others tables...

    i'm using this transformation on a project involving a big client and we do more complex things that torn the process slow, i would need to do another database lookup to get the id's... then i wouldn't like do it

    is there another solution?

    I think that something like a blocking step that pass all rows would be perfect. I would put it after i've just looked for in the database with database lookup step.

  6. #6
    Join Date
    May 2006
    Posts
    4,882

    Default

    JDK 1.4 is almost also in palliative care (end of 2008 I think).

    And a blocking step passing all rows exists... just not in 2.5.2 AFAIK.

    Regards,
    Sven

  7. #7

    Default

    you are right, the blocking step in kettle 2.5.2 doesn't pass all rows... =(
    do you know how to do a plugin that be able to do it? is this hard?

    well....is there another solution for my problem?

    thank you very much sboden =)
    Last edited by diegomrsantos; 08-31-2008 at 06:12 PM.

  8. #8
    Join Date
    May 2006
    Posts
    4,882

    Default

    No other solution that I can think up now... but it's getting late here

    What does work is split in 2... it would not be slower than a blocking step passing rows (if you can't get all rows in memory).

    Regards.
    Sven

  9. #9

    Default

    I've found the use unique conections option in kettle 2.5.2. This feature is found in the "Transformation Settings" dialog (Misc tab).
    But it didn't resolve my problem


    See more about this option in http://wiki.pentaho.com/display/COM/...ing+check-list

  10. #10

    Default

    I use a Database Lookup step to search for a record in the database; if it does exist, there is no problem and the record's ID is returned. Otherwise, I change the flow to a Add Sequecence step and an ID is generated. After that, I use a field of the input row as a key to look up in an Insert/Update step. The problem, as i said, is that a row(equal to the previous) comes to the Database Lookup step before the first row has been inserted, and this row (the second one) is not inserted, but I have two rows with differet IDs. When I attempt to insert the second row (containing an ID that doesn't exist) into a child table, I experience a parent key error. To solve this problem, you can add another Database Lookup step to look up the rows in the database and set the correct ID jusr right before you insert into the child table.


Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.