Hitachi Vantara Pentaho Community Forums

Thread: Lock algorithm in 3.2.x logging tables

  1. #1
    Benjamin Kallmann Guest

    Lock algorithm in 3.2.x logging tables

    Hi everybody,

    I'm working with the PDI 3.2.x API and I have some questions regarding the locking mechanism of the PDI log tables.

    I already started a discussion in the forum (http://forums.pentaho.org/showthread.php?t=76989) but got no response so far. Since this is more of a developer question, maybe this is a more suitable group to discuss it.

    I'm working on a small API to launch transformations, and I'm running into problems with unique batch IDs. Here's my original question:

    **************************************************************************
    I'm using PDI 3.2.5, and I noticed that PDI tries to lock the transformation logging table by inserting a new record with the batch ID -1 (the same approach is also used for jobs).

    What's the idea behind this? It seems that the lock is only taken at the beginning, before creating a new logging entry, and released immediately after the entry is created. It looks to me like PDI tries to avoid race conditions on the logging table when two or more transformations run in parallel. But that only works if there is a unique constraint on the table, doesn't it? And if you put such a constraint on the table, you get a very user-unfriendly error message from the database that the constraint was violated...

    Furthermore, the unlocking (deleting the record with batch ID -1) is done outside the finally block, which means the lock never gets removed if something goes wrong in between...

    Am I missing something here?
    **************************************************************************
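    The pattern described above can be sketched roughly like this (an illustrative sketch in Python with SQLite, not PDI's actual code; the table and column names TRANS_LOG, ID_BATCH, and TRANSNAME are assumptions):

```python
import sqlite3

# Illustrative sketch of the "-1 sentinel" locking pattern described above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TRANS_LOG (ID_BATCH INTEGER, TRANSNAME TEXT)")

def start_log_entry(conn, transname):
    cur = conn.cursor()
    # 1. "Lock": insert the sentinel row with batch ID -1.
    cur.execute("INSERT INTO TRANS_LOG (ID_BATCH, TRANSNAME) VALUES (-1, '')")
    # 2. Compute the next batch ID while the sentinel is in place.
    cur.execute(
        "SELECT COALESCE(MAX(ID_BATCH), 0) + 1 FROM TRANS_LOG WHERE ID_BATCH >= 0"
    )
    batch_id = cur.fetchone()[0]
    # 3. Write the real logging record.
    cur.execute(
        "INSERT INTO TRANS_LOG (ID_BATCH, TRANSNAME) VALUES (?, ?)",
        (batch_id, transname),
    )
    # 4. "Unlock": delete the sentinel. If an exception is raised between
    #    steps 1 and 4 and this delete is not in a finally block, the -1
    #    row is left behind.
    cur.execute("DELETE FROM TRANS_LOG WHERE ID_BATCH = -1")
    conn.commit()
    return batch_id

print(start_log_entry(conn, "trans_a"))  # 1
print(start_log_entry(conn, "trans_b"))  # 2
```

    Without a unique constraint on ID_BATCH, two concurrent writers could both insert the sentinel and compute the same next ID, which is exactly the race condition the question raises.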


    Any help regarding my questions is highly appreciated.


    Thanks in advance,
    Ben

  2. #2
    Matt Casters Guest

    Re: Lock algorithm in 3.2.x logging tables

    A whole day without a reply, must be some kind of record :-)

    Anyway, I really wish we didn't have to jump through these hoops, but unfortunately the locking mechanism varies from RDBMS to RDBMS, to the point where it simply doesn't work on certain databases.
    We try to make sure we create a unique ID on as many databases as possible, but even such a simple thing is far from easy to do.

    Case in point: on certain databases you only get a transaction going if you actually do something with a table, and you can only lock the table if you insert something.

    I think the only option is that if you have a problem, you mention the database used, and then we'll see what, if anything, is needed.
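    To illustrate how lock acquisition might have to branch per database (a hypothetical sketch, not PDI's actual API; the helper name, dialect strings, and SQL are assumptions, with SQLite standing in as a runnable example):

```python
import sqlite3

def acquire_log_lock(conn, dialect):
    """Hypothetical per-RDBMS lock strategy; not PDI code."""
    if dialect == "sqlite":
        # SQLite has no table-level LOCK statement; BEGIN IMMEDIATE takes
        # a write lock on the whole database file up front.
        conn.execute("BEGIN IMMEDIATE")
    elif dialect == "postgresql":
        # Databases with explicit table locks can lock directly
        # (inside an open transaction).
        conn.execute("LOCK TABLE TRANS_LOG IN EXCLUSIVE MODE")
    else:
        # Fallback in the spirit of the -1 record: some databases only
        # start a transaction / take a lock once you actually write.
        conn.execute("INSERT INTO TRANS_LOG (ID_BATCH) VALUES (-1)")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TRANS_LOG (ID_BATCH INTEGER)")
conn.commit()
acquire_log_lock(conn, "sqlite")
print(conn.in_transaction)  # True
conn.rollback()
```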

    Matt
    --
    Matt Casters
    Pentaho - Chief Data Integration
    mcasters (AT) pentaho (DOT) org - http://www.pentaho.org


    --
    You received this message because you are subscribed to the Google Groups "kettle-developers" group.
    To post to this group, send email to kettle-developers (AT) googlegroups (DOT) com.
    To unsubscribe from this group, send email to kettle-developers+unsubscribe (AT) googlegroups (DOT) com.
    For more options, visit this group at http://groups.google.com/group/kettle-developers?hl=en.

  3. #3
    Benjamin Kallmann Guest

    Re: Lock algorithm in 3.2.x logging tables

    Hi Matt,

    the support in the forum and the response times are excellent; there's no need to complain about that. All I wanted to say was that I put my question in the wrong forum/group...

    Thanks for clarifying the use of the -1 record. I ran into issues with Oracle when I put a unique index on the batch ID column. I cannot reproduce the issue reliably; it happens only very rarely. It looks like when something goes wrong during the initialization of the logging and an exception is raised, the -1 record is never removed, and the next launch of any transformation fails because of a violation of the unique index.

    I'm trying to find a reproducible test case and will file a JIRA case afterwards.
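    For what it's worth, moving the unlock into a finally-style cleanup would close that window. A minimal sketch of such a fix (hypothetical, in Python/SQLite, not a PDI patch; table and column names are assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A unique index on ID_BATCH, as in the Oracle setup above: a leaked -1
# row would make every later lock attempt fail.
conn.execute("CREATE TABLE TRANS_LOG (ID_BATCH INTEGER UNIQUE, TRANSNAME TEXT)")

def start_log_entry_safe(conn, transname):
    """Sentinel lock with guaranteed cleanup (hypothetical fix, not PDI code)."""
    cur = conn.cursor()
    # "Lock": insert the sentinel row.
    cur.execute("INSERT INTO TRANS_LOG (ID_BATCH, TRANSNAME) VALUES (-1, '')")
    try:
        cur.execute(
            "SELECT COALESCE(MAX(ID_BATCH), 0) + 1 FROM TRANS_LOG WHERE ID_BATCH >= 0"
        )
        batch_id = cur.fetchone()[0]
        cur.execute(
            "INSERT INTO TRANS_LOG (ID_BATCH, TRANSNAME) VALUES (?, ?)",
            (batch_id, transname),
        )
        return batch_id
    finally:
        # Runs even when the statements above raise, so the -1 row can no
        # longer be left behind to trip the unique index.
        cur.execute("DELETE FROM TRANS_LOG WHERE ID_BATCH = -1")
        conn.commit()
```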


    Thanks,
    Ben






Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.