US and Worldwide: +1 (866) 660-7555
Results 1 to 10 of 10

Thread: PDI 4.1.0 Stable - Not Releasing MSSQL Locks

  1. #1

    Default PDI 4.1.0 Stable - Not Releasing MSSQL Locks

    Hi

    We are currently using PDI to capture data changes in one system and send them to another.

    The system it is getting data from is a MSSQL 2005 Instance and this is where the locks are occurring.

    We use a MYSQL Repository to host the transformations. The jobs seem to run fine but then hold onto the locks on MSSQL which takes the user application down.

    Is there anything I can do to prevent this. It is causing massive disruption.
    I'm not sure where to look to debug this.

    Thanks

    Bill

  2. #2
    Join Date
    Nov 2011
    Posts
    10

    Default

    Did you get a resolution on this issue? We're running PDI 4.2.1-stable with MySQL for both repository and data source and are also seeing spoon not releasing its locks.

  3. #3
    Join Date
    Nov 1999
    Posts
    9,534

    Default

    It depends very much on what is meant by locking. If you have a long running query in a transformation you automatically create a read-lock on a table.
    Depending on the situation this could cause problems when there are a lot of transactions taking place on it.
    It gets even worse if you are updating tables.
    Other than that, the repository is only locked when it is updated.
    Matt Casters, Chief Data Integration
    Pentaho, Open Source Business Intelligence
    http://www.pentaho.org -- mcasters@pentaho.org

    Author of the book Pentaho Kettle Solutions by Wiley. Also available as e-Book and on the Kindle reading applications (iPhone, iPad, Android, Kindle devices, ...)

    Join us on IRC server Freenode.net, channel ##pentaho

  4. #4

    Default

    Our main problem was with the live application still inputting rows into tables that we were capturing changes on.

    We never resolved this totally but improved it drastically by adding indexes to the fields that we were searching on within the transformations.

    I.e. we store primary ID's and a timestamp for the rows that have been changed.
    Which we then query on a schedule to bring back changes made to the database.

    After putting indexes against the timestamp column it improved our transformations speed and therefore reduced the locks that are occuring.

  5. #5
    Join Date
    Nov 1999
    Posts
    9,534

    Default

    I have some experience with these things. Kettle actually was first deployed against a 24/7 system.
    The trick IMO is to chop up the work in small pieces.

    Kettle has implemented various things from that period, more to the point there a re few ways to do micro-batching.
    If you set up a logging table for your transformation you can obtain a start and end date for your batch. (Get System Info step)
    That date range can be too large for certain online-system because it keeps a read lock.

    So I created a way to set a maximum time period in seconds in the transformation settings dialog, the "Dates" tab.
    The "Max. date difference (seconds)" option allows you to make the time period smaller specifically to lower the load on source systems.

    So if you have 24h to process at night you for example set a value of 3600 seconds for the max date difference.
    That way only one hour is processed. If that takes for example 5 minutes to process you run every 10 minutes and after a few hours you're back up to date.
    Even better, simply always run every 5 minutes and process 15 minutes worth of data and get a very stable operation throughout the day, except for the peak hours.

    The trick is I think to get a predictable load on both the data warehouse and the source systems.

    Anyway, fond memories and all that.

    Matt
    Matt Casters, Chief Data Integration
    Pentaho, Open Source Business Intelligence
    http://www.pentaho.org -- mcasters@pentaho.org

    Author of the book Pentaho Kettle Solutions by Wiley. Also available as e-Book and on the Kindle reading applications (iPhone, iPad, Android, Kindle devices, ...)

    Join us on IRC server Freenode.net, channel ##pentaho

  6. #6
    Join Date
    Nov 2011
    Posts
    10

    Default

    The problem we're seeing is that when 2-3 developers are all using spoon against the same repository and one saves a transformation, then the others are locked out. This is part of the output from SHOW PROCESSLIST

    651, spoon, 172.17.2.154:2526, spoon_repo, Sleep, 402, ,
    685, spoon, 10.20.1.122:22828, spoon_repo, Query, 402, Waiting for table metadata lock, LOCK TABLES R_CLUSTER WRITE, R_CLUSTER_SLAVE WRITE, R_CONDITION WRITE, R_DATABASE WRITE, R_DATABASE_
    693, spoon, 10.3.1.192:57477, spoon_repo, Execute, 198, Waiting for table metadata lock, SELECT VALUE_STR, VALUE_NUM FROM R_JOB_ATTRIBUTE WHERE ID_JOB = 60 AND CODE = 'LOG_SIZE_LIMIT' AND
    696, spoon, 10.3.1.192:45355, spoon_repo, Execute, 39, Waiting for table metadata lock, SELECT ID_JOB FROM R_JOB WHERE NAME = 'job_inf1123_bosci_visitsovertime' AND ID_DIRECTORY = 15

    The problem is not completely reproducible but occurs frequently. The only solution is to kill connection 651.

    --
    PDI 4.2.0-STABLE
    MySQL 5.5.8
    Spoon on Windows 7 Enterprise
    MySQL on openSUSE 11.3 (x86_64)
    Last edited by doneal24; 11-09-2011 at 11:16 AM.

  7. #7
    Join Date
    Nov 1999
    Posts
    9,534

    Default

    Spoon simply opens a connection, doesn't really lock anything anything and unlocks after saving.
    I wonder if this is another side-effect of the new 5.5 JDBC driver. Could you try downgrading to 3.1 (found in PDI 4.1 or below) and see if you still have the same issue?

    By the way, 2 totally different issues are mixed in this thread.
    Matt Casters, Chief Data Integration
    Pentaho, Open Source Business Intelligence
    http://www.pentaho.org -- mcasters@pentaho.org

    Author of the book Pentaho Kettle Solutions by Wiley. Also available as e-Book and on the Kindle reading applications (iPhone, iPad, Android, Kindle devices, ...)

    Join us on IRC server Freenode.net, channel ##pentaho

  8. #8
    Join Date
    Oct 2011
    Posts
    3

    Default

    was this resolved? I am getting PDI repository on hold due to 'Waiting for table metadata lock'

    the issue happens when a developer runs a transformation and has a error - then updates the transformation to correct the error and then saves it, then everyone is locked out. This is output from SHOW PROCESSLIST

    mysql> show processlist;
    +----+------+--------------------------------------+--------------------+---------+------+---------------------------------+------------------------------------------------------------------------------------------------------+
    | Id | User | Host | db | Command | Time | State | Info |
    +----+------+--------------------------------------+--------------------+---------+------+---------------------------------+------------------------------------------------------------------------------------------------------+
    | 19 | root | b30dsk1mfly.hostname.com:2882 | p1_srvr_repo | Sleep | 1089 | | NULL |
    | 20 | root | localhost | information_schema | Query | 0 | NULL | show processlist |
    | 21 | root | ip addr | p1_srvr_repo | Query | 133 | Waiting for table metadata lock | LOCK TABLES R_CLUSTER WRITE, R_CLUSTER_SLAVE WRITE, R_CONDITION WRITE, R_DATABASE WRITE, R_DATABASE_ |
    +----+------+--------------------------------------+--------------------+---------+------+---------------------------------+------------------------------------------------------------------------------------------------------+
    3 rows in set (0.00 sec)


    The repository frees up when a 'kill 19;' on the cmd in sleep mode.


    As suggested at the end of Pentaho Community Forums > Pentaho Users > Pentaho Data Integration [Kettle] > PDI 4.1.0 Stable - Not Releasing MSSQL Locks
    http://forums.pentaho.com/archive/in.../t-83498.html?

    I changed the mysql driver to 3.1.4 and same problem - tried these drivers and same problem with each one -

    /home/pentaho/mysql-connector-java-5.1.17.jar
    /home/pentaho/mysql-connector-java-5.1.18.jar
    /home/pentaho/mysql-connector-java-3.1.14

    the systems in use are:
    Red Hat Enterprise Linux Server release 5.7 (Tikanga)
    pdi-ce-4.2.1-stable.tar.gz
    mysql 5.5.2
    using my-medium.cnf as my.cnf in /etc

    Did I miss something or should I back down to mysql 5.1 for the repository. I have been using kettle and mysql for over 2 years - nothing like this has ever happened.

  9. #9
    Join Date
    Oct 2011
    Posts
    3

    Default

    I went back to MySQL Server version: 5.1.61 and the issue went away - all is right with the world. I guess I will wait for the next stable release of Mysql and maybe try it again.

  10. #10

    Default

    I have the same metadata table locking issue when using MySql 5.5.16 as the repository:

    1) Create a new job in the DB repository.
    2) Double-click on Start step and set it to repeat daily and save.
    3) From Spoon, run a job remotely on carte and then
    4) From Spoon, try saving the job and notice that Spoon hangs.

    Spoon will hang until the DB connection from carte to the repository is killed.

    It is clear that Carte is holding a transaction open after step 3 if you run the following on MySql :
    "show engine innodb status"

    How unfortunate if the only solution is to downgrade MySql!
    Last edited by goodiegoodman; 05-22-2012 at 08:05 PM.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •