Hitachi Vantara Pentaho Community Forums

Thread: Kitchen Hanging

  1. #1

    Kitchen Hanging

    I have a process I made in Chef, where I download some files via FTP, then kick off a transformation.

    This transformation reads from a text file, then filters rows, sending some to an Insert/Update step and some to a Dummy step.

    I just ran my process through Kitchen and I appear to have hung it. There was an SQL exception (my fault, which I will fix) that caused the Insert/Update step to end. My filter step appears to be waiting while trying to put rows to that step, which has already finished.

    Here is a thread dump of my hanging Kitchen process. The "Filter rows" and "Dummy" threads are still active, but the "Insert/Update" thread is nowhere to be found. I am guessing that when there is an SQL exception the Insert/Update step finishes, leaving its queue of rows to fill up and pause the previous steps. I would guess that the Insert/Update step should either stop processing rows on the first exception but still read them in, or allow each row that comes in to execute no matter what.

    I would provide a process, but I am not sure how to do that in a way you could run, since my Insert/Update step depends on my database connection.



    Full thread dump Java HotSpot(TM) Server VM (1.5.0_05-b05 mixed mode):

    "Dummy (do nothing)" prio=1 tid=0x081e6c78 nid=0x5a6a sleeping[0x65dbf000..0x65dc0030]
        at java.lang.Thread.sleep(Native Method)
        at java.lang.Thread.sleep(Thread.java:276)
        at be.ibridge.kettle.trans.step.BaseStep.getRow(BaseStep.java:1009)
        - locked <0x7b6f1000> (a be.ibridge.kettle.trans.step.dummytrans.DummyTrans)
        at be.ibridge.kettle.trans.step.dummytrans.DummyTrans.processRow(DummyTrans.java:52)
        at be.ibridge.kettle.trans.step.dummytrans.DummyTrans.run(DummyTrans.java:87)

    "Filter rows" prio=1 tid=0x0862b038 nid=0x5a68 sleeping[0x670fc000..0x670fd130]
        at java.lang.Thread.sleep(Native Method)
        at java.lang.Thread.sleep(Thread.java:276)
        at be.ibridge.kettle.trans.step.BaseStep.putRowTo(BaseStep.java:930)
        - locked <0x7b6f2000> (a be.ibridge.kettle.trans.step.filterrows.FilterRows)
        at be.ibridge.kettle.trans.step.BaseStep.putRowTo(BaseStep.java:890)
        - locked <0x7b6f2000> (a be.ibridge.kettle.trans.step.filterrows.FilterRows)
        at be.ibridge.kettle.trans.step.filterrows.FilterRows.processRow(FilterRows.java:94)
        at be.ibridge.kettle.trans.step.filterrows.FilterRows.run(FilterRows.java:137)

    "Low Memory Detector" daemon prio=1 tid=0x08177700 nid=0x59fa runnable [0x00000000..0x00000000]

    "CompilerThread1" daemon prio=1 tid=0x081762e0 nid=0x59f9 waiting on condition [0x00000000..0x6da103c8]

    "CompilerThread0" daemon prio=1 tid=0x08175248 nid=0x59f8 waiting on condition [0x00000000..0x6da91048]

    "AdapterThread" daemon prio=1 tid=0x081740d0 nid=0x59f7 waiting on condition [0x00000000..0x00000000]

    "Signal Dispatcher" daemon prio=1 tid=0x08173160 nid=0x59f6 waiting on condition [0x00000000..0x00000000]

    "Surrogate Locker Thread (CMS)" daemon prio=1 tid=0x081723b8 nid=0x59f5 waiting on condition [0x00000000..0x00000000]

    "Finalizer" daemon prio=1 tid=0x08168b80 nid=0x59f4 in Object.wait() [0x6de95000..0x6de96030]
        at java.lang.Object.wait(Native Method)
        - waiting on <0x7cb00468> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116)
        - locked <0x7cb00468> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

    "Reference Handler" daemon prio=1 tid=0x081674b8 nid=0x59f3 in Object.wait() [0x6df16000..0x6df170b0]
        at java.lang.Object.wait(Native Method)
        - waiting on <0x7cb05c00> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:474)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
        - locked <0x7cb05c00> (a java.lang.ref.Reference$Lock)

    "main" prio=1 tid=0x0805e4c0 nid=0x59ec waiting on condition [0xbfffc000..0xbfffc748]
        at java.lang.Thread.sleep(Native Method)
        at be.ibridge.kettle.job.entry.trans.JobEntryTrans.execute(JobEntryTrans.java:447)
        at be.ibridge.kettle.job.Job.execute(Job.java:274)
        at be.ibridge.kettle.job.Job.execute(Job.java:327)
        at be.ibridge.kettle.job.Job.execute(Job.java:327)
        at be.ibridge.kettle.job.Job.execute(Job.java:327)
        at be.ibridge.kettle.job.Job.execute(Job.java:202)
        at be.ibridge.kettle.kitchen.Kitchen.main(Kitchen.java:296)

    "VM Thread" prio=1 tid=0x08165040 nid=0x59f2 runnable

    "Gang worker#0 (Parallel GC Threads)" prio=1 tid=0x0806db70 nid=0x59ed runnable

    "Gang worker#1 (Parallel GC Threads)" prio=1 tid=0x0806e7b0 nid=0x59ee runnable

    "Gang worker#2 (Parallel GC Threads)" prio=1 tid=0x0806f3d8 nid=0x59ef runnable

    "Gang worker#3 (Parallel GC Threads)" prio=1 tid=0x08070000 nid=0x59f0 runnable

    "Concurrent Mark-Sweep GC Thread#0" prio=1 tid=0x080eb918 nid=0x59f1 runnable

    "VM Periodic Task Thread" prio=1 tid=0x08178ca8 nid=0x59fb waiting on condition

  2. #2

    RE: Kitchen Hanging

    Bug filed.


    http://javaforge.com/proj/tracker/itemDetails.do?task_id=2343&navigation=true

  3. #3
    Join Date
    Nov 1999
    Posts
    9,729

    RE: Kitchen Hanging

    I rejected the bug; we can't fix anything in the Java Virtual Machine.

    Thank you for understanding!

    Matt

  4. #4

    RE: Kitchen Hanging

    It's not a JVM bug. Those Thread.sleep calls are not a VM thing; they are made by your code.

    See lines 928-931 in BaseStep.java.

    The JVM is doing exactly what you want. The problem is that the consumer of the rows has stopped, so the output rowset for my filter step never empties, and this loop keeps sleeping:

        sleeptime=transMeta.getSleepTimeFull();
        while(rs.isFull() && !stopped)
        {
            try{ if (sleeptime>0) sleep(0, sleeptime); else super.notifyAll(); }

    I would suggest also making sure, in that loop condition, that the next steps are not stopped. If all consumers are stopped, the queue will never empty and that sleep will be called until the process is killed.
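A minimal sketch of the suggestion above, not Kettle's actual code: the producer's wait loop gives up when either the producer or the consumer of the hop is flagged as stopped. The `Hop` class, its `consumerStopped` flag and the `putRow` helper are hypothetical names introduced for illustration, and a `java.util.concurrent` bounded queue stands in for the rowset.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class FullHopWait {
    /** Hypothetical stand-in for the rowset between two steps; not Kettle's own class. */
    public static final class Hop {
        final BlockingQueue<String> rows;
        public final AtomicBoolean consumerStopped = new AtomicBoolean(false);

        public Hop(int capacity) {
            rows = new ArrayBlockingQueue<>(capacity);
        }
    }

    /** Tries to hand a row over; gives up once the producer or the consumer is stopped. */
    public static boolean putRow(Hop hop, String row, AtomicBoolean producerStopped) {
        try {
            while (!producerStopped.get() && !hop.consumerStopped.get()) {
                // Wait briefly for space, then re-check both stop flags.
                if (hop.rows.offer(row, 1, TimeUnit.MILLISECONDS)) {
                    return true;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        Hop hop = new Hop(1);
        AtomicBoolean producerStopped = new AtomicBoolean(false);
        System.out.println(putRow(hop, "row 1", producerStopped)); // the hop has room

        // Simulate the consumer dying about 50 ms from now while the hop is full.
        Thread failingConsumer = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException ignored) {
                // nothing to clean up in this demo
            }
            hop.consumerStopped.set(true);
        });
        failingConsumer.start();
        System.out.println(putRow(hop, "row 2", producerStopped)); // returns instead of hanging
        failingConsumer.join();
    }
}
```

The extra check costs one flag read per wait iteration, so it only matters while the hop is already full.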

  5. #5
    Join Date
    Nov 1999
    Posts
    9,729

    RE: Kitchen Hanging

    OK, but I disagree on the suggestion.
    When an error occurs, Trans.waitUntilFinished() stops all other running processes.

    Matt

  6. #6

    RE: Kitchen Hanging

    Right, what I am saying is that not all steps complete. So waitUntilFinished() turns into waitTillTheEndOfTime().

    The problem is within the transformation: consumers of rows go away, so the hops fill up and the producers for those hops wait and wait, since they can't put any more rows on the hop until some are consumed.

    The Chef job is running correctly. I would expect it to wait as it does; the problem is more that not all transforms are finishing. I will try to create a process to show what I am talking about.

  7. #7
    Join Date
    Nov 1999
    Posts
    9,729

    RE: Kitchen Hanging

    That can only happen if a step fails, for some reason, to set stopped=true.
    That's the reason we put the try/catch blocks in the run() methods of the steps. Even if you have some one-off exception, it should stop the step.

    If a step is not stopped and doesn't consume any rows from the rowset, it SHOULD wait until the end of time.

    Matt

  8. #8

    RE: Kitchen Hanging

    I think I must be explaining myself pretty poorly here. Let me try again.

    I have two steps: step "P" produces rows, step "C" consumes rows.

    The process rolls along for a while: P produces rows, and C consumes them at a slower rate. At some point the rowset becomes full, so P has to sleep rather than just add to the rowset.

    The condition for the loop is: while the queue is full and P is not stopped, sleep, then check again.

    Now at some point C has an exception. For this example take the Insert/Update step (though all steps seem to have similar logic). An exception is thrown in the processRow method and caught in the run method. The catch and finally blocks mark step C as stopped, the run method ends, and the thread that runs step C stops.

    Now P is still in its continuous loop of sleeping. Nothing has notified it that there was a downstream problem with a consumer of the rows it is producing. The rowset between P and C is full and never empties, so P keeps sleeping.

    To me, either C needs to notify P that it is down and remove its rowset from P as something that should be published to, or P needs to check that the consumers of its rowsets are still up before sleeping. Does that make sense?
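The sequence described above can be reproduced outside Kettle. The following is a rough sketch under stated assumptions: a bounded `ArrayBlockingQueue` plays the rowset, the thread names P and C follow the post, and `producerStalls`, `producerStopped` and `sleepQuietly` are hypothetical names invented for the demo. C consumes three rows, fails, and ends; P fills the queue and keeps sleeping until something outside sets its stopped flag.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class StalledProducerDemo {
    // Plays the role of the "stopped" flag that P checks in its wait loop.
    static volatile boolean producerStopped;

    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Runs P and C once; returns true if P was still waiting after C had died. */
    public static boolean producerStalls() {
        producerStopped = false;
        BlockingQueue<Integer> hop = new ArrayBlockingQueue<>(5);

        Thread p = new Thread(() -> {
            for (int row = 0; row < 100 && !producerStopped; row++) {
                while (hop.remainingCapacity() == 0 && !producerStopped) {
                    sleepQuietly(1); // hop is full: sleep, then check again
                }
                if (!producerStopped) {
                    hop.offer(row);
                }
            }
        }, "P");

        Thread c = new Thread(() -> {
            try {
                for (int i = 0; i < 3; i++) {
                    hop.take();
                }
                throw new IllegalStateException("simulated SQL error");
            } catch (Exception e) {
                // C's thread ends here; nothing notifies P.
            }
        }, "C");

        p.start();
        c.start();
        try {
            c.join();
            p.join(200); // ample time for 100 rows if anyone were still consuming
            boolean stalled = p.isAlive();
            producerStopped = true; // what an outside monitor would have to do
            p.join();
            return stalled;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            producerStopped = true;
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("P stalled: " + producerStalls());
    }
}
```

In this sketch P only ends because the demo itself flips `producerStopped`; without that line the program would never exit, which is the hang being reported.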

  9. #9
    Join Date
    Nov 1999
    Posts
    9,729

    RE: Kitchen Hanging

    Yes, that's how I understood it.
    Now, the thing that happens is that waitUntilFinished() checks whether any step is stopped.
    So in your case, when "C" stops, waitUntilFinished() "sees" this and also sets "P" as stopped, effectively ending the sleep loop.
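As an illustration of the behaviour described here, and only as an assumption about its shape rather than the actual body of Trans.waitUntilFinished(), one monitoring pass could look like this. The `Step` class and `stopAllIfAnyStopped` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

public class StopAllMonitor {
    /** Hypothetical stand-in for a running step; it carries only the "stopped" flag. */
    public static final class Step {
        public final String name;
        public volatile boolean stopped;

        public Step(String name) {
            this.name = name;
        }
    }

    /** One monitoring pass: if any step is stopped, stop all of them. Returns true if it did. */
    public static boolean stopAllIfAnyStopped(List<Step> steps) {
        boolean anyStopped = false;
        for (Step s : steps) {
            if (s.stopped) {
                anyStopped = true;
                break;
            }
        }
        if (anyStopped) {
            for (Step s : steps) {
                s.stopped = true; // ends any "while full and not stopped" wait loop
            }
        }
        return anyStopped;
    }

    public static void main(String[] args) {
        Step p = new Step("P");
        Step c = new Step("C");
        List<Step> steps = Arrays.asList(p, c);
        System.out.println(stopAllIfAnyStopped(steps)); // nothing has stopped yet
        c.stopped = true;                               // C hits an exception and stops
        System.out.println(stopAllIfAnyStopped(steps)); // the monitor reacts
        System.out.println("P stopped: " + p.stopped);
    }
}
```

A pass like this only helps if something runs it while the steps are still executing.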

    All I'm saying is that if you have step plugins, make sure to catch exceptions and set that "stopped" flag if anything goes wrong.
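A rough sketch of that advice for a step written as a plain `Runnable`. This is not the Kettle step API: `GuardedStep` and the `RowWork` hook are hypothetical, and only the pattern (catch in run(), set the flag on every exit path) comes from the thread.

```java
public class GuardedStep implements Runnable {
    /** Hypothetical per-row hook: returns false when there are no more rows. */
    public interface RowWork {
        boolean processRow() throws Exception;
    }

    private final RowWork work;
    private volatile boolean stopped;
    private volatile int errors; // written only by the step's own thread

    public GuardedStep(RowWork work) {
        this.work = work;
    }

    public boolean isStopped() {
        return stopped;
    }

    public int getErrors() {
        return errors;
    }

    @Override
    public void run() {
        try {
            while (!stopped && work.processRow()) {
                // keep processing rows
            }
        } catch (Exception e) {
            errors = errors + 1; // record the failure so a monitor can see it
        } finally {
            stopped = true; // set on every exit path, including unexpected exceptions
        }
    }

    public static void main(String[] args) throws InterruptedException {
        GuardedStep step = new GuardedStep(() -> {
            throw new Exception("simulated SQL error");
        });
        Thread t = new Thread(step, "Insert/Update");
        t.start();
        t.join();
        System.out.println(step.isStopped() + " " + step.getErrors());
    }
}
```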

    If you tried to implement logic in a step that checks whether the previous/next steps are stopped, I guess that would also work (although performance-wise I'm not sure it's the optimal solution).

    But I'm pretty sure it's not the cause of our problem.

    Matt

  10. #10

    RE: Kitchen Hanging

    OK, I am following what you are saying now. It took me a while. Correct me if I am wrong: when you are running in Spoon, there is a thread (launched in SpoonLog.java) that wakes up, checks whether any steps have stopped with errors and, if so, kills all other steps.

    In Chef, it looks like you are trying to do the same thing with waitUntilFinished(). It looks like that method is not getting called, and the part of the app that is stuck in an infinite loop is actually line 447 of JobEntryTrans.java. It checks whether the transformation is finished or the parent job is stopped; if not, it sleeps and checks again. I have a breakpoint set in waitUntilFinished() and it is never hit. It looks like that method is only called after the loop I am stuck in.

    Maybe line 447 needs to be changed to

    while (!trans.isFinished() && !parentJob.isStopped() && trans.getErrors()== 0)

    and then line 453 should be changed to

    if (parentJob.isStopped() || trans.getErrors() != 0)

    Does that make sense?

    I made the change locally and it fixes my problem. Do you agree this change makes sense? If so, I can check it in.
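To show how the two proposed conditions fit together, here is a self-contained sketch. It is not the JobEntryTrans source: `TransView`, `JobView` and `waitForTrans` are hypothetical stand-ins that expose only the three calls quoted in the post (`isFinished()`, `getErrors()`, `isStopped()`).

```java
public class TransPollLoop {
    /** Hypothetical view of a running transformation. */
    public interface TransView {
        boolean isFinished();
        int getErrors();
    }

    /** Hypothetical view of the parent job. */
    public interface JobView {
        boolean isStopped();
    }

    /**
     * Polls until the transformation finishes, the job is stopped, or an error shows up.
     * Returns true when the caller should go on to stop the remaining steps.
     */
    public static boolean waitForTrans(TransView trans, JobView parentJob, long pollMillis) {
        while (!trans.isFinished() && !parentJob.isStopped() && trans.getErrors() == 0) {
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return true;
            }
        }
        return parentJob.isStopped() || trans.getErrors() != 0;
    }

    public static void main(String[] args) {
        final long start = System.currentTimeMillis();
        // A transformation that never finishes but reports an error after about 50 ms.
        TransView hung = new TransView() {
            public boolean isFinished() {
                return false;
            }

            public int getErrors() {
                return System.currentTimeMillis() - start > 50 ? 1 : 0;
            }
        };
        JobView job = () -> false;
        System.out.println("must stop steps: " + waitForTrans(hung, job, 10));
    }
}
```

Without the `getErrors()` term in the loop condition, the `hung` transformation in `main` would keep this loop sleeping indefinitely.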
