Hitachi Vantara Pentaho Community Forums

Thread: OSGI initialization hangs kitchen on startup


    Greetings,

    We're having issues running Pentaho jobs in production. We recently deployed our first Pentaho job to production and ran it successfully using kitchen. A few days later we tried to run the same job and kitchen hung before even attempting to run the job. The thread dump and logs suggest that it's waiting indefinitely for an OSGI callback (a delayed service notifier) to release the OSGI initialization worker thread ("ExecutorUtil thread 2" below). We've not been able to reproduce this in our lower environments. The only difference between our lower environments and production is the hardware: production has a bit more memory and slightly faster CPUs (but same number of cores).

    To rule out our job, we tried other jobs, including the sample jobs that ship with Pentaho, and the result is always the same. Here's the relevant portion of the thread dump:
    Code:
    "ExecutorUtil thread 2" #20 daemon prio=5 os_prio=0 tid=0x00007f84f132c000 nid=0x19ec waiting on condition [0x00007f852452d000]
       java.lang.Thread.State: WAITING (parking)
            at sun.misc.Unsafe.park(Native Method)
            - parking to wait for  <0x0000000080c06030> (a java.util.concurrent.CountDownLatch$Sync)
            at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
            at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
            at org.pentaho.platform.servicecoordination.impl.BaseCountdownLatchLifecycleManager.notifyListenersAndWait(BaseCountdownLatchLifecycleManager.java:121)
            at org.pentaho.platform.servicecoordination.impl.BaseCountdownLatchLifecycleManager.setPhaseAndWait(BaseCountdownLatchLifecycleManager.java:78)
            - locked <0x0000000080c05ef8> (a org.pentaho.di.osgi.KettlePhaseLifecycleManager)
            at org.pentaho.di.osgi.KettleLifeCycleAdapter.onEnvironmentInit(KettleLifeCycleAdapter.java:20)
            at org.pentaho.di.core.lifecycle.KettleLifecycleSupport.onEnvironmentInit(KettleLifecycleSupport.java:116)
            at org.pentaho.di.core.lifecycle.KettleLifecycleSupport.onEnvironmentInit(KettleLifecycleSupport.java:107)
            at org.pentaho.di.core.KettleEnvironment.initLifecycleListeners(KettleEnvironment.java:157)
            at org.pentaho.di.core.KettleEnvironment.init(KettleEnvironment.java:129)
            at org.pentaho.di.core.KettleEnvironment.init(KettleEnvironment.java:75)
            at org.pentaho.di.kitchen.Kitchen$1$1.call(Kitchen.java:101)
            at org.pentaho.di.kitchen.Kitchen$1$1.call(Kitchen.java:95)
            at java.util.concurrent.FutureTask.run(FutureTask.java:266)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
            at java.lang.Thread.run(Thread.java:745)
    We have found that removing kettle-lifecycle-listeners.xml and kettle-registry-extensions.xml from the classes directories prevents the issue and allows the job to run successfully. This seems acceptable as a short-term workaround since we're not yet using the big data plugins, but we're wondering whether this is a bug. We also can't explain why our job worked the first time we ran it and has hung on every run since.
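
    For anyone wanting to try the same workaround without deleting anything, here is a rough sketch that renames the two files instead, so the change is easy to undo. It is only an illustration: the class name, the ".disabled" suffix, and the idea of passing the install root as an argument are all assumptions of this sketch, not anything that ships with PDI.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Pentaho.
public class DisableOsgiListeners {

    // The two file names mentioned in the post.
    static final List<String> TARGETS = Arrays.asList(
            "kettle-lifecycle-listeners.xml",
            "kettle-registry-extensions.xml");

    /**
     * Walks the tree under root and renames every matching file to
     * "<name>.disabled" (suffix chosen for this sketch).
     * Returns the new paths so the caller can log or revert them.
     */
    public static List<Path> disable(Path root) throws IOException {
        List<Path> matches;
        // Collect matches first so we do not rename files while still walking.
        try (Stream<Path> paths = Files.walk(root)) {
            matches = paths
                    .filter(Files::isRegularFile)
                    .filter(p -> TARGETS.contains(p.getFileName().toString()))
                    .collect(Collectors.toList());
        }
        List<Path> renamed = new ArrayList<>();
        for (Path p : matches) {
            Path target = p.resolveSibling(p.getFileName() + ".disabled");
            Files.move(p, target, StandardCopyOption.REPLACE_EXISTING);
            renamed.add(target);
        }
        return renamed;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java DisableOsgiListeners <install-root>");
            return;
        }
        for (Path p : disable(Paths.get(args[0]))) {
            System.out.println("renamed to " + p);
        }
    }
}
```

    Renaming each file back to its original name should restore the previous behaviour.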

    I've attached a log with DEBUG level set. You can see that the last entry is "About to start waiting on delayed service notifiers". The event in the DelayedServiceNotifierListener on line 171 of KarafLifecycleListener is apparently never accepted, judging by the fact that the message "Done waiting on delayed service notifiers" is never logged.
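
    To illustrate the kind of wait the thread dump shows, here is a minimal sketch using plain java.util.concurrent. It is not Pentaho code: the class, method, and thread names are made up for the example. The dump shows the untimed CountDownLatch.await(), which parks forever if nobody calls countDown(); the sketch uses the timed variant so the same condition shows up as a false return instead of a hang.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of Pentaho.
public class LatchHangDemo {

    /**
     * Waits on the latch for at most timeoutMillis.
     * Returns true if the latch reached zero, false if the wait timed out.
     */
    public static boolean awaitNotifier(CountDownLatch latch, long timeoutMillis)
            throws InterruptedException {
        return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        // Case 1: a stand-in "notifier" thread counts the latch down,
        // so the waiting thread is released.
        CountDownLatch fired = new CountDownLatch(1);
        new Thread(fired::countDown, "hypothetical-notifier").start();
        System.out.println("notifier fired: " + awaitNotifier(fired, 2000));

        // Case 2: nothing ever calls countDown(). With the untimed await()
        // this thread would stay parked in WAITING indefinitely, as in the
        // dump; with a timeout the call gives up and returns false.
        CountDownLatch neverFired = new CountDownLatch(1);
        System.out.println("notifier fired: " + awaitNotifier(neverFired, 200));
    }
}
```

    The second call returns false once the 200 ms timeout expires, which is the bounded version of what "ExecutorUtil thread 2" appears to be doing without a timeout.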

    Version: pdi-ce-6.1.0.1-196
    OS: Linux
    Java: 1.8.0_91

    Any suggestions for other things to look at? Should I file a bug report?

    Thanks,
    Derek