Identifying thread deadlocks in WebSphere

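On WebSphere 8.5.5.13 I am getting out-of-memory errors, and the database connections are reaching their maximum. It looks to me like thread starvation: I have some processes that try to perform an operation with a 10-second timeout, and other tasks that normally take about 200 ms are actually taking about 10200 ms. But I think the latter might even be a deadlock. I have about 100 threads waiting like this: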

3XMTHREADINFO      "WorkManager.DefaultWorkManager : 648" J9VMThread:0x000000000F2AA300, omrthread_t:0x00007FE38D060D78, java/lang/Thread:0x000000018ACD99E8, state:B, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x68C86, isDaemon:true)
3XMTHREADINFO1            (native thread ID:0xF8DE, native priority:0x5, native policy:UNKNOWN, vmstate:B, vm thread flags:0x00000201)
3XMTHREADINFO2            (native stack address range from:0x00007FE09C92F000, to:0x00007FE09C96F000, size:0x40000)
3XMCPUTIME               CPU usage total: 2.131995383 secs, current category="Application"
3XMTHREADBLOCK     Blocked on: com/ibm/ws/util/ThreadPool@0x000000011CD4B888 Owned by: "WorkManager.DefaultWorkManager : 689" (J9VMThread:0x00000000011B3000, java/lang/Thread:0x00000001B148B9A8)
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at com/ibm/ws/util/ThreadPool.getTask(ThreadPool.java:1083(Compiled Code))
4XESTACKTRACE                at com/ibm/ws/util/ThreadPool$Worker.run(ThreadPool.java:1916(Compiled Code))

The stack of WorkManager.DefaultWorkManager : 689 looks like this:

3XMTHREADINFO      "WorkManager.DefaultWorkManager : 689" J9VMThread:0x00000000011B3000, omrthread_t:0x00007FE1A41A70D0, java/lang/Thread:0x00000001B148B9A8, state:R, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x68CCD, isDaemon:true)
3XMTHREADINFO1            (native thread ID:0x11410, native priority:0x5, native policy:UNKNOWN, vmstate:CW, vm thread flags:0x00001001)
3XMTHREADINFO2            (native stack address range from:0x00007FE1EFF3E000, to:0x00007FE1EFF7E000, size:0x40000)
3XMCPUTIME               CPU usage total: 1.663139688 secs, current category="Application"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at java/lang/ThreadLocal$ThreadLocalMap.set(ThreadLocal.java:502(Compiled Code))
4XESTACKTRACE                at java/lang/ThreadLocal$ThreadLocalMap.access$100(ThreadLocal.java:311(Compiled Code))
4XESTACKTRACE                at java/lang/ThreadLocal.setInitialValue(ThreadLocal.java:197(Compiled Code))
4XESTACKTRACE                at java/lang/ThreadLocal.get(ThreadLocal.java:183(Compiled Code))
4XESTACKTRACE                at com/ibm/ws/util/objectpool/TwoTierObjectPool.purgeThreadLocal(TwoTierObjectPool.java:264(Compiled Code))
4XESTACKTRACE                at com/ibm/ws/buffermgmt/impl/WsByteBufferPool.purgeThreadLocal(WsByteBufferPool.java:173(Compiled Code))
4XESTACKTRACE                at com/ibm/ws/buffermgmt/impl/WsByteBufferPoolManagerImpl.purgeThreadLocals(WsByteBufferPoolManagerImpl.java:1169(Compiled Code))
4XESTACKTRACE                at com/ibm/ws/runtime/component/WSBBPoolListener.threadDestroyed(WSBBPoolListener.java:62(Compiled Code))
4XESTACKTRACE                at com/ibm/ws/runtime/component/ThreadPoolMgrImpl.threadDestroyed(ThreadPoolMgrImpl.java:459(Compiled Code))
4XESTACKTRACE                at com/ibm/ws/util/ThreadPool.fireThreadDestroyed(ThreadPool.java:1593(Compiled Code))
4XESTACKTRACE                at com/ibm/ws/util/ThreadPool.workerDone(ThreadPool.java:1005(Compiled Code))
5XESTACKTRACE                   (entered lock: com/ibm/ws/util/ThreadPool@0x000000011CD4B888, entry count: 1)
4XESTACKTRACE                at com/ibm/ws/util/ThreadPool$Worker.run(ThreadPool.java:1929(Compiled Code))

For reference, an idle thread (one that is not waiting for something to be released) looks like this:

  at sun/misc/Unsafe.park(Native Method)
  at java/util/concurrent/locks/LockSupport.parkNanos(LockSupport.java:222)
  at java/util/concurrent/locks/AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2127)
  at com/ibm/ws/util/BoundedBuffer$GetQueueLock.await(BoundedBuffer.java:285)
  at com/ibm/ws/util/BoundedBuffer.waitGet_(BoundedBuffer.java:424)
  at com/ibm/ws/util/BoundedBuffer.take(BoundedBuffer.java:817)
  at com/ibm/ws/util/ThreadPool.getTask(ThreadPool.java:934)
  at com/ibm/ws/util/ThreadPool$Worker.run(ThreadPool.java:1704)

or

  at java/lang/Object.wait(Native Method)
  at java/lang/Object.wait(Object.java:231)
  at com/ibm/ws/util/BoundedBuffer.waitGet_(BoundedBuffer.java:192)
  at com/ibm/ws/util/BoundedBuffer.take(BoundedBuffer.java:543)
  at com/ibm/ws/util/ThreadPool.getTask(ThreadPool.java:819)
  at com/ibm/ws/util/ThreadPool$Worker.run(ThreadPool.java:1544)

None of mine look like those.

Thanks!

Answer 1

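The general example of a deadlock is as follows:

  • Thread 1 holds resource A and needs resource B to continue
  • Thread 2 holds resource B and needs resource A to continue

In this situation neither thread can make progress, so there is a deadlock.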
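The pattern above can be reproduced in a few lines of plain Java. This is only a minimal sketch: the class name `DeadlockDemo`, the two lock objects, and the thread names are made up for illustration and have nothing to do with WebSphere internals. It uses the standard `ThreadMXBean.findMonitorDeadlockedThreads()` call to ask the JVM which threads are stuck in a monitor cycle.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class DeadlockDemo {

    /**
     * Starts two daemon threads that take two locks in opposite order,
     * then polls the JVM until it reports a monitor deadlock.
     * Returns the number of deadlocked threads reported (0 if none within ~5 s).
     */
    static int createAndDetectDeadlock() {
        Object resourceA = new Object(); // hypothetical "resource A"
        Object resourceB = new Object(); // hypothetical "resource B"
        CountDownLatch bothHoldFirst = new CountDownLatch(2);

        Thread t1 = new Thread(() -> lockInOrder(resourceA, resourceB, bothHoldFirst), "thread-1");
        Thread t2 = new Thread(() -> lockInOrder(resourceB, resourceA, bothHoldFirst), "thread-2");
        // Daemon threads, so the stuck threads do not keep the JVM alive.
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 100; i++) {
            long[] ids = mx.findMonitorDeadlockedThreads(); // null when no deadlock is found
            if (ids != null) {
                for (ThreadInfo info : mx.getThreadInfo(ids)) {
                    if (info != null) {
                        System.out.println(info.getThreadName() + " waits for "
                                + info.getLockName() + " held by " + info.getLockOwnerName());
                    }
                }
                return ids.length;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    /** Takes 'first', waits until the other thread holds its own first lock, then tries 'second'. */
    private static void lockInOrder(Object first, Object second, CountDownLatch bothHoldFirst) {
        synchronized (first) {
            bothHoldFirst.countDown();
            try {
                bothHoldFirst.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) {
                // Not reached: the other thread holds 'second' and is waiting for 'first'.
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("Deadlocked threads: " + createAndDetectDeadlock());
    }
}
```

The latch only makes the demo deterministic: it guarantees that each thread already holds its first lock before either one asks for the second.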

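The snippets you posted do not match that pattern, so I do not think this is a deadlock.

With the caveat that I am not familiar with the specific code shown in the posted snippets, it looks to me like the first thread shown is simply waiting to get a task from the WorkManager queue, which is probably empty.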
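One way to check a dump against the deadlock pattern mechanically is to build a "waits for" map from the `3XMTHREADINFO` and `3XMTHREADBLOCK` lines and look for a cycle. The following is a rough sketch, not a full javacore parser: the class and method names are hypothetical, the regular expressions assume the line layout seen in the posted excerpt, and only "Blocked on ... Owned by" lines are considered.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JavacoreWaitGraph {
    // Thread header line, e.g.: 3XMTHREADINFO      "name" J9VMThread:...
    // The \s+ after the tag keeps 3XMTHREADINFO1/2/3 lines from matching.
    private static final Pattern THREAD =
            Pattern.compile("^3XMTHREADINFO\\s+\"([^\"]+)\"");
    // Block line, e.g.: 3XMTHREADBLOCK     Blocked on: <lock> Owned by: "owner" (...)
    private static final Pattern BLOCK =
            Pattern.compile("^3XMTHREADBLOCK\\s+Blocked on: .*? Owned by: \"([^\"]+)\"");

    /** Maps each blocked thread name to the name of the thread that owns the lock it wants. */
    static Map<String, String> waitsFor(List<String> lines) {
        Map<String, String> result = new LinkedHashMap<>();
        String current = null;
        for (String line : lines) {
            Matcher t = THREAD.matcher(line);
            if (t.find()) {
                current = t.group(1);
                continue;
            }
            Matcher b = BLOCK.matcher(line);
            if (current != null && b.find()) {
                result.put(current, b.group(1));
            }
        }
        return result;
    }

    /** True if following "waits for" links from some thread leads back to a thread already visited. */
    static boolean hasCycle(Map<String, String> waitsFor) {
        for (String start : waitsFor.keySet()) {
            Set<String> seen = new HashSet<>();
            String t = start;
            while (t != null && seen.add(t)) {
                t = waitsFor.get(t); // null when the owner is not itself blocked
            }
            if (t != null) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Three lines taken from the excerpt in the question.
        List<String> lines = Arrays.asList(
            "3XMTHREADINFO      \"WorkManager.DefaultWorkManager : 648\" J9VMThread:0x000000000F2AA300, omrthread_t:0x00007FE38D060D78, java/lang/Thread:0x000000018ACD99E8, state:B, prio=5",
            "3XMTHREADBLOCK     Blocked on: com/ibm/ws/util/ThreadPool@0x000000011CD4B888 Owned by: \"WorkManager.DefaultWorkManager : 689\" (J9VMThread:0x00000000011B3000, java/lang/Thread:0x00000001B148B9A8)",
            "3XMTHREADINFO      \"WorkManager.DefaultWorkManager : 689\" J9VMThread:0x00000000011B3000, omrthread_t:0x00007FE1A41A70D0, java/lang/Thread:0x00000001B148B9A8, state:R, prio=5");
        Map<String, String> graph = waitsFor(lines);
        System.out.println(graph);
        System.out.println("cycle: " + hasCycle(graph));
    }
}
```

For the lines posted in the question, thread 648 waits for thread 689, and 689 has no block line of its own, so the chain ends there and no cycle is reported.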

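Also, as an aside, in IBM Java thread dumps (which your snippets appear to come from), deadlocked threads are detected while the dump is being created and are flagged with a DEADLOCK marker. So you can search the Java thread dump for that marker and save yourself the time of manually matching every possible thread/resource combination to find a deadlock.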
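Searching for the marker needs nothing more than a text search, but if you have many dumps to go through, a small helper along these lines may be handy. It is a sketch under assumptions: the class name is hypothetical, and it simply matches any line containing the text DEADLOCK rather than relying on an exact tag layout.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

public class JavacoreDeadlockScan {

    /** Returns every line that contains the text "DEADLOCK". */
    static List<String> deadlockLines(List<String> lines) {
        return lines.stream()
                .filter(line -> line.contains("DEADLOCK"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java JavacoreDeadlockScan <javacore file>");
            return;
        }
        // ISO-8859-1 maps every byte to a character, so reading never fails on odd bytes.
        List<String> hits = deadlockLines(
                Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1));
        if (hits.isEmpty()) {
            System.out.println("No DEADLOCK marker found.");
        } else {
            hits.forEach(System.out::println);
        }
    }
}
```

Run it with the path of the dump file as the only argument. An empty result only means the JVM did not flag a deadlock when the dump was taken; it says nothing about threads that are merely slow or starved.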

Hope this helps.
