WebLogic server appears to freeze after CMS garbage collection runs

Question

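After a long investigation, we finally found the cause of the problem. While the CMS GC was running, we took several thread dumps and found this conspicuous stack in 5 or 6 threads: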

"[ACTIVE] ExecuteThread: '35' for queue: 'weblogic.kernel.Default (self-tuning)'" daemon prio=3 tid=0x000000001333c000 nid=0xa1 runnable [0xfffffd7e23dd1000] 
   java.lang.Thread.State: RUNNABLE
    at sun.security.pkcs11.wrapper.PKCS11.C_DestroyObject(Native Method)
    at sun.security.pkcs11.SessionKeyRef.dispose(P11Key.java:1043)
    at sun.security.pkcs11.SessionKeyRef.drainRefQueueBounded(P11Key.java:1019)
    at sun.security.pkcs11.SessionKeyRef.<init>(P11Key.java:1034)
    at sun.security.pkcs11.P11Key.<init>(P11Key.java:98)
    at sun.security.pkcs11.P11Key$P11SecretKey.<init>(P11Key.java:379)
    at sun.security.pkcs11.P11Key.secretKey(P11Key.java:271)
    at sun.security.pkcs11.P11TlsRsaPremasterSecretGenerator.engineGenerateKey(P11TlsRsaPremasterSecretGenerator.java:84)
    at javax.crypto.KeyGenerator.generateKey(DashoA13*..)
    at com.sun.net.ssl.internal.ssl.RSAClientKeyExchange.<init>(RSAClientKeyExchange.java:68)
    at com.sun.net.ssl.internal.ssl.ClientHandshaker.serverHelloDone(ClientHandshaker.java:807)
    at com.sun.net.ssl.internal.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:286)
    at com.sun.net.ssl.internal.ssl.Handshaker.processLoop(Handshaker.java:608)
    at com.sun.net.ssl.internal.ssl.Handshaker$1.run(Handshaker.java:548)
    at java.security.AccessController.doPrivileged(Native Method)
    at com.sun.net.ssl.internal.ssl.Handshaker$DelegatedTask.run(Handshaker.java:941)
    - locked <0xfffffd7e402b5ca0> (a com.sun.net.ssl.internal.ssl.SSLEngineImpl)

The same thread was executing this code in dumps taken 20-25 seconds apart, so it looked like a very likely suspect.
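For anyone who wants to repeat this comparison without reading the dumps by eye, here is a minimal sketch of the idea. It is not the tooling we used: the class `DumpCompare` and its methods are hypothetical, and it assumes the usual HotSpot thread dump layout, where each thread section starts with the thread name in double quotes and sections are separated by blank lines.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class DumpCompare {
    /** Names of the threads whose section in the dump mentions the given frame. */
    static Set<String> threadsWithFrame(String dump, String frame) {
        Set<String> names = new LinkedHashSet<>();
        // Assumption: thread sections are separated by one or more blank lines.
        for (String section : dump.split("\\n\\s*\\n")) {
            String s = section.trim();
            if (s.startsWith("\"") && s.contains(frame)) {
                int end = s.indexOf('"', 1); // closing quote of the thread name
                if (end > 0) {
                    names.add(s.substring(1, end));
                }
            }
        }
        return names;
    }

    /** Threads that show the frame in both dumps, i.e. candidates for being stuck there. */
    static Set<String> stuckInBoth(String first, String second, String frame) {
        Set<String> result = threadsWithFrame(first, frame);
        result.retainAll(threadsWithFrame(second, frame));
        return result;
    }

    public static void main(String[] args) {
        String frame = "sun.security.pkcs11.wrapper.PKCS11.C_DestroyObject";
        // Tiny made-up dumps; real ones would be read from the dump files.
        String dump1 = "\"worker-1\" daemon\n\tat " + frame + "(Native Method)\n\n"
                     + "\"worker-2\" daemon\n\tat " + frame + "(Native Method)\n";
        String dump2 = "\"worker-1\" daemon\n\tat " + frame + "(Native Method)\n\n"
                     + "\"worker-2\" daemon\n\tat java.lang.Object.wait(Native Method)\n";
        System.out.println(stuckInBoth(dump1, dump2, frame)); // prints [worker-1]
    }
}
```

A thread that appears in the result for dumps taken many seconds apart is not proof of a hang, since it could have left and re-entered the frame, but it narrows down where to look.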

After reviewing the Oracle support documentation and the following links:

https://bugs.openjdk.java.net/browse/JDK-8059337
https://stackoverflow.com/questions/31188663/disabling-pkcs11-solaris-implementation

We decided to disable the Solaris PKCS11 implementation with "-Dsun.security.pkcs11.enable-solaris=false". Once we did that, all the timeouts disappeared and the gaps in the access log after GC were gone.
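To confirm that the flag actually took effect after a restart, something like the following could be run inside the same JVM. This is only a sketch: the class `Pkcs11Check` and its helper are hypothetical, and it assumes the PKCS11 provider registers under a name that starts with `SunPKCS11` (on Solaris we would expect something like `SunPKCS11-Solaris`).

```java
import java.security.Provider;
import java.security.Security;
import java.util.ArrayList;
import java.util.List;

public class Pkcs11Check {
    /** Filters a list of provider names down to the ones that look like SunPKCS11. */
    static List<String> pkcs11Names(List<String> providerNames) {
        List<String> found = new ArrayList<>();
        for (String name : providerNames) {
            // Assumption: the provider name starts with "SunPKCS11".
            if (name.startsWith("SunPKCS11")) {
                found.add(name);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Null here means the flag was not passed to this JVM at all.
        System.out.println("sun.security.pkcs11.enable-solaris = "
                + System.getProperty("sun.security.pkcs11.enable-solaris"));

        // Collect the names of all providers currently registered in this JVM.
        List<String> names = new ArrayList<>();
        for (Provider p : Security.getProviders()) {
            names.add(p.getName());
        }
        System.out.println("Registered providers: " + names);
        System.out.println("PKCS11 providers still registered: " + pkcs11Names(names));
    }
}
```

If the last line still lists a PKCS11 provider, the flag probably did not reach the server's JVM, for example because it was added to the wrong startup script.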

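I'm not sure whether the Sun PKCS11 provider has any advantage over the other providers, but disabling it seems to have solved our problem, and so far we haven't found any new issues caused by the change, so we're going to stick with it.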
