Apache server sometimes hangs for several minutes while requests pile up and wait too long

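I have a production server running Apache 2.4.38 on Debian 10. Sometimes the web server stops working properly and does not immediately respond to the HTTP requests it receives: every virtual host on it becomes completely unresponsive, regardless of what it reverse-proxies to. A restart fixes it immediately; otherwise, after some time (a few seconds or even several minutes), it recovers on its own and suddenly sends out a flood of HTTP responses.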

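CPU and RAM usage look fine, so that does not seem to be the cause. I don't know what exactly is happening or why. I have also changed the mpm_event.conf settings; they are currently: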

<IfModule mpm_event_module>
        StartServers             2
        ServerLimit              100
        MinSpareThreads          25
        MaxSpareThreads          75
        ThreadLimit              128
        ThreadsPerChild          25
        MaxRequestWorkers        400
        MaxConnectionsPerChild   5000
</IfModule>

I see some errors in the Apache error log:

[Tue Mar 22 19:53:38.339703 2022] [core:error] [pid 3375:tid 140244229465216] AH00046: child process 29595 still did not exit, sending a SIGKILL
[Tue Mar 22 19:53:38.339777 2022] [core:error] [pid 3375:tid 140244229465216] AH00046: child process 26190 still did not exit, sending a SIGKILL
[Tue Mar 22 19:53:38.339825 2022] [core:error] [pid 3375:tid 140244229465216] AH00046: child process 27903 still did not exit, sending a SIGKILL
[Tue Mar 22 19:53:38.339889 2022] [core:error] [pid 3375:tid 140244229465216] AH00046: child process 16907 still did not exit, sending a SIGKILL
[Tue Mar 22 19:53:38.339933 2022] [core:error] [pid 3375:tid 140244229465216] AH00046: child process 26880 still did not exit, sending a SIGKILL
[Tue Mar 22 19:53:38.340000 2022] [core:error] [pid 3375:tid 140244229465216] AH00046: child process 15384 still did not exit, sending a SIGKILL
[Tue Mar 22 19:53:38.340041 2022] [core:error] [pid 3375:tid 140244229465216] AH00046: child process 24971 still did not exit, sending a SIGKILL
[Tue Mar 22 19:53:38.340091 2022] [core:error] [pid 3375:tid 140244229465216] AH00046: child process 9780 still did not exit, sending a SIGKILL
[Tue Mar 22 19:53:38.340130 2022] [core:error] [pid 3375:tid 140244229465216] AH00046: child process 26317 still did not exit, sending a SIGKILL

Which settings can I change to fix this?

Answer 1

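It looks like something is hanging your worker processes. That's bad. Either some plugin (module) is hanging, or you may have a hardware problem. It is less likely that you have found a bug in Apache itself.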
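To see how often and when the children get stuck, it may help to count the AH00046 lines per minute. The following is only a rough sketch: the function name count_sigkills_per_minute is made up for illustration, and it assumes the error-log timestamp layout shown in the question.

```shell
# Summarize AH00046 ("still did not exit, sending a SIGKILL") events per minute.
# Reads Apache error-log text on stdin. Assumes lines start with a timestamp
# like "[Tue Mar 22 19:53:38.339703 2022]", so awk field 4 is the time of day.
# The function name is hypothetical, not part of any Apache tooling.
count_sigkills_per_minute() {
    grep 'AH00046' \
        | awk '{ split($4, t, ":"); print $2, $3, t[1] ":" t[2] }' \
        | sort \
        | uniq -c
}
```

It could be fed with something like `count_sigkills_per_minute < /var/log/apache2/error.log` (the usual Debian location, assumed here). Note that `sort` orders the output as text, not chronologically. Comparing the busiest minutes against restarts, log rotation, or cron jobs may hint at what triggers the hang.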

I would check dmesg and systemctl for errors, especially storage-related ones.
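As a starting point for that check, here is a small filter for kernel-log text. The pattern list is my own guess at common storage and hung-task messages, not an exhaustive or authoritative set, and the function name filter_storage_errors is hypothetical.

```shell
# Keep only lines that look like storage or hung-task trouble.
# Reads log text on stdin. The patterns are heuristic assumptions;
# extend them for your filesystem and storage stack.
filter_storage_errors() {
    grep -iE 'i/o error|blocked for more than [0-9]+ seconds|nfs: server .* not responding|ext4-fs error|remounting filesystem read-only'
}
```

Typical use would be `dmesg -T | filter_storage_errors` or `journalctl -k -b | filter_storage_errors`, and `systemctl status apache2` shows the unit state along with its most recent log lines. An empty result does not prove the storage is healthy; it only means none of these particular messages were logged.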

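If this simple check turns up nothing, attach gdb to your apache2 processes and see exactly where the children are hanging when they fail to exit. Maybe it's a plugin trying to resolve a name while resolution is randomly unavailable? Maybe it's accessing a file on NFS that can't be reached reliably? It's hard to tell from here.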
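Attaching gdb can be scripted so that the backtraces are captured quickly while the server is stuck. This is a minimal sketch, not a polished tool: the function name dump_apache_stacks and the GDB override variable are invented for illustration, while the gdb options used (-p, -batch, -ex) are standard.

```shell
# Print a backtrace of every thread for each PID passed as an argument.
# GDB is a hypothetical override for the debugger binary (handy for dry runs);
# by default the real gdb is used. In batch mode gdb detaches when it is done,
# but the target process is paused while it is attached.
dump_apache_stacks() {
    gdb_bin=${GDB:-gdb}
    for pid in "$@"; do
        echo "=== PID $pid ==="
        "$gdb_bin" -p "$pid" -batch -ex 'thread apply all bt'
    done
}
```

Run as root during a hang, for example `dump_apache_stacks $(pgrep apache2) > /tmp/apache-stacks.txt 2>&1`. Without debug symbols the frames will be less readable, but the library names at the top of each stack (resolver, NFS or other file I/O, a specific module) are often enough to tell where the threads are blocked.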
