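I have a production server running Apache 2.4.38 on Debian 10. Sometimes the web server stops working properly and does not respond promptly to the HTTP requests it receives (requests to every virtual host on it are completely unresponsive, regardless of what they reverse proxy to). After a restart it recovers immediately; otherwise, after some time (seconds or even minutes), it suddenly starts sending a large batch of HTTP responses.
CPU and RAM usage look fine, so that does not seem to be the cause. I don't know what exactly is happening or why. I have also changed the mpm_event.conf settings; they are currently: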
<IfModule mpm_event_module>
StartServers 2
ServerLimit 100
MinSpareThreads 25
MaxSpareThreads 75
ThreadLimit 128
ThreadsPerChild 25
MaxRequestWorkers 400
MaxConnectionsPerChild 5000
</IfModule>
I see these errors in the Apache error log:
[Tue Mar 22 19:53:38.339703 2022] [core:error] [pid 3375:tid 140244229465216] AH00046: child process 29595 still did not exit, sending a SIGKILL
[Tue Mar 22 19:53:38.339777 2022] [core:error] [pid 3375:tid 140244229465216] AH00046: child process 26190 still did not exit, sending a SIGKILL
[Tue Mar 22 19:53:38.339825 2022] [core:error] [pid 3375:tid 140244229465216] AH00046: child process 27903 still did not exit, sending a SIGKILL
[Tue Mar 22 19:53:38.339889 2022] [core:error] [pid 3375:tid 140244229465216] AH00046: child process 16907 still did not exit, sending a SIGKILL
[Tue Mar 22 19:53:38.339933 2022] [core:error] [pid 3375:tid 140244229465216] AH00046: child process 26880 still did not exit, sending a SIGKILL
[Tue Mar 22 19:53:38.340000 2022] [core:error] [pid 3375:tid 140244229465216] AH00046: child process 15384 still did not exit, sending a SIGKILL
[Tue Mar 22 19:53:38.340041 2022] [core:error] [pid 3375:tid 140244229465216] AH00046: child process 24971 still did not exit, sending a SIGKILL
[Tue Mar 22 19:53:38.340091 2022] [core:error] [pid 3375:tid 140244229465216] AH00046: child process 9780 still did not exit, sending a SIGKILL
[Tue Mar 22 19:53:38.340130 2022] [core:error] [pid 3375:tid 140244229465216] AH00046: child process 26317 still did not exit, sending a SIGKILL
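Which settings can I change to fix this?
Answer 1
It looks like something is hanging your worker processes. That is bad. Either some module is hanging, or you may have a hardware problem. It is less likely that you have found a bug in Apache itself.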
I would check dmesg and systemctl for errors, especially errors related to storage.
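As a rough sketch of that first check, the snippet below filters kernel and journal output for storage or blocked-task messages. The helper name storage_hints and its pattern list are my own illustration and are not exhaustive, so treat a clean result as a hint rather than proof:

```shell
# Hypothetical filter: keep only log lines that hint at storage trouble
# or at tasks stuck in the kernel. The patterns are illustrative examples.
storage_hints() {
    grep -Ei 'i/o error|ext4-fs error|nfs: server .* not responding|blocked for more than [0-9]+ seconds'
}

# Typical use on the affected host (may need root):
#   dmesg -T | storage_hints
#   journalctl -k -b -p warning | storage_hints
#   systemctl --failed
#   systemctl status apache2
```

If any of these print disk, filesystem, or NFS messages with timestamps close to the hangs, that is where I would look first.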
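If that simple check turns up nothing, attach gdb to your apache2 processes and see exactly where the child processes are hanging when they fail to exit. Maybe it is a module trying to resolve a name while resolution is intermittently unavailable? Maybe it is accessing a file on NFS that is not reliably reachable? It is hard to tell from here.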
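A minimal sketch of the gdb step, assuming gdb is installed and you run it as root while requests are hanging. The helper names dump_backtrace and top_frames are hypothetical, and the output is far more readable if debug symbols for apache2 and its modules are installed:

```shell
# Hypothetical helper: print a backtrace of every thread in one live process.
# Attaching pauses the process briefly and needs ptrace permission (usually root).
dump_backtrace() {
    gdb -p "$1" -batch -ex 'thread apply all bt' 2>&1
}

# Hypothetical helper: read gdb backtraces on stdin and count the innermost
# frame (#0) of each thread, most common first.
top_frames() {
    grep '^#0 ' |
        sed 's/^#0  *\(0x[0-9a-f]* in \)\{0,1\}\([^ (]*\).*/\2/' |
        sort | uniq -c | sort -rn
}

# Example use while the server is unresponsive:
#   for pid in $(pgrep apache2); do dump_backtrace "$pid"; done > /tmp/apache2-bt.txt
#   top_frames < /tmp/apache2-bt.txt
```

Idle workers normally sit in ordinary wait calls, so the frames worth reading in full are the ones that point into a module, name resolution, or file I/O.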