When I notice that nginx has failed, the status report confirms that it is inactive:
nginx.service - A high performance web server and a reverse proxy server
Loaded: loaded (/lib/systemd/system/nginx.service; enabled; vendor preset: enabled)
Active: failed (Result: core-dump) since Wed 2022-09-14 03:48:56 UTC; 1h 49min ago
Docs: man:nginx(8)
Process: 300175 ExecStartPre=/usr/sbin/nginx -t -q -g daemon on; master_process on; (code=exited, status=0/SUCCESS)
Process: 300189 ExecStart=/usr/sbin/nginx -g daemon on; master_process on; (code=exited, status=0/SUCCESS)
Main PID: 300203 (code=dumped, signal=SEGV)
Tasks: 0 (limit: 2339)
Memory: 76.8M
CGroup: /system.slice/nginx.service
After sudo service nginx restart:
service nginx status
● nginx.service - A high performance web server and a reverse proxy server
Loaded: loaded (/lib/systemd/system/nginx.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2022-09-14 05:39:37 UTC; 3min 22s ago
Docs: man:nginx(8)
Process: 349112 ExecStartPre=/usr/sbin/nginx -t -q -g daemon on; master_process on; (code=exited, status=0/SUCCESS)
Process: 349113 ExecStart=/usr/sbin/nginx -g daemon on; master_process on; (code=exited, status=0/SUCCESS)
Main PID: 349125 (nginx)
Tasks: 25 (limit: 2339)
Memory: 225.4M
CGroup: /system.slice/nginx.service
├─349114 Passenger watchdog
├─349117 Passenger core
├─349125 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
├─349130 nginx: worker process
├─349791 Passenger AppPreloader: /.../simon/current
└─349870 Passenger RubyApp: /.../simon/current (development)
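For now, all I can see is a substantial drop in available memory on a server that is not very active.
a) Is the analysis of a memory problem (a leak?) correct? b) How can this be mitigated?
Update: in under an hour (54 minutes):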
Active: active (running) since Wed 2022-09-14 05:39:37 UTC; 54min ago
Docs: man:nginx(8)
Process: 349112 ExecStartPre=/usr/sbin/nginx -t -q -g daemon on; master_process on; (code=exited, status=0/SUCCESS)
Process: 349113 ExecStart=/usr/sbin/nginx -g daemon on; master_process on; (code=exited, status=0/SUCCESS)
Main PID: 349125 (nginx)
Tasks: 30 (limit: 2339)
Memory: 376.3M
This was for 35 lines of access.log and 8 lines of error.log.
Update 2: the following data was observed over the 30-plus hours before nginx crashed again:
1h 36min ago
Tasks: 41 (limit: 2339)
Memory: 548.7M
4h 49min ago
Tasks: 41 (limit: 2339)
Memory: 566.2M
8h ago
Tasks: 41 (limit: 2339)
Memory: 569.0M
10h ago
Tasks: 41 (limit: 2339)
Memory: 568.4M
13h ago
Tasks: 41 (limit: 2339)
Memory: 573.7M
14h ago
Tasks: 41 (limit: 2339)
Memory: 580.5M
15h ago
Tasks: 41 (limit: 2339)
Memory: 578.3M
├─350286 Passenger RubyApp: /.../simon/current (development)
├─350936 Passenger RubyApp: /.../market/current (development)
└─353112 Passenger RubyApp: /.../fido/current (development)
24h ago
Tasks: 27 (limit: 2339)
Memory: 297.2M
├─389632 nginx: worker process
└─405094 Passenger RubyApp: /.../fido/current (development)
24h ago
Tasks: 36 (limit: 2339)
Memory: 513.1M
24h ago
Tasks: 54 (limit: 2339)
Memory: 673.2M
CGroup: /system.slice/nginx.service
├─349125 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
├─389611 Passenger watchdog
├─389614 Passenger core
├─389632 nginx: worker process
├─405094 Passenger RubyApp: /.../fido/current (development)
├─408557 Passenger AppPreloader: /.../simon/current
├─408640 Passenger RubyApp: /.../simon/current (development)
├─408944 Passenger AppPreloader: /.../market/current
└─409036 Passenger RubyApp: /.../market/current (development)
24h ago
Tasks: 41 (limit: 2339)
Memory: 591.4M
CGroup: /system.slice/nginx.service
├─405094 Passenger RubyApp: /.../fido/current (development)
├─408640 Passenger RubyApp: /.../simon/current (development)
└─409036 Passenger RubyApp: /.../market/current (development)
Active: failed (Result: core-dump) since Thu 2022-09-15 11:57:46 UTC; 1h 20min ago
Docs: man:nginx(8)
Process: 349112 ExecStartPre=/usr/sbin/nginx -t -q -g daemon on; master_process on; (code=exited, status=0/SUCCESS)
Process: 349113 ExecStart=/usr/sbin/nginx -g daemon on; master_process on; (code=exited, status=0/SUCCESS)
Main PID: 349125 (code=dumped, signal=SEGV)
Tasks: 0 (limit: 2339)
Memory: 147.4M
CGroup: /system.slice/nginx.service
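The Tasks and Memory figures in the excerpts above are easier to compare once they are pulled into one list. The following is a minimal sketch, assuming the status output has been saved as plain text in the same layout as shown here; `parse_snapshots` and `mib` are hypothetical helper names, not part of any tool.

```python
import re

# Multipliers for the single-letter binary suffixes that appear after the
# memory figure in the status output (e.g. "548.7M").
_UNITS = {"B": 1, "K": 1024, "M": 1024**2, "G": 1024**3}

_TASKS = re.compile(r"Tasks:\s*(\d+)")
_MEMORY = re.compile(r"Memory:\s*([\d.]+)([BKMG])")


def parse_snapshots(text):
    """Return a list of (tasks, memory_bytes), one per Tasks/Memory pair in text."""
    snapshots = []
    tasks = None
    for line in text.splitlines():
        match = _TASKS.search(line)
        if match:
            # Remember the task count until the matching Memory line arrives.
            tasks = int(match.group(1))
            continue
        match = _MEMORY.search(line)
        if match and tasks is not None:
            value, unit = float(match.group(1)), match.group(2)
            snapshots.append((tasks, int(value * _UNITS[unit])))
            tasks = None
    return snapshots


def mib(n_bytes):
    """Convert a byte count to MiB for display."""
    return n_bytes / 1024**2
```

Calling `parse_snapshots` on the pasted text and printing `mib(memory)` next to each task count gives a two-column series that can be sorted or plotted. Lines that carry neither field, such as the process tree, are ignored.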
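Observations:
• The process count stabilizes at 41.
• Memory is stable, but after dropping it rises again for no apparent reason.
• nginx appears to put some unused processes to sleep (threads are lost).
• Waking those applications brings the processes back.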
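To check whether memory really rises again without traffic, evenly spaced samples are more reliable than snapshots taken by hand. The sketch below assumes a systemd host where `systemctl show` reports the `MemoryCurrent` and `TasksCurrent` properties for the unit. The function names, the 60-second interval, and the output layout are illustrative choices, not taken from the setup described here.

```python
import subprocess
import time


def parse_show_output(text):
    """Turn KEY=VALUE lines into a dict; values that are not plain integers become None."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue
        result[key] = int(value) if value.isdigit() else None
    return result


def sample(unit="nginx.service"):
    """Read the current memory (bytes) and task count of one systemd unit."""
    completed = subprocess.run(
        ["systemctl", "show", unit, "--property=MemoryCurrent,TasksCurrent"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_show_output(completed.stdout)


def watch(unit="nginx.service", interval_s=60, count=None):
    """Print one timestamped sample per interval; count=None runs until interrupted."""
    taken = 0
    while count is None or taken < count:
        values = sample(unit)
        print(
            time.strftime("%Y-%m-%d %H:%M:%S"),
            values.get("TasksCurrent"),
            values.get("MemoryCurrent"),
            flush=True,
        )
        taken += 1
        if count is None or taken < count:
            time.sleep(interval_s)
```

Running `watch()` in a terminal or under a timer and redirecting the output to a file gives a series that can be lined up against access.log timestamps, which would show whether the increases coincide with requests waking the applications.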
The error.log at that time shows:
2022/09/15 11:57:32 [info] 422864#422864: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:63
2022/09/15 11:57:34 [notice] 422870#422870: signal process started
[ N 2022-09-15 11:57:34.2124 422874/T1 age/Wat/WatchdogMain.cpp:1373 ]: Starting Passenger watchdog...
[ N 2022-09-15 11:57:34.2901 422878/T1 age/Cor/CoreMain.cpp:1340 ]: Starting Passenger core...
[ N 2022-09-15 11:57:34.2903 422878/T1 age/Cor/CoreMain.cpp:256 ]: Passenger core running in multi-application mode.
[ N 2022-09-15 11:57:34.3105 422878/T1 age/Cor/CoreMain.cpp:1015 ]: Passenger core online, PID 422878
[ N 2022-09-15 11:57:36.4497 422878/T5 age/Cor/SecurityUpdateChecker.h:519 ]: Security update check: no update found (next check in 24 hours)
2022/09/15 11:57:46 [notice] 422902#422902: signal process started
[ N 2022-09-15 11:57:46.0873 422878/T9 age/Cor/CoreMain.cpp:670 ]: Signal received. Gracefully shutting down... (send signal 2 more time(s) to force shutdown)
[ N 2022-09-15 11:57:46.0874 422878/T1 age/Cor/CoreMain.cpp:1245 ]: Received command to shutdown gracefully. Waiting until all clients have disconnected...
[ N 2022-09-15 11:57:46.0875 422878/Tb Ser/Server.h:901 ]: [ApiServer] Freed 0 spare client objects
[ N 2022-09-15 11:57:46.0875 422878/Tb Ser/Server.h:558 ]: [ApiServer] Shutdown finished
[ N 2022-09-15 11:57:46.0875 422878/T9 Ser/Server.h:901 ]: [ServerThr.1] Freed 0 spare client objects
[ N 2022-09-15 11:57:46.0876 422878/T9 Ser/Server.h:558 ]: [ServerThr.1] Shutdown finished
[ N 2022-09-15 11:57:46.2320 422878/T1 age/Cor/CoreMain.cpp:1325 ]: Passenger core shutdown finished