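I have had munin-node running successfully on my machine for a while, but recently it stopped starting. There is no munin-node log for me to check, and systemctl status munin-node does not provide much useful information: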
[root@host /]# systemctl status munin-node
● munin-node.service - Munin Node
   Loaded: loaded (/usr/lib/systemd/system/munin-node.service; enabled; vendor preset: disabled)
   Active: failed (Result: timeout) since Tue 2021-05-18 23:35:16 CEST; 1h 8min ago
     Docs: man:munin-node(1)
           http://guide.munin-monitoring.org/en/latest/node/index.html
  Process: 7710 ExecStart=/usr/sbin/munin-node --foreground (code=exited, status=0/SUCCESS)
 Main PID: 7710 (code=exited, status=0/SUCCESS)

May 18 23:33:44 host systemd[1]: Starting Munin Node...
May 18 23:35:14 host systemd[1]: munin-node.service start operation timed out. Terminating.
May 18 23:35:16 host systemd[1]: Failed to start Munin Node.
May 18 23:35:16 host systemd[1]: Unit munin-node.service entered failed state.
May 18 23:35:16 host systemd[1]: munin-node.service failed.
Answer 1
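The problem was that the plugins were taking too long to run, specifically nvidia_gpu_*. Because this is a multi-GPU machine, those plugins were especially slow. Nothing in the output clearly indicated that plugins were causing the timeout.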
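To speed up the nvidia_gpu_* plugins, I used the following command, based on https://forums.developer.nvidia.com/t/nvidia-smi-is-slow-on-ubuntu-16-04/50416:
nvidia-smi --persistence-mode=1
You can test the effect simply by running nvidia-smi afterwards: it should load faster, because the GPUs no longer need to be woken up first.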
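Since nothing in the status output named the slow plugin, one way to narrow it down is to time each plugin on its own. The following is a minimal sketch, assuming the enabled plugins live under /etc/munin/plugins (a common default, but an assumption here) and that munin-run is on the PATH; time_cmd and time_plugins are hypothetical helper names.

```shell
#!/bin/sh
# Assumed default location of enabled plugins; override via PLUGIN_DIR.
PLUGIN_DIR=${PLUGIN_DIR:-/etc/munin/plugins}

# time_cmd LABEL COMMAND [ARGS...]
# Runs COMMAND, discards its output, prints "<elapsed seconds> <LABEL>".
time_cmd() {
    label=$1
    shift
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo "$((end - start)) $label"
}

# Times every plugin through munin-run and lists the slowest first.
time_plugins() {
    for p in "$PLUGIN_DIR"/*; do
        [ -e "$p" ] || continue
        name=$(basename "$p")
        time_cmd "$name" munin-run "$name"
    done | sort -rn
}

# Only run the scan when munin-run is actually installed.
if command -v munin-run >/dev/null 2>&1; then
    time_plugins
fi
```

Run it as root, since munin-run usually needs the same privileges as munin-node. Plugins that take many seconds each are the likely cause of the start timeout.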
Answer 2
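On Debian, the timeout can be increased in /etc/init.d/munin-node by changing:
START_ARGS="--background --notify-await --notify-timeout 15"
to, for example:
START_ARGS="--background --notify-await --notify-timeout 120"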
This solved the problem for me on a Bullseye system running SysV init.
As seen in: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=954128
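The change to the init script can also be scripted. This is a minimal sketch, assuming the script passes --notify-timeout followed by a number of seconds; set_notify_timeout is a hypothetical helper name, and it keeps a backup copy next to the original file.

```shell
#!/bin/sh
# set_notify_timeout FILE SECONDS
# Rewrites the number after --notify-timeout in FILE, keeping FILE.bak.
set_notify_timeout() {
    file=$1
    secs=$2
    # Back up first, then rewrite from the backup into the original file
    # so the original's ownership and permissions are left untouched.
    cp -p "$file" "$file.bak" || return 1
    sed "s/--notify-timeout [0-9][0-9]*/--notify-timeout $secs/" \
        "$file.bak" > "$file"
}
```

For example, set_notify_timeout /etc/init.d/munin-node 120 (as root) would raise the timeout to 120 seconds; compare the file against the .bak copy before restarting the service.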