Restart=always does not retry starting the unit after a required service fails

I have a kubelet.service that requires docker.service. kubelet.service looks like this:

[Unit]
Description=kubelet
After=docker.service
Requires=docker.service

[Service]
WorkingDirectory=/var/lib/kubelet
EnvironmentFile=-/etc/kubernetes/kubelet
ExecStart=/usr/local/bin/kubelet ...
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

When docker.service is restarted, kubelet.service is restarted automatically. However, when I reboot the server and docker.service fails to start, kubelet.service is not restarted automatically!

# journalctl -u kubelet.service 
-- Logs begin at Fri 2018-05-25 09:35:00 CST, end at Fri 2018-05-25 09:53:13 CST. --
May 25 09:35:03 debian systemd[1]: Dependency failed for kubelet.
May 25 09:35:03 debian systemd[1]: kubelet.service: Job kubelet.service/start failed with result 'dependency'.

# journalctl -u docker
-- Logs begin at Fri 2018-05-25 09:35:00 CST, end at Fri 2018-05-25 09:53:46 CST. --
May 25 09:35:03 debian systemd[1]: Starting Docker Application Container Engine...
May 25 09:35:03 debian dockerd[1905]: invalid value "" for flag --mtu: strconv.ParseInt: parsing "": invalid syntax
May 25 09:35:03 debian dockerd[1905]: See '/usr/bin/dockerd --help'.
May 25 09:35:03 debian systemd[1]: docker.service: Main process exited, code=exited, status=125/n/a
May 25 09:35:03 debian systemd[1]: Failed to start Docker Application Container Engine.
May 25 09:35:03 debian systemd[1]: docker.service: Unit entered failed state.
May 25 09:35:03 debian systemd[1]: docker.service: Failed with result 'exit-code'.
May 25 09:35:12 debian systemd[1]: docker.service: Service hold-off time over, scheduling restart.
May 25 09:35:12 debian systemd[1]: Stopped Docker Application Container Engine.
May 25 09:35:12 debian systemd[1]: Starting Docker Application Container Engine...

As you can see, kubelet failed at 09:35:03 and was never started again, even after docker was started successfully at 09:35:12.

Answer 1

Your restart attempts have hit the limit. Check StartLimitBurst. A similar question is here.

Answer 2

Restart= does not apply to failed dependencies; it applies to the processes that belong to the unit itself.

From man systemd.service:

Restart=
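Configures whether the service shall be restarted when the service process exits, is killed, or a timeout is reached. The service process may be the main service process, but it may also be one of the processes specified with ExecStartPre=, ExecStartPost=, ExecStop=, ExecStopPost=, or ExecReload=. When the death of the process is a result of systemd operation (e.g. service stop or restart), the service will not be restarted. Timeouts include missing the watchdog "keep-alive ping" deadline and a service start, reload, and stop operation timeouts.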
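
One possible way to act on this, offered as a sketch rather than a confirmed fix for this setup: replace the hard Requires= dependency with Wants=, so that a failed docker.service start no longer cancels the kubelet start job with result 'dependency'. kubelet is then started anyway, and Restart=always can take over from there. This assumes kubelet exits with an error while Docker is unreachable (not verified here). PartOf= is added on the assumption that you still want kubelet to be stopped or restarted whenever docker.service is explicitly stopped or restarted.

```ini
[Unit]
Description=kubelet
# Ordering only: wait for the docker.service start job to finish first.
After=docker.service
# Weak dependency: pulls docker.service in, but a failed docker start
# does not fail the kubelet start job (unlike Requires=).
Wants=docker.service
# Assumption: propagate explicit stop/restart of docker.service to kubelet.
PartOf=docker.service

[Service]
WorkingDirectory=/var/lib/kubelet
EnvironmentFile=-/etc/kubernetes/kubelet
ExecStart=/usr/local/bin/kubelet ...
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```

After editing the unit file, run systemctl daemon-reload so systemd picks up the change. The trade-off is that kubelet may fail and be restarted every 10 seconds until Docker is actually up.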
