How to make a service fail so that SERVICE_RESULT contains "start-limit-hit"


I have a systemd service that runs a script in ExecStopPost:

[Unit]
Description=foobar-test
StartLimitBurst=3
StartLimitIntervalSec=120

[Service]
Type=simple
ExecStart=/do_something.sh
ExecStopPost=/handle_stop.sh $SERVICE_RESULT
Restart=on-failure
RestartSec=1

[Install]
...

The script "do_something.sh" simply exits with return code 1, so the service keeps being restarted on failure until the start limit is reached.

The script "handle_stop.sh" simply prints the contents of $SERVICE_RESULT.
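For reference, a minimal sketch of what such a handler could look like. The function name `describe_result` and the message texts are hypothetical and not taken from the original script. It assumes, as documented in systemd.exec(5), that $SERVICE_RESULT is exported to ExecStopPost= commands, and it also accepts the value as the first argument, as the unit above passes it.

```shell
#!/bin/sh
# handle_stop.sh: hypothetical sketch of the ExecStopPost= helper.

describe_result() {
    # Prefer the explicit argument, fall back to the environment variable.
    result="${1:-${SERVICE_RESULT:-unknown}}"
    case "$result" in
        start-limit-hit) echo "result=$result (start limit reached)" ;;
        exit-code)       echo "result=$result (main process exited non-zero)" ;;
        *)               echo "result=$result" ;;
    esac
}

describe_result "$@"
```

Since ExecStopPost= output goes to the journal by default, the printed line should show up next to the unit's other log messages.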

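When the service reaches the start limit, $SERVICE_RESULT contains "exit-code" rather than "start-limit-hit", which is what I expected from the documentation.

Am I doing something wrong, or is this a bug in systemd?

systemd version: systemd 244 (244.5+) +PAM -AUDIT -SELINUX +IMA -APPARMOR -SMACK +SYSVINIT +UTMP -LIBCRYPTSETUP -GCRYPT -GNUTLS +ACL +XZ -LZ4 -SECCOMP +BLKID -ELFUTILS +KMOD -IDN2 -IDN -PCRE2 default-hierarchy=hybrid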

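Answer 1

man 5 systemd.exec says:

"start-limit-hit": A start limit was defined for the unit and it was hit, causing the unit to fail to start.

man 5 systemd.unit says:

Note that units which are configured for Restart= and which reach the start limit are not attempted to be restarted anymore; however, they may still be restarted manually or from a timer or socket at a later point, after the interval has passed.

You could argue that systemd is behaving as documented: start-limit-hit only occurs when a unit fails to start, but Restart= never attempts to (re)start a unit that has already been started StartLimitBurst= times within StartLimitIntervalSec=.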
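The interplay of StartLimitBurst= and StartLimitIntervalSec= can be illustrated with a small model. This is a simplified sketch of a "at most N starts per window" rate limit, not systemd's actual implementation; the function name `simulate_start_limit` and the timestamps are made up for illustration.

```shell
#!/bin/sh
# Simplified model: allow at most BURST starts per INTERVAL seconds.
# Usage: simulate_start_limit BURST INTERVAL T1 T2 ... (timestamps in seconds, ascending)
simulate_start_limit() {
    burst=$1; interval=$2; shift 2
    window_start=""; count=0
    for t in "$@"; do
        # Open a new window on the first start or once the interval has elapsed.
        if [ -z "$window_start" ] || [ $((t - window_start)) -ge "$interval" ]; then
            window_start=$t; count=0
        fi
        if [ "$count" -lt "$burst" ]; then
            count=$((count + 1))
            echo "t=$t allowed"
        else
            echo "t=$t refused"
        fi
    done
}

# Values from the question: StartLimitBurst=3, StartLimitIntervalSec=120.
simulate_start_limit 3 120 0 1 2 3 130
```

In this model the first three starts are allowed, the fourth is refused, and a start after the window has elapsed is allowed again, which matches the "after the interval has passed" wording in the man page.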

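Still, I put some effort into trying to make this result appear.

First I tried triggering the burst and then starting the unit manually with another systemctl start. The hypothesis was that the unit is never started after hitting the start limit because systemd itself does not restart units past that limit. If that were true, then in this experiment the unit would fail to start when explicitly told to, and would report the desired result.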

$ systemctl --user cat startlimit.service 
# ~/.config/systemd/user/startlimit.service
[Service]
ExecStart=/bin/false
Restart=always
StartLimitIntervalSec=20

$ systemctl --user start startlimit.service
$ systemctl --user start startlimit.service
Job for startlimit.service failed because the control process exited with error code.
See "systemctl --user status startlimit.service" and "journalctl --user -xeu startlimit.service" for details.

$ systemctl --user status startlimit.service
× startlimit.service
     Loaded: loaded (/home/stew/.config/systemd/user/startlimit.service; static)
     Active: failed (Result: exit-code) since Mon 2022-10-24 18:10:11 CEST; 5s ago
   Duration: 1ms
    Process: 5008 ExecStart=/bin/false (code=exited, status=1/FAILURE)
   Main PID: 5008 (code=exited, status=1/FAILURE)
        CPU: 1ms

Oct 24 18:10:11 systemd[1129]: startlimit.service: Scheduled restart job, restart counter is at 5.
Oct 24 18:10:11 systemd[1129]: Stopped startlimit.service.
Oct 24 18:10:11 systemd[1129]: startlimit.service: Start request repeated too quickly.
Oct 24 18:10:11 systemd[1129]: startlimit.service: Failed with result 'exit-code'.
Oct 24 18:10:11 systemd[1129]: Failed to start startlimit.service.
Oct 24 18:10:12 systemd[1129]: startlimit.service: Start request repeated too quickly.
Oct 24 18:10:12 systemd[1129]: startlimit.service: Failed with result 'exit-code'.
Oct 24 18:10:12 systemd[1129]: Failed to start startlimit.service.

That hypothesis turned out to be wrong.
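To check the recorded result without reading the whole status output, the journal lines can be filtered for the "Failed with result" message. A minimal sketch, assuming the message format shown in the log excerpt above; the helper name `extract_results` is hypothetical.

```shell
#!/bin/sh
# Print only the result name from lines such as
#   startlimit.service: Failed with result 'exit-code'.
extract_results() {
    sed -n "s/.*Failed with result '\([^']*\)'.*/\1/p"
}

# Sample lines copied from the status output above.
extract_results <<'EOF'
Oct 24 18:10:11 systemd[1129]: startlimit.service: Start request repeated too quickly.
Oct 24 18:10:11 systemd[1129]: startlimit.service: Failed with result 'exit-code'.
Oct 24 18:10:11 systemd[1129]: Failed to start startlimit.service.
EOF
```

On a live system the same filter could be fed from `journalctl --user -u startlimit.service`.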

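The next guess was "maybe systemctl simply refuses to start a unit that is past the limit" and "maybe this is something systemd needs to trigger on its own". To test that, I created another dummy oneshot service that Wants= the service above, then triggered it a few times.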

$ systemctl --user cat startlimit{,Wanter.service}
# ~/.config/systemd/user/startlimit.service
[Service]
ExecStart=/bin/false
Restart=always
StartLimitIntervalSec=20

[Install]
WantedBy=startlimitWanter.service

# ~/.config/systemd/user/startlimitWanter.service
[Service]
Type=oneshot
ExecStart=/bin/true

$ systemctl --user start startlimitWanter.service
$ systemctl --user start startlimitWanter.service
$ systemctl --user start startlimitWanter.service
$ systemctl --user status startlimit{,Wanter.service}
× startlimit.service
     Loaded: loaded (~/.config/systemd/user/startlimit.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Mon 2022-10-24 18:13:49 CEST; 7s ago
   Duration: 1ms
    Process: 5066 ExecStart=/bin/false (code=exited, status=1/FAILURE)
   Main PID: 5066 (code=exited, status=1/FAILURE)
        CPU: 1ms


Oct 24 18:13:50 systemd[1129]: Failed to start startlimit.service.
Oct 24 18:13:52 systemd[1129]: startlimit.service: Start request repeated too quickly.
Oct 24 18:13:52 systemd[1129]: startlimit.service: Failed with result 'exit-code'.
Oct 24 18:13:52 systemd[1129]: Failed to start startlimit.service.

○ startlimitWanter.service
     Loaded: loaded (~/.config/systemd/user/startlimitWanter.service; static)
     Active: inactive (dead)

Oct 24 18:13:48 systemd[1129]: Starting startlimitWanter.service...
Oct 24 18:13:48 systemd[1129]: Finished startlimitWanter.service.
Oct 24 18:13:50 systemd[1129]: Starting startlimitWanter.service...
Oct 24 18:13:50 systemd[1129]: Finished startlimitWanter.service.
Oct 24 18:13:52 systemd[1129]: Starting startlimitWanter.service...
Oct 24 18:13:52 systemd[1129]: Finished startlimitWanter.service.

That did not produce the result you are after either.

At this point, I'm inclined to agree that you have stumbled upon a bug.
