Keep systemd service active when ExecStop fails


When the script in ExecStop=/usr/lib/pick/d3shutdown_custom pick0 fails, I want systemd to abort the stop and return to the active state.

When the script exits with code 4 (insufficient user permissions), systemd gets stuck in the deactivating state.

Here is my systemd service file:

[Unit]
Description=D3 multivalue database
After=d3_sn.service rslm.service
Requires=rslm.service
Wants=d3_sn.service

[Service]
Type=forking
ExecStart=/usr/bin/d3 -n pick0 -s -a x
ExecStop=/usr/lib/pick/d3shutdown_custom pick0
TimeoutSec=0
RemainAfterExit=true
SuccessExitStatus=11

[Install]
WantedBy=multi-user.target

Here is a snippet of the d3shutdown_custom script:


log_error() {
    echo -e "Error: $1"
    echo "Aborting program ... "
    echo " "
}

if [ ! -f "$D3_PROPERTIES_FILE" ]; then log_error "File $D3_PROPERTIES_FILE not found."; exit 2; fi
source "$D3_PROPERTIES_FILE"
export PICKUSER
export PICKUPASS

if d3 -q | grep -q 'not running'; then
        exit 11
fi

D3TCL=$(/usr/bin/d3tcl 'listu')
if echo $D3TCL | grep -q 'Bad User or User Password'; then log_error "Bad password"; exit 4; fi

Answer 1

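This is a somewhat tricky request. When you systemctl stop something, it should always go down. The stop command may fail, but systemd will still stop the unit eventually, even if it has to send FinalKillSignal= (SIGKILL) after TimeoutStopSec= (90s).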

If a failing condition could block the stop, then things like "shutdown" or "reboot" would become impossible, and those should always be possible.

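I think the best way to achieve this is with a unit that declares Conflicts=, plus another unit that conditionally starts the conflicting unit through OnFailure=. This means you need to change your thinking from systemctl stop my-unit to systemctl start maybe-kill-my-unit.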

This is how shutdown works. We don't stop the other units; we systemctl start shutdown.target.

You can further enforce the use of c.service by setting RefuseManualStop=yes, which prevents people from running systemctl stop a.service directly.

a.service
[Unit]
Description=Long running service
RefuseManualStop=yes

[Service]
ExecStart=/bin/sleep 60

b.service
[Unit]
Description=Killer of the long running service
Conflicts=a.service

[Service]
Type=oneshot
ExecStart=/bin/true

c.service
[Unit]
Description=Conditional killer of the long runing service
OnFailure=b.service

[Service]
Type=oneshot
ExecStart=/usr/local/bin/myscript

Here is a real example (where c fails):

$ systemctl --user start a
$ systemctl --user start c 
Job for c.service failed because the control process exited with error code.

$ systemctl --user status a b c --lines 0
● a.service
     Loaded: loaded (.config/systemd/user/a.service; static)
     Active: inactive (dead) since Mon 2023-10-09 16:10:18 CEST; 3s ago

● b.service
     Loaded: loaded (.config/systemd/user/b.service; static)
     Active: inactive (dead) since Mon 2023-10-09 16:10:18 CEST; 3s ago

● c.service
     Loaded: loaded (.config/systemd/user/c.service; static)
     Active: failed (Result: exit-code) since Mon 2023-10-09 16:10:18 CEST; 3s ago

$ journalctl --user -u a -u b -u c --since "1 minute ago"
Oct 09 16:10:14: Long running service
Oct 09 16:10:18: Starting Conditional killer of the long runing service
Oct 09 16:10:18: c.service: Main process exited, code=exited, status=1/FAILURE
Oct 09 16:10:18: c.service: Triggering OnFailure= dependencies.
Oct 09 16:10:18: Starting Killer of the long running service...
Oct 09 16:10:18: Stopping Long running service...
Oct 09 16:10:18: Finished Killer of the long running service.

Here is an example where c succeeds:

$ systemctl --user start a
$ systemctl --user start c
$ systemctl --user status a b c --lines 0
● a.service
     Loaded: loaded (.config/systemd/user/a.service; static)
     Active: active (running) since Mon 2023-10-09 16:14:45 CEST; 9s ago

● b.service
     Loaded: loaded (.config/systemd/user/b.service; static)
     Active: inactive (dead)

● c.service
     Loaded: loaded (.config/systemd/user/c.service; static)
     Active: inactive (dead)

$ journalctl --user -u a -u b -u c --since "1 minute ago"
Oct 09 16:14:45: Started Long running service.
Oct 09 16:14:50: Starting Conditional killer of the long runing service...
Oct 09 16:14:50: Finished Conditional killer of the long runing service.

In this example, a is stopped if c fails. If your script's logic is inverted (a should be stopped if c succeeds), you can play with SuccessExitStatus=.
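A different way to get the same inversion, without touching SuccessExitStatus=, is a small wrapper that flips the exit status of the real check, so that ExecStart= in c.service fails exactly when the check passes. The following is only a sketch under that assumption; invert_status is a hypothetical helper name that does not appear in the original setup.

```shell
#!/bin/sh
# Hypothetical helper: run a command and flip its exit status.
# Returns 1 when the command succeeds and 0 when it fails, so a unit
# that uses it as ExecStart= "fails" (and fires OnFailure=) exactly
# when the wrapped check passes.
invert_status() {
    if "$@"; then
        return 1
    fi
    return 0
}

# Illustrative calls with the standard true/false utilities:
invert_status true  || echo "check passed, so the wrapper reports failure"
invert_status false && echo "check failed, so the wrapper reports success"
```

In c.service, the wrapper would take the place of /usr/local/bin/myscript and call the real check through invert_status.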


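If you have a recent version of systemd (ExecCondition= arrived in systemd 243 and OnSuccess= in 249), you can make this more elegant by combining ExecCondition= with OnSuccess=. This avoids leaving any unit in a failed state, avoids polluting the global "degraded" state in systemctl status, and avoids leaving error messages on stderr. To do this, have ExecCondition= run your script. If it succeeds, ExecStart= runs and succeeds, and OnSuccess= triggers b.service, which kills a.service. If ExecCondition= fails, ExecStart= never runs and OnSuccess= never triggers, so a.service lives on. "Fails" here means an exit code from 1 to 254; an exit code of 255 or an abnormal exit marks the unit as failed.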

c.service
[Unit]
OnSuccess=b.service

[Service]
Type=oneshot
ExecCondition=/your-script
ExecStart=/bin/true
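
The script behind ExecCondition= could reuse the credential check from the question. Here is a minimal sketch, assuming the decision can be made from the text that /usr/bin/d3tcl 'listu' prints. The function name credentials_ok is hypothetical, and the exit code 4 simply mirrors the question's script.

```shell
#!/bin/sh
# Hypothetical pre-check for ExecCondition=: decide from the output of
# d3tcl 'listu' (passed in as $1) whether the stop may proceed.
# Returns 0 when the credentials were accepted and 4 otherwise. A value
# in the range 1..254 makes systemd skip ExecStart= without marking
# the unit as failed.
credentials_ok() {
    case $1 in
        *'Bad User or User Password'*)
            echo "Error: Bad password" >&2
            return 4
            ;;
    esac
    return 0
}

# In the real script (assumption: d3tcl is installed as in the question):
#   credentials_ok "$(/usr/bin/d3tcl 'listu')" || exit $?
```

Keeping the decision in a function that takes the command output as an argument makes it possible to try the logic on sample text without a running D3 instance.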
