Trigger Service OnFailure only after the burst limit is reached

Question

To use the systemd service the way you want, you should do several things (the changes are made in /etc/systemd/system/python-test.service).

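Change Restart=always to Restart=on-failure.
The values StartLimitInterval=600 and StartLimitBurst=5 still seem to be supported, but you should put them in [Unit]. If you put StartLimitInterval in [Unit], you can rename it to StartLimitIntervalSec (man systemd.unit uses StartLimitIntervalSec instead).
Add RemainAfterExit=no to the [Service] section.
Add this line to the [Service] section: TimeoutStopSec=infinity
Use the environment variable EXIT_STATUS in a script to determine whether your script exited successfully.
Change OnFailure=mailer@%n.service to OnFailure=mailer@%N.service. The difference between the two is that %N removes the type suffix.
Install and start the atd service (sudo systemctl start atd.service) to be able to use the at command. Alternatively, if you do not want to use at, you can write another systemd service to relaunch the service (in this example I used relaunch.service).
Use the same value for sleep as for RestartSec. In your case, since RestartSec is 60, the sleep in this line must be 60 too: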

 echo "sleep 60; sudo systemctl start ${1}.service" | at now

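Use ExecStart= and ExecStopPost= to get the exit status of your main process, /home/debian/tmp.py. Do not use ExecStop=. From man systemd.service:

ExecStop=

Note that the commands specified in ExecStop= are only executed when the service started successfully first. They are not invoked if the service was never started at all, or in case its start-up failed, for example because any of the commands specified in ExecStart=, ExecStartPre= or ExecStartPost= failed (and weren't prefixed with "-", see above) or timed out. Use ExecStopPost= to invoke commands when a service failed to start up correctly and is shut down again.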
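
To illustrate the difference between %n and %N mentioned in the list above, here is a minimal bash sketch. The function name strip_unit_suffix is hypothetical; it only mimics the suffix removal with shell parameter expansion and is not how systemd implements specifiers.

```shell
#!/bin/bash
# Hypothetical helper: mimics what %N does to a full unit name.
strip_unit_suffix() {
    local full_name="$1"
    # Remove the trailing ".<type>" component, e.g. ".service"
    printf '%s\n' "${full_name%.*}"
}

full_name="python-test.service"                  # what %n expands to
short_name="$(strip_unit_suffix "$full_name")"   # what %N expands to

echo "mailer@${full_name}.service"    # prints mailer@python-test.service.service
echo "mailer@${short_name}.service"   # prints mailer@python-test.service
```

With %n the suffix ends up in the instance name twice, which is why the OnFailure= line uses %N.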

The service file /etc/systemd/system/python-test.service should be:

[Unit]
After=network.target
OnFailure=mailer@%N.service

StartLimitBurst=5
StartLimitIntervalSec=600
 
[Service]  
Type=simple 
TimeoutStopSec=infinity
ExecStart=/home/debian/tmp.py
ExecStopPost=/bin/bash -c 'echo The Service %n has exited with values: $$EXIT_STATUS,$$SERVICE_RESULT,$$EXIT_CODE'
ExecStopPost=/home/debian/bin/checkSuccess "%N"
# Any exit status different than 0 is considered as an error
SuccessExitStatus=0
StandardOutput=append:/tmp/python-out-test.log
StandardError=append:/tmp/python-err-test.log
# Restart the service 60 seconds after a failed exit
Restart=on-failure
RestartSec=60
RemainAfterExit=no

[Install]
WantedBy=multi-user.target
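
As a rough check of how RestartSec, StartLimitBurst and StartLimitIntervalSec interact, the following bash sketch estimates when the start attempt after the allowed burst would happen. It assumes that systemd counts start attempts in a window that opens with the first start, and that tmp.py fails after a fixed run time. The function name start_limit_hit and the run_time parameter are hypothetical.

```shell
#!/bin/bash
# Hypothetical estimate: succeeds (returns 0) if the attempt that follows the
# allowed burst would still fall inside the start-limit interval.
start_limit_hit() {
    local restart_sec="$1" burst="$2" interval="$3" run_time="$4"
    # Time of start attempt number (burst + 1), counted from the first start
    local next_attempt=$(( burst * (restart_sec + run_time) ))
    echo "attempt $(( burst + 1 )) would happen at about ${next_attempt}s"
    [ "$next_attempt" -le "$interval" ]
}

# Values from the unit file above; run_time=0 assumes tmp.py fails immediately
if start_limit_hit 60 5 600 0; then
    echo "start limit would be reached"
else
    echo "start limit would not be reached"
fi
```

Under these assumptions the sixth attempt falls at about 300 seconds, inside the 600 second interval, so the limit would be hit. If the script ran for 100 seconds before each failure, the attempts would be spread over about 800 seconds and the limit would not be reached.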

And /home/debian/bin/checkSuccess should contain one of the following:

Solution 1: using the at command:

#!/bin/bash

if [ "$EXIT_STATUS" -eq 0 ]
then
   echo "sleep 60; sudo systemctl start ${1}.service" | at now
   exit 0
else
   systemctl start "mailer@${1}.service"
   exit 0
fi

Solution 2: using another systemd service:

#!/bin/bash

if [ "$EXIT_STATUS" -eq 0 ]
then
   systemctl start relaunch.service
else
   systemctl start "mailer@${1}.service"
fi
exit 0

And relaunch.service should contain:

[Unit]
Description=Relaunch Python Test Service

[Service]
Type=simple
RemainAfterExit=no 
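ExecStart=/bin/bash -c 'echo Delay; sleep 10 ; systemctl start python-test.service'

The variable "$EXIT_STATUS" is set by systemd for the service and is determined by the exit status of /home/debian/tmp.py.

${1} stands for the name of the unit, python-test, and is passed to the script in the line /home/debian/bin/checkSuccess "%N".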
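
The way checkSuccess combines these two inputs can be tried without systemd using a dry-run sketch like the one below. The function choose_action is hypothetical and only prints the action instead of calling systemctl. It also compares the status as a string, on the assumption (per man systemd.exec) that EXIT_STATUS holds a signal name such as KILL when the process did not exit normally.

```shell
#!/bin/bash
# Hypothetical dry-run helper: prints the action instead of calling systemctl.
choose_action() {
    local unit="$1"     # unit name without suffix, as passed via "%N"
    local status="$2"   # value systemd would put in $EXIT_STATUS
    # Only a literal 0 counts as success; any other number or a signal
    # name is treated as a failure.
    if [ "$status" = "0" ]; then
        echo "relaunch ${unit}.service"
    else
        echo "start mailer@${unit}.service"
    fi
}

choose_action python-test 0      # prints relaunch python-test.service
choose_action python-test 1      # prints start mailer@python-test.service
choose_action python-test KILL   # prints start mailer@python-test.service
```

A string comparison is used here because [ "$status" -eq 0 ] prints an error when the value is not an integer.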

Notes:

The ExecStopPost= line 'echo The Service %n has exited with values: $$EXIT_STATUS,$$SERVICE_RESULT,$$EXIT_CODE' writes to the output log. You can check the log in real time with the following command:

tail -f /tmp/python-out-test.log

If you use solution 2 (with relaunch.service), you should run the following when you want to stop the main service:

sudo systemctl stop relaunch.service
# Might not be necessary, but you can stop the Python service too:
# sudo systemctl stop python-test.service

Answer 1