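This behavior confuses me. I send a termination signal to my script while the aws s3 sync command is running, and although I handle SIGTERM, the ERR trap is also triggered by the aws command, and I don't understand why. Even more confusing, the command throws the error and then carries on:
Script: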
#! /bin/bash
trap 'echo GOT ERROR, exiting' ERR
trap 'echo GOT SIGTERM!' SIGTERM
while true; do
date +%F_%T
aws s3 cp /vagrant/audio/ s3://testarchive/tester/ --recursive
sleep 1
done
Command used to run the script:
timeout 5s ./tester.sh
Output:
upload: ../../vagrant/audio/2019-09-16/3/35322118-8264-406B-961B-EAF1FE7A34EF.wav to s3://testarchive/tester/2019-09-16/3/35322118-8264-406B-961B-EAF1FE7A34EF.wav
upload: ../../vagrant/audio/2019-09-16/1/165BD3D0-773A-4591-A43E-D67810716066.wav to s3://testarchive/tester/2019-09-16/1/165BD3D0-773A-4591-A43E-D67810716066.wav
upload: ../../vagrant/audio/2019-09-16/2/2A9559BB-168A-47D2-943A-A51B7885233B.wav to s3://testarchive/tester/2019-09-16/2/2A9559BB-168A-47D2-943A-A51B7885233B.wav
Terminated6.8 MiB/123.1 MiB (1.5 MiB/s) with 422 file(s) remaining
GOT ERROR, exiting
GOT SIGTERM!
2020-01-17_21:05:40
upload: ../../vagrant/audio/2019-09-16/0/07502A17-9304-4995-94E1-A1B0D439EEE7.wav to s3://testarchive/tester/2019-09-16/0/07502A17-9304-4995-94E1-A1B0D439EEE7.wav
upload: ../../vagrant/audio/2019-09-16/0/05E4C765-C2FA-4EC0-9803-8FF02C0FEDDE.wav to s3://testarchive/tester/2019-09-16/0/05E4C765-C2FA-4EC0-9803-8FF02C0FEDDE.wav
upload: ../../vagrant/audio/2019-09-
Edit #2:
29 1 * * * root strace -e trace=kill timeout --foreground 6 /home/vagrant/tester.sh &> /home/vagrant/tester.log
#! /bin/bash
trap 'echo GOT ERROR..' ERR
trap 'echo GOT SIGTERM! && set_terminate_flag' SIGTERM
terminate_flag=false
function set_terminate_flag {
terminate_flag=true
}
while true; do
if [ "$terminate_flag" = true ]; then
echo OMG IT WORKS!
exit 0
fi
date +%F_%T
aws s3 cp /vagrant/audio/ s3://testarchive/tester/ --recursive
echo LOOP IS Done, begin sleep
done
Output:
...(skip output, 6 seconds have passed!!!)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_value={int=14831664, ptr=0xe25030}} ---
kill(9432, SIGTERM) = 0
kill(9432, SIGCONT) = 0
...(skip output)
upload: ../vagrant/audio/2020-01-01/E7914F83-8A89-4679-ABBC-8DB261D13349-01.wav to s3://testarchive/tester/2020-01-01/E7914F83-8A89-4679-ABBC-8DB261D13349-01.wav
GOT SIGTERM!
LOOP IS Done, begin sleep
OMG IT WORKS!
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=9432, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
+++ exited with 124 +++
Answer 1
trap 'echo GOT ERROR, exiting' ERR
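Simply saying "exiting" doesn't make it so ;-)
The ERR trap is executed whenever a command fails, whether or not the script exits immediately afterwards (e.g. because of set -e):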
$ bash -c 'trap "echo error, not exiting yet" ERR; false; echo DONE'
error, not exiting yet
DONE
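As a small sketch of the difference (not part of the original answer): if the trap should really end the script, it has to call exit itself. The `|| status=$?` idiom below is just one way to capture the inner shell's exit status for display.

```shell
#!/bin/bash
# Sketch: an ERR trap only ends the script if the trap itself calls exit.
status=0
result=$(bash -c 'trap "echo error, exiting; exit 1" ERR; false; echo DONE') || status=$?
echo "output: $result"   # DONE is never printed, because the trap exited first
echo "status: $status"
```

Here the inner shell prints only "error, exiting" and stops with status 1, instead of continuing on to echo DONE.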
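In your case, the failing command could be date or aws, but most likely it is sleep (which is an external command, not a builtin). sleep exits with a non-zero status (= failure, which triggers the ERR trap) because it too is killed by the signal sent by timeout: timeout first sends the signal to its child, and then to the whole process group it is part of: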
$ strace -e trace=kill timeout 1s bash -c 'echo $$; while :; do sleep 3600; done
'
4851
...
kill(4851, SIGTERM) = 0
kill(0, SIGTERM) = 0
...
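To illustrate why a command killed by that signal counts as a failure: a process terminated by SIGTERM is reported to the shell with exit status 128 + 15 = 143, which is non-zero and would therefore fire an ERR trap. A minimal sketch, using a background sleep as a stand-in for whatever command timeout happens to kill:

```shell
#!/bin/bash
# Sketch: a command killed by SIGTERM is seen by the shell as a failure.
sleep 30 &                  # stand-in for the command that timeout signals
pid=$!
kill -TERM "$pid"           # what timeout does to the members of the process group
status=0
wait "$pid" || status=$?    # collect the exit status of the killed command
echo "exit status: $status" # 128 + 15 (SIGTERM)
```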
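The shell does not run any trap until after the foreground command it is waiting for has exited; if timeout does not signal the whole process group (which can be arranged with the --foreground option), the foreground command may not exit, and the trap may not run: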
$ timeout 1s bash -c 'trap "echo TERM caught" TERM; sleep 36000; echo DONE'
Terminated
TERM caught
DONE
$ timeout --foreground 1s bash -c 'trap "echo TERM caught" TERM; sleep 36000; echo DONE'
<wait and wait>
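One common way around this, offered here as a sketch rather than as part of the original answer, is to start the long-running command in the background and block in the wait builtin instead. Bash interrupts wait as soon as a trapped signal arrives, so the trap runs straight away even with --foreground. The on_term function and the child variable are hypothetical names, and sleep 30 stands in for the real command.

```shell
#!/bin/bash
# Sketch: a trapped SIGTERM interrupts `wait`, so the trap is not deferred
# until the long-running command has finished.
out=$(timeout --foreground 1s bash -c '
    on_term() {
        echo "TERM caught"
        kill "$child" 2>/dev/null   # pass the termination on to the child
        exit 0
    }
    trap on_term TERM
    sleep 30 &                      # stand-in for the long-running command
    child=$!
    wait "$child"                   # returns early when SIGTERM is trapped
    echo DONE
') || true                          # timeout itself reports 124 on expiry
echo "$out"
```

Because the trap calls exit, DONE is not printed; the trap also has to terminate the child explicitly, since with --foreground nobody else will signal it.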