busybox crond avoiding concurrent processes for the first 10 minutes?

I have a script that cron runs every minute in an Alpine container. The script can run for a long time, which is why it handles locking itself.

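I just noticed that busybox crond avoids executing the script during the first 10 minutes of a long-running instance, and only after that does it resume the expected schedule. I verified this by logging from the script; cron also logs "crond: user abc: process already running: sync.sh" at the times when the script was expected to be triggered during that period (see below).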

Why is that? Where is this logic documented?


$ crond --help
BusyBox v1.36.1 (2023-07-27 17:12:24 UTC) multi-call binary.

Usage: crond [-fbS] [-l N] [-d N] [-L LOGFILE] [-c DIR]

    -f  Foreground
    -b  Background (default)
    -S  Log to syslog (default)
    -l N    Set log level. Most verbose 0, default 8
    -d N    Set log level, log to stderr
    -L FILE Log to FILE
    -c DIR  Cron dir. Default:/var/spool/cron/crontabs
$ cat /etc/crontabs/abc 
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=""

# m h dom mon dow     command
# ============================================================================
*/1 * * * *  sync.sh

Cron is started in the container with /usr/sbin/crond -f -l 8 -L /dev/stdout -c /etc/crontabs

#   from docker host:
$ docker logs  my-container
usermod: no changes
crond: crond (busybox 1.36.1) started, log level 8
crond: USER abc pid  27 cmd sync.sh
crond: user abc: process already running: sync.sh
crond: USER root pid 158 cmd run-parts /etc/periodic/15min
crond: user abc: process already running: sync.sh
crond: user abc: process already running: sync.sh
crond: user abc: process already running: sync.sh
crond: user abc: process already running: sync.sh
crond: user abc: process already running: sync.sh
crond: user abc: process already running: sync.sh
crond: user abc: process already running: sync.sh
crond: user abc: process already running: sync.sh

/ around here 10 minute mark is hit, and crond starts invoking sync.sh again /

crond: USER abc pid 193 cmd sync.sh
sendmail: can't connect to remote host (127.0.0.1): Connection refused
crond: USER abc pid 219 cmd sync.sh
sendmail: can't connect to remote host (127.0.0.1): Connection refused
crond: USER abc pid 245 cmd sync.sh
sendmail: can't connect to remote host (127.0.0.1): Connection refused

A stripped-down reproduction of the script:

#!/usr/bin/env bash

#####################################
readonly SELF="${0##*/}"
DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)"
JOB_ID="test-$$"
LOG="/config/${SELF}.log"

STATE_FILE=/tmp/test.state
#####################################

_prepare_locking()  { eval "exec 9>\"/tmp/test.lock\""; }
exlock_now()             { flock -xn 9; }

_log() {
    local lvl msg
    readonly lvl="$1"
    readonly msg="$2"
    echo -e "[$(date '+%F %T')] [$JOB_ID]\t$lvl  $msg" | tee -a "$LOG" >&2
    return 0
}

info() {
    _log INFO "$*"
}


update_statefile() {
    local time time_d

    _write_state() {
        echo -n "$time" > "$STATE_FILE"
    }

    time="$(date +%s)"

    if [[ -s "$STATE_FILE" ]]; then
        time_d="$((time - $(cat -- "$STATE_FILE")))"

        if [[ "$time_d" -ge 600 ]]; then
            info "$SELF has been running for at least ${time_d}s..."
            exit 1
        fi
    else
        _write_state
    fi

    info 'unable to obtain lock, process is already running'
    exit 0  # exit, not return
}


#### ENTRY ####
_prepare_locking || exit 1
exlock_now || update_statefile
info "$SELF starting up..."
sleep 20m  # block

exit 0

Edit: So far the only conclusion I can draw is that this is a quirk of busybox's cron; this comment in a related thread seems to confirm it.

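Explicitly backgrounding the process with & in the crontab makes the behavior go away. The downside is that killing the process group via kill -- -pid no longer works: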

container-hostname:/# ps -ef | grep sync.sh
   34 abc       0:00 bash /usr/local/sbin/sync.sh
 1633 root      0:00 grep sync.sh
container-hostname:/# kill -9 -- -34
bash: kill: (-34) - No such process
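
The "No such process" error means there is no process group with ID 34. A plausible reason (an assumption, not verified against the BusyBox source) is that a job backgrounded with & stays in the process group of the shell that launched it, so its PID is no longer a group ID. A minimal sketch of a workaround is to look up the real group ID first. The helper names pgid_of and kill_group_of are hypothetical, and the code is Linux-only because it reads /proc:

```shell
#!/bin/sh
# Hypothetical helpers (not part of the original script); Linux-only, reads /proc.

# Print the process-group ID of the given PID.
pgid_of() {
    [ -r "/proc/$1/stat" ] || return 1
    read -r stat_line < "/proc/$1/stat" || return 1
    # The command name (field 2) may contain spaces or parentheses, so strip
    # everything up to the last ") "; what remains is: state ppid pgrp ...
    set -- ${stat_line##*) }
    printf '%s\n' "$3"
}

# Send a signal (default TERM) to the whole process group that PID belongs to.
kill_group_of() {
    pgid=$(pgid_of "$1") || return 1
    kill -s "${2:-TERM}" -- "-$pgid"
}
```

With these loaded, something like kill_group_of 34 KILL would signal whatever group PID 34 actually belongs to. Note that this also hits any other process sharing that group.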

Answer 1

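It took me a lot of digging to find this. It appears to be intentional and specific to busybox cron, based on a comment in the source.

I believe this may be where the logic is actually implemented. It looks like it has always been this way, so I doubt it will ever change.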
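
To make the logged behavior concrete, here is a conceptual model of "skip the job while the previous instance is still alive". It is only a sketch in shell, not BusyBox code: the function name run_unless_running and the variable last_pid are made up for illustration, and it does not model why invocations resume after about 10 minutes.

```shell
#!/bin/sh
# Conceptual model only: NOT BusyBox source code.
last_pid=""

# Hypothetical helper: start "$@" in the background unless the instance
# started by the previous call still exists.
run_unless_running() {
    # kill -0 sends no signal; it only tests whether the PID still exists.
    if [ -n "$last_pid" ] && kill -0 "$last_pid" 2>/dev/null; then
        echo "process already running: $*"
        return 1
    fi
    "$@" &
    last_pid=$!
    echo "started pid $last_pid cmd $*"
}
```

Calling run_unless_running sync.sh once per minute from a loop would skip every tick that falls inside a still-running instance, which matches the repeated "process already running" lines in the log above.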

Answer 2

I can confirm this behavior on OpenWrt 23.05.3.

To verify it, use this crontab:

 * * * * * sleep 48 ; echo 48 
 * * * * * sleep 49 ; echo 49 
 * * * * * sleep 51 ; echo 51 

With logread -f you will probably see that the later commands only show up every other minute...