Using journalctl -u docker, I noticed:
May 30 10:01:43 xxx systemd[1]: Stopping Docker Application Container Engine...
...
docker specific error log in between
...
May 30 10:01:51 xxx systemd[1]: Stopped Docker Application Container Engine...
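I also checked /var/log/auth.log and found no docker-related entries for the entire week.
I found no attempts to stop the service in the shell history of root or of our shared user.
The systemd unit: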
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
BindsTo=containerd.service
After=network-online.target firewalld.service containerd.service
Wants=network-online.target
Requires=docker.socket
[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
# Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.
# Both the old, and new location are accepted by systemd 229 and up, so using the old location
# to make them work for either version of systemd.
StartLimitBurst=3
# Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.
# Both the old, and new name are accepted by systemd 230 and up, so using the old name to make
# this option work for either version of systemd.
StartLimitInterval=60s
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Comment TasksMax if your systemd version does not support it.
# Only systemd 226 and above support this option.
TasksMax=infinity
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
[Install]
WantedBy=multi-user.target
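I don't even know why it wasn't restarted. It looks as if someone stopped the service manually.
As I understand it, systemd should at least try to restart the service if it stopped because of a problem. That makes me think someone requested the stop.
How can I troubleshoot this?
Docker version 19.03.8, build afacb8b7f0. Uptime is 28 days.
We recently had a memory leak that consumed almost all available memory, but I did not see anything memory-related in the logs.
UPD: OOM killer output from /var/log/kern.log (thanks @Abhijith):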
May 30 10:01:42 compute03 kernel: [2263822.755824] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
May 30 10:01:42 compute03 kernel: [2263822.755829] [ 404] 0 404 71910 1 540672 3377 0 systemd-journal
May 30 10:01:42 compute03 kernel: [2263822.755830] [ 414] 0 414 10905 0 122880 372 -1000 systemd-udevd
May 30 10:01:42 compute03 kernel: [2263822.755831] [ 417] 0 417 24427 0 94208 55 0 lvmetad
May 30 10:01:42 compute03 kernel: [2263822.755833] [ 606] 62583 606 35484 0 184320 187 0 systemd-timesyn
May 30 10:01:42 compute03 kernel: [2263822.755834] [ 655] 100 655 18265 0 167936 385 0 systemd-network
May 30 10:01:42 compute03 kernel: [2263822.755835] [ 678] 101 678 17693 0 184320 200 0 systemd-resolve
May 30 10:01:42 compute03 kernel: [2263822.755836] [ 890] 0 890 27604 20 118784 64 0 irqbalance
May 30 10:01:42 compute03 kernel: [2263822.755837] [ 898] 0 898 17670 0 184320 218 0 systemd-logind
May 30 10:01:42 compute03 kernel: [2263822.755838] [ 899] 0 899 169538 0 147456 219 0 lxcfs
May 30 10:01:42 compute03 kernel: [2263822.755839] [ 901] 103 901 12544 0 143360 199 -900 dbus-daemon
May 30 10:01:42 compute03 kernel: [2263822.755840] [ 905] 0 905 7507 0 102400 72 0 cron
May 30 10:01:42 compute03 kernel: [2263822.755841] [ 907] 0 907 7083 0 106496 58 0 atd
May 30 10:01:42 compute03 kernel: [2263822.755842] [ 908] 0 908 71588 0 192512 260 0 accounts-daemon
May 30 10:01:42 compute03 kernel: [2263822.755843] [ 909] 102 909 65758 0 172032 461 0 rsyslogd
May 30 10:01:42 compute03 kernel: [2263822.755844] [ 916] 0 916 42372 0 233472 2022 0 networkd-dispat
May 30 10:01:42 compute03 kernel: [2263822.755845] [ 921] 0 921 301259 0 348160 6201 0 containerd
May 30 10:01:42 compute03 kernel: [2263822.755846] [ 923] 112 923 26804 0 233472 291 0 zabbix_agentd
May 30 10:01:42 compute03 kernel: [2263822.755847] [ 929] 0 929 46488 0 262144 2000 0 unattended-upgr
May 30 10:01:42 compute03 kernel: [2263822.755848] [ 931] 0 931 300744 120 495616 12158 -500 dockerd
May 30 10:01:42 compute03 kernel: [2263822.755849] [ 944] 112 944 28924 1 262144 307 0 zabbix_agentd
May 30 10:01:42 compute03 kernel: [2263822.755850] [ 945] 112 945 29478 11 270336 357 0 zabbix_agentd
May 30 10:01:42 compute03 kernel: [2263822.755852] [ 946] 112 946 29478 0 270336 369 0 zabbix_agentd
May 30 10:01:42 compute03 kernel: [2263822.755853] [ 947] 112 947 29478 12 270336 355 0 zabbix_agentd
May 30 10:01:42 compute03 kernel: [2263822.755854] [ 952] 0 952 3666 0 73728 38 0 agetty
May 30 10:01:42 compute03 kernel: [2263822.755856] [ 954] 112 954 27903 0 258048 360 0 zabbix_agentd
May 30 10:01:42 compute03 kernel: [2263822.755857] [ 958] 0 958 3722 0 77824 36 0 agetty
May 30 10:01:42 compute03 kernel: [2263822.755858] [ 960] 0 960 18075 1 188416 191 -1000 sshd
May 30 10:01:42 compute03 kernel: [2263822.755859] [ 961] 0 961 72221 0 212992 274 0 polkitd
May 30 10:01:42 compute03 kernel: [2263822.755860] [ 6213] 1000 6213 19225 0 196608 346 0 systemd
May 30 10:01:42 compute03 kernel: [2263822.755861] [ 6214] 1000 6214 27956 0 245760 614 0 (sd-pam)
May 30 10:01:42 compute03 kernel: [2263822.755862] [ 6307] 1000 6307 63356 313 385024 12640 0 service
May 30 10:01:42 compute03 kernel: [2263822.755863] [ 3600] 0 3600 26925 0 65536 265 -999 containerd-shim
May 30 10:01:42 compute03 kernel: [2263822.755864] [ 3628] 999 3628 818153 332342 6262784 394513 0 python
May 30 10:01:42 compute03 kernel: [2263822.755865] [ 3703] 0 3703 26925 0 73728 271 -999 containerd-shim
May 30 10:01:42 compute03 kernel: [2263822.755875] [ 3732] 999 3732 818151 288134 6258688 438719 0 python
May 30 10:01:42 compute03 kernel: [2263822.755876] [ 4172] 0 4172 26925 0 73728 271 -999 containerd-shim
May 30 10:01:42 compute03 kernel: [2263822.755878] [ 4196] 999 4196 324489 77683 2314240 156754 0 python
May 30 10:01:42 compute03 kernel: [2263822.755879] [ 4332] 0 4332 27277 0 77824 318 -999 containerd-shim
May 30 10:01:42 compute03 kernel: [2263822.755880] [ 4362] 999 4362 286331 192099 2007040 4441 0 python
May 30 10:01:42 compute03 kernel: [2263822.755881] [ 4431] 0 4431 26925 0 73728 243 -999 containerd-shim
May 30 10:01:42 compute03 kernel: [2263822.755882] [ 4460] 999 4460 152545 57219 913408 5807 0 python
May 30 10:01:42 compute03 kernel: [2263822.755883] [ 4515] 1000 4515 354203 0 565248 13231 0 service
May 30 10:01:42 compute03 kernel: [2263822.755884] Out of memory: Kill process 3628 (python) score 353 or sacrifice child
May 30 10:01:42 compute03 kernel: [2263822.757606] Killed process 3628 (python) total-vm:3272612kB, anon-rss:1329368kB, file-rss:0kB, shmem-rss:0kB
May 30 10:01:42 compute03 kernel: [2263822.899423] oom_reaper: reaped process 3628 (python), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
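As far as I can see, dockerd has an oom_score_adj of -500, but in the next OOM attempt (30 minutes later) docker no longer appears in the process table at all.
Everything I found earlier by searching the logs for the word docker was info-level only. There were no errors beforehand.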
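To make the OOM process table above easier to read, here is a minimal Python sketch that parses its rows and ranks processes by resident memory. The names `ROW`, `parse_row`, `top_by_rss`, and `PAGE_KIB` are made up for illustration, and the 4 KiB page size is an assumption (typical on x86-64, but not stated in the log). Under that assumption, the rss of 332342 pages for PID 3628 works out to 1329368 kB, which matches the anon-rss value in the "Killed process 3628" line.

```python
import re

# Assumption: 4 KiB memory pages (common on x86-64; not stated in the log).
PAGE_KIB = 4

# One row of the kernel's OOM process table, e.g.
# "... kernel: [2263822.755848] [ 931] 0 931 300744 120 495616 12158 -500 dockerd"
# The "[2263822.755848]" timestamp is skipped because it contains a dot.
ROW = re.compile(
    r"\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+(?P<total_vm>\d+)\s+"
    r"(?P<rss>\d+)\s+(?P<pgtables_bytes>\d+)\s+(?P<swapents>\d+)\s+"
    r"(?P<oom_score_adj>-?\d+)\s+(?P<name>.+?)\s*$"
)


def parse_row(line):
    """Return a dict for a process-table row, or None if the line is not a row."""
    match = ROW.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {k: (v if k == "name" else int(v)) for k, v in fields.items()}


def top_by_rss(lines, n=3):
    """Return the n rows with the largest resident set size (in pages)."""
    rows = [row for row in map(parse_row, lines) if row is not None]
    return sorted(rows, key=lambda row: row["rss"], reverse=True)[:n]


# Three rows copied from the log in the question.
SAMPLE = [
    "May 30 10:01:42 compute03 kernel: [2263822.755848] [ 931] 0 931 300744 120 495616 12158 -500 dockerd",
    "May 30 10:01:42 compute03 kernel: [2263822.755864] [ 3628] 999 3628 818153 332342 6262784 394513 0 python",
    "May 30 10:01:42 compute03 kernel: [2263822.755875] [ 3732] 999 3732 818151 288134 6258688 438719 0 python",
]

if __name__ == "__main__":
    for row in top_by_rss(SAMPLE):
        rss_kib = row["rss"] * PAGE_KIB
        print(row["pid"], row["name"], rss_kib, "kB rss, oom_score_adj", row["oom_score_adj"])
```

Run against the full table, this would show the python processes holding nearly all resident memory while dockerd holds very little, which is consistent with python being picked as the OOM victim.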
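Answer 1
Have you checked whether the service was killed because of a memory problem? If the system runs out of memory, i.e. RAM and swap are full, the Linux OOM (out-of-memory) killer automatically kills processes. Run the following command: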
grep docker /var/log/kern.log
If that file is not available, check /var/log/messages instead.
This is just a hypothesis.
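Searching for the word docker alone can miss the relevant lines, because the OOM killer names the victim process, which may be a container workload rather than dockerd. As a rough sketch, the Python snippet below looks for "Killed process" lines instead. The names `KILLED`, `find_oom_kills`, and `scan_file` are hypothetical, and the pattern is based only on the message format shown in the question; other kernel versions may word these lines differently.

```python
import re

# Pattern based on the kernel message shown in the question, e.g.
# "Killed process 3628 (python) total-vm:3272612kB, anon-rss:1329368kB, ..."
# Assumption: other kernel versions keep the "Killed process PID (name)" part.
KILLED = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]*)\).*?anon-rss:(?P<anon_rss_kb>\d+)kB"
)


def find_oom_kills(lines):
    """Return (pid, name, anon_rss_kb) for every OOM kill found in the lines."""
    kills = []
    for line in lines:
        match = KILLED.search(line)
        if match:
            kills.append(
                (int(match.group("pid")), match.group("name"), int(match.group("anon_rss_kb")))
            )
    return kills


def scan_file(path="/var/log/kern.log"):
    """Scan a kernel log file; undecodable bytes are replaced, not fatal."""
    with open(path, errors="replace") as handle:
        return find_oom_kills(handle)


# Lines copied from the log in the question.
SAMPLE = [
    "May 30 10:01:42 compute03 kernel: [2263822.755884] Out of memory: Kill process 3628 (python) score 353 or sacrifice child",
    "May 30 10:01:42 compute03 kernel: [2263822.757606] Killed process 3628 (python) total-vm:3272612kB, anon-rss:1329368kB, file-rss:0kB, shmem-rss:0kB",
    "May 30 10:01:42 compute03 kernel: [2263822.899423] oom_reaper: reaped process 3628 (python), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB",
]

if __name__ == "__main__":
    for pid, name, anon_rss_kb in find_oom_kills(SAMPLE):
        print(f"OOM killer terminated {name} (pid {pid}), anon-rss {anon_rss_kb} kB")
```

Calling `scan_file()` on the affected host would list every OOM kill recorded in /var/log/kern.log. Reading that file usually requires root or membership in a log-reading group.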