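I recently set up a pihole container on Ubuntu Server 22.04 using a docker-compose.yml file with network_mode: 'host', so that traffic is routed correctly to the UDP services running inside the container. This works for anywhere from 5 minutes to a few days, after which other hosts on the local network can no longer resolve DNS. It seems to affect only UDP traffic: pihole's HTTP-based web admin interface remains reachable and keeps refreshing.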
$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.2 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.2 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
# More info at https://github.com/pi-hole/docker-pi-hole/ and https://docs.pi-hole.net/
version: "3"
services:
  pihole:
    container_name: pihole
    environment:
      BLOCKING_ENABLED: 'true'
      INSTALL_WEB_INTERFACE: 'true'
      PIHOLE_DNS_: '61.9.211.1;61.9.211.33'
      INTERFACE: 'enp2s0f0'
      QUERY_LOGGING: 'true'
      TZ: 'Australia/Brisbane'
      WEBPASSWORD: 'NoARealPassword'
    image: pihole/pihole:latest
    # # For DHCP it is recommended to remove these ports and instead add: network_mode: "host"
    # ports:
    #   - "53:53/tcp"
    #   - "53:53/udp"
    #   - "67:67/udp" # Only required if you are using Pi-hole as your DHCP server
    #   - "80:80/tcp"
    network_mode: 'host'
    cap_add:
      # Required if you are using Pi-hole as your DHCP server.
      # REF: https://github.com/pi-hole/docker-pi-hole#note-on-capabilities
      - NET_ADMIN
    # Causes the container to start after reboot - unless it was manually stopped.
    # REF: https://docs.docker.com/config/containers/start-containers-automatically/
    restart: unless-stopped
    volumes:
      - './etc-pihole:/etc/pihole'
      - './etc-dnsmasq.d:/etc/dnsmasq.d'
On a different host on the local network, I am monitoring DNS lookups with the following command:
while ( sleep 10s ); do
    date
    dig +short +retry=0 +time=1 @192.168.47.3 superuser.com
done
When a DNS lookup fails, I see this output:
;; connection timed out; no servers could be reached
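One way to check from the client side whether only UDP is affected is to run the same lookup over UDP and over TCP and log both results on one line. The following is a minimal sketch, not a tested monitoring tool: the helper names classify_dig_output and probe_once are made up for illustration, and it assumes that any dig output consisting solely of ";;" lines (or nothing at all) counts as a failure.

```shell
#!/usr/bin/env bash
# Sketch only: classify `dig +short` output and log UDP vs TCP results side by side.
# classify_dig_output and probe_once are hypothetical helper names.

# Print "FAIL" when dig returned nothing or only ";;" diagnostic lines, else "OK".
classify_dig_output() {
    local output="$1"
    # Drop ";;" diagnostic lines, then look for any remaining non-blank text.
    if printf '%s\n' "$output" | grep -v '^;;' | grep -q '[^[:space:]]'; then
        echo "OK"
    else
        echo "FAIL"
    fi
}

# Run one UDP and one TCP lookup against the server and print a single log line.
probe_once() {
    local server="$1" name="$2" udp tcp
    udp="$(dig +short +notcp +retry=0 +time=1 "@${server}" "$name" 2>&1)"
    tcp="$(dig +short +tcp +retry=0 +time=1 "@${server}" "$name" 2>&1)"
    echo "$(date '+%Y-%m-%dT%H:%M:%S%z') udp=$(classify_dig_output "$udp") tcp=$(classify_dig_output "$tcp")"
}

# Example use, every 10 seconds:
#   while sleep 10; do probe_once 192.168.47.3 superuser.com; done
```

If the log shows udp=FAIL together with tcp=OK during an outage, that would support the suspicion that only UDP is affected.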
However, I can SSH into the Docker host and perform the DNS lookup from there:
$ dig +short +retry=0 +time=1 @192.168.47.3 superuser.com
151.101.129.69
151.101.193.69
151.101.65.69
151.101.1.69
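This suggests that the Docker container and its services are at least up and running, and that Docker's UDP networking is partly working. After some searching, I came across this GitHub ticket, which describes a similar but different UDP problem: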
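It does, however, mention the conntrack tool, so I installed it with sudo apt-get install --no-install-recommends --yes conntrack and can see that UDP state is being tracked, although there are usually one or two UNREPLIED entries in the output: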
$ sudo conntrack -L -p udp
udp 17 14 src=192.168.47.3 dst=61.9.211.33 sport=48654 dport=53 src=61.9.211.33 dst=192.168.47.3 sport=53 dport=48654 use=1
udp 17 14 src=192.168.47.31 dst=192.168.47.3 sport=56688 dport=53 src=192.168.47.3 dst=192.168.47.31 sport=53 dport=56688 use=1
udp 17 15 src=127.0.0.1 dst=127.0.0.1 sport=48488 dport=48488 [UNREPLIED] src=127.0.0.1 dst=127.0.0.1 sport=48488 dport=48488 use=1
udp 17 15 src=127.0.0.1 dst=127.0.0.1 sport=53642 dport=53 src=127.0.0.1 dst=127.0.0.1 sport=53 dport=53642 use=1
udp 17 24 src=192.168.47.31 dst=192.168.47.3 sport=65131 dport=53 src=192.168.47.3 dst=192.168.47.31 sport=53 dport=65131 use=1
udp 17 4 src=192.168.47.31 dst=192.168.47.3 sport=60055 dport=53 src=192.168.47.3 dst=192.168.47.31 sport=53 dport=60055 use=1
conntrack v1.4.6 (conntrack-tools): 6 flow entries have been shown.
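When hosts on the local network stop resolving DNS, the container processes are still listening, but essentially all conntrack entries are in the UNREPLIED state: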
$ systemctl status docker.service
● docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/docker.service.d
└─override.conf
Active: active (running) since Sat 2023-06-03 00:52:22 UTC; 1h 1min ago
TriggeredBy: ● docker.socket
Docs: https://docs.docker.com
Main PID: 897 (dockerd)
Tasks: 15
Memory: 97.0M
CPU: 3.593s
CGroup: /system.slice/docker.service
└─897 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
Jun 03 00:52:15 pihole systemd[1]: Starting Docker Application Container Engine...
Jun 03 00:52:17 pihole dockerd[897]: time="2023-06-03T00:52:17.390901292Z" level=info msg="Starting up"
Jun 03 00:52:18 pihole dockerd[897]: time="2023-06-03T00:52:18.176864074Z" level=info msg="[graphdriver] using prior storage driver: overlay2"
Jun 03 00:52:18 pihole dockerd[897]: time="2023-06-03T00:52:18.431091140Z" level=info msg="Loading containers: start."
Jun 03 00:52:19 pihole dockerd[897]: time="2023-06-03T00:52:19.751389637Z" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
Jun 03 00:52:21 pihole dockerd[897]: time="2023-06-03T00:52:21.378339153Z" level=info msg="Loading containers: done."
Jun 03 00:52:21 pihole dockerd[897]: time="2023-06-03T00:52:21.696188540Z" level=info msg="Docker daemon" commit=659604f graphdriver=overlay2 version=24.0.2
Jun 03 00:52:21 pihole dockerd[897]: time="2023-06-03T00:52:21.722458798Z" level=info msg="Daemon has completed initialization"
Jun 03 00:52:22 pihole dockerd[897]: time="2023-06-03T00:52:22.047272241Z" level=info msg="API listen on /run/docker.sock"
Jun 03 00:52:22 pihole systemd[1]: Started Docker Application Container Engine.
$ sudo netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      1344/lighttpd
tcp        0      0 0.0.0.0:53              0.0.0.0:*               LISTEN      1360/pihole-FTL
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      860/sshd: /usr/sbin
tcp        0      0 127.0.0.1:4711          0.0.0.0:*               LISTEN      1360/pihole-FTL
tcp6       0      0 :::80                   :::*                    LISTEN      1344/lighttpd
tcp6       0      0 :::53                   :::*                    LISTEN      1360/pihole-FTL
tcp6       0      0 :::22                   :::*                    LISTEN      860/sshd: /usr/sbin
tcp6       0      0 ::1:4711                :::*                    LISTEN      1360/pihole-FTL
udp        0      0 0.0.0.0:53              0.0.0.0:*                           1360/pihole-FTL
udp        0      0 0.0.0.0:67              0.0.0.0:*                           1360/pihole-FTL
udp        0      0 192.168.47.4:68         0.0.0.0:*                           743/systemd-network
udp6       0      0 :::53                   :::*                                1360/pihole-FTL
$ sudo conntrack -L -p udp
udp 17 17 src=127.0.0.1 dst=127.0.0.1 sport=54075 dport=53 src=127.0.0.1 dst=127.0.0.1 sport=53 dport=54075 use=1
udp 17 3 src=192.168.47.31 dst=192.168.47.3 sport=51427 dport=53 [UNREPLIED] src=192.168.47.3 dst=192.168.47.31 sport=53 dport=51427 use=1
udp 17 17 src=127.0.0.1 dst=127.0.0.1 sport=39832 dport=39832 [UNREPLIED] src=127.0.0.1 dst=127.0.0.1 sport=39832 dport=39832 use=1
udp 17 14 src=192.168.47.31 dst=192.168.47.3 sport=62519 dport=53 [UNREPLIED] src=192.168.47.3 dst=192.168.47.31 sport=53 dport=62519 use=1
udp 17 16 src=192.168.47.31 dst=192.168.47.3 sport=44415 dport=53 [UNREPLIED] src=192.168.47.3 dst=192.168.47.31 sport=53 dport=44415 use=1
udp 17 16 src=192.168.47.31 dst=192.168.47.3 sport=35812 dport=53 [UNREPLIED] src=192.168.47.3 dst=192.168.47.31 sport=53 dport=35812 use=1
udp 17 25 src=192.168.47.31 dst=192.168.47.3 sport=58360 dport=53 [UNREPLIED] src=192.168.47.3 dst=192.168.47.31 sport=53 dport=58360 use=1
udp 17 16 src=192.168.47.31 dst=192.168.47.3 sport=2478 dport=53 [UNREPLIED] src=192.168.47.3 dst=192.168.47.31 sport=53 dport=2478 use=1
udp 17 16 src=192.168.47.31 dst=192.168.47.3 sport=59483 dport=53 [UNREPLIED] src=192.168.47.3 dst=192.168.47.31 sport=53 dport=59483 use=1
udp 17 24 src=192.168.47.31 dst=192.168.47.3 sport=55490 dport=53 [UNREPLIED] src=192.168.47.3 dst=192.168.47.31 sport=53 dport=55490 use=1
conntrack v1.4.6 (conntrack-tools): 10 flow entries have been shown.
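To compare the healthy and broken states at a glance, the conntrack listing can be reduced to two counters for DNS flows. This is a rough sketch under assumptions: summarise_dns_conntrack is a hypothetical helper name, and it relies on the line layout shown in the listings above, where the first dport= field is the original direction and unanswered flows carry the [UNREPLIED] marker.

```shell
#!/usr/bin/env bash
# Sketch only: summarise `conntrack -L -p udp` output read from stdin.
# summarise_dns_conntrack is a hypothetical helper name.

# Print "replied=<n> unreplied=<m>" for flows whose original destination port is 53.
summarise_dns_conntrack() {
    awk '
        # Flow lines start with the protocol name; the trailing summary line does not.
        $1 == "udp" {
            # The first dport= field belongs to the original direction.
            dport = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^dport=/) { dport = substr($i, 7); break }
            }
            if (dport != "53") next
            if ($0 ~ /\[UNREPLIED\]/) unreplied++; else replied++
        }
        END { printf "replied=%d unreplied=%d\n", replied, unreplied }
    '
}

# Example use on the Docker host:
#   sudo conntrack -L -p udp | summarise_dns_conntrack
```

Logging this summary once a minute would show whether the UNREPLIED count climbs gradually or flips all at once when resolution stops.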
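Restarting the pihole container does not seem to help, and neither does restarting the Docker service. The only thing that seems to work is rebooting the entire Docker host: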
$ docker container ls
CONTAINER ID   IMAGE                  COMMAND      CREATED       STATUS                    PORTS     NAMES
6a601dafd862   pihole/pihole:latest   "/s6-init"   11 days ago   Up 30 minutes (healthy)             pihole
$ docker container restart 6a601dafd862
6a601dafd862
# time passes, no improvement, so...
$ sudo systemctl restart docker.service
# time passes, no improvement, so...
$ sudo reboot
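How can I diagnose this further? How can I prevent it from happening in the first place?
(Other than having a monitoring host force a reboot of the Docker host over SSH, which I am considering. Or rebooting from an hourly crontab.)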
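For reference, the monitoring-host stopgap mentioned above could look roughly like the sketch below. It is only an illustration of the idea, not a fix: update_failures and should_act are hypothetical helper names, the threshold of 6 is arbitrary, and the SSH user "admin" with key-based login and passwordless sudo is an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: decide when a monitoring host should act after repeated failed lookups.
# update_failures and should_act are hypothetical helper names.

# Print the new consecutive-failure count: reset on OK, increment otherwise.
update_failures() {
    local previous="$1" result="$2"
    if [ "$result" = "OK" ]; then
        echo 0
    else
        echo $((previous + 1))
    fi
}

# Succeed (exit status 0) once the failure count has reached the threshold.
should_act() {
    local failures="$1" threshold="$2"
    [ "$failures" -ge "$threshold" ]
}

# Example wiring (assumed user "admin" on the Docker host, threshold of 6 failures):
#   failures=0
#   while sleep 10; do
#       if dig +short +retry=0 +time=1 @192.168.47.3 superuser.com | grep -qv '^;;'; then r=OK; else r=FAIL; fi
#       failures="$(update_failures "$failures" "$r")"
#       if should_act "$failures" 6; then ssh admin@192.168.47.3 sudo reboot; failures=0; fi
#   done
```

Requiring several consecutive failures before acting avoids rebooting the host over a single dropped packet.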