Docker - no outbound traffic / bridge only works in promiscuous mode

For the past week, I have been struggling with a very strange networking issue. In short, my containers cannot reach the internet unless I run tcpdump -i br-XXXXX (which puts the bridge into promiscuous mode).
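
For reference, tcpdump raises the interface's promiscuity counter, which can be inspected from the host while the capture is running (a hedged example; br-XXXXX stands in for the real bridge name):

$ ip -d link show br-XXXXX | grep -o 'promiscuity [0-9]*'
promiscuity 1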

I have two containers started with Compose:

version: '3'
services:
  seafile:
    build: ./seafile/build
    container_name: seafile
    restart: always
    ports:
      - 8080:80
    networks:
      seafile_net:
        ipv4_address: 192.168.0.2
    volumes:
      - /mnt/gluster/files/redacted/data:/shared
    environment:
      - DB_HOST=10.200.7.100
      - DB_PASSWD=redacted
      - TIME_ZONE=America/Chicago
    depends_on:
      - seafile-memcached
  seafile-memcached:
    image: memcached:1.5.6
    container_name: seafile-memcached
    restart: always
    networks:
      seafile_net:
        ipv4_address: 192.168.0.3
    entrypoint: memcached -m 256
networks:
  seafile_net:
    driver: bridge
    ipam:
      driver: default
      config:
        - subnet: 192.168.0.0/24

The containers are running:

$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                  NAMES
93b1b773ad4e        docker_seafile      "/sbin/my_init -- /s…"   2 minutes ago       Up 2 minutes        0.0.0.0:8080->80/tcp   seafile
1f6b124c3be4        memcached:1.5.6     "memcached -m 256"       2 minutes ago       Up 2 minutes        11211/tcp              seafile-memcached

Network information:

$ docker network ls
NETWORK ID          NAME                 DRIVER              SCOPE
f67b015c4b84        bridge               bridge              local
d21cb7ba8ee4        docker_seafile_net   bridge              local
d0eb86ca57fa        host                 host                local
01f03fcfa103        none                 null                local

$ docker inspect d21cb7ba8ee4
[
    {
        "Name": "docker_seafile_net",
        "Id": "d21cb7ba8ee4a477497a7d343ea1a5f9b109237dce878a40605a281e1a2db1e9",
        "Created": "2020-09-24T15:03:46.39761472-04:00",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "192.168.0.0/24"
                }
            ]
        },
        "Internal": false,
        "Attachable": true,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "1f6b124c3be414040a6def3b3bc3e9f06e2af6a28afd6737823d1da65d5ab047": {
                "Name": "seafile-memcached",
                "EndpointID": "ab3e3c4aa216d158473fa3dde3f87e654422ffeca6ebb7626d072da10ba9a5cf",
                "MacAddress": "02:42:c0:a8:00:03",
                "IPv4Address": "192.168.0.3/24",
                "IPv6Address": ""
            },
            "93b1b773ad4e3685aa8ff2db2f342c617c42f1c5ab4ce693132c1238e73e705d": {
                "Name": "seafile",
                "EndpointID": "a895a417c22a4755df15b180d1c38b712c36047b01596c370815964a212f7105",
                "MacAddress": "02:42:c0:a8:00:02",
                "IPv4Address": "192.168.0.2/24",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {
            "com.docker.compose.network": "seafile_net",
            "com.docker.compose.project": "docker",
            "com.docker.compose.version": "1.27.4"
        }
    }
]

$ ip link show master br-d21cb7ba8ee4
18: veth8fd88c9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-d21cb7ba8ee4 state UP mode DEFAULT group default
    link/ether b6:37:9e:fd:9e:da brd ff:ff:ff:ff:ff:ff
20: vetheb84e16: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-d21cb7ba8ee4 state UP mode DEFAULT group default
    link/ether ca:90:c8:a6:2e:9b brd ff:ff:ff:ff:ff:ff

Once the containers are up, they cannot reach the internet or anything else on the host's network. The following curl command was run from inside one of the containers. On the host server itself, the same command works fine:

root@93b1b773ad4e:/opt/seafile# curl -viLk http://1.1.1.1
* Rebuilt URL to: http://1.1.1.1/
*   Trying 1.1.1.1...
* TCP_NODELAY set
**hangs**

Here is a tcpdump of the bridge (run on the host) without putting it into promiscuous mode, captured while I was trying to run the curl command above:

$ tcpdump --no-promiscuous-mode -lnni br-d21cb7ba8ee4
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br-d21cb7ba8ee4, link-type EN10MB (Ethernet), capture size 262144 bytes
14:15:42.447055 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
14:15:43.449058 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
14:15:45.448787 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
14:15:46.451049 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
14:15:47.453058 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
14:15:49.449789 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
14:15:50.451048 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
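
Those requests from the host's bridge address (192.168.0.1) are never answered, so the host never learns the container's MAC address. A hedged way to confirm this from the host is to inspect the neighbor table for the bridge (exact output varies):

$ ip neigh show dev br-d21cb7ba8ee4
192.168.0.2 FAILED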

But if I let tcpdump put the bridge into promiscuous mode, everything starts working:

$ tcpdump -lnni br-d21cb7ba8ee4
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br-d21cb7ba8ee4, link-type EN10MB (Ethernet), capture size 262144 bytes
14:16:05.457844 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
14:16:05.457863 ARP, Reply 192.168.0.2 is-at 02:42:c0:a8:00:02, length 28
**traffic continues**
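
As a stopgap, the same effect can be achieved without leaving tcpdump running by forcing the bridge into promiscuous mode directly (a workaround, not a fix):

$ ip link set br-d21cb7ba8ee4 promisc on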

Docker info:

$ docker info
Client:
 Debug Mode: false
Server:
 Containers: 2
  Running: 2
  Paused: 0
  Stopped: 0
 Images: 6
 Server Version: 19.03.13
 Storage Driver: devicemapper
  Pool Name: docker-8:3-3801718-pool
  Pool Blocksize: 65.54kB
  Base Device Size: 10.74GB
  Backing Filesystem: xfs
  Udev Sync Supported: true
  Data file: /dev/loop0
  Metadata file: /dev/loop1
  Data loop file: /var/lib/docker/devicemapper/devicemapper/data
  Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
  Data Space Used: 1.695GB
  Data Space Total: 107.4GB
  Data Space Available: 80.76GB
  Metadata Space Used: 3.191MB
  Metadata Space Total: 2.147GB
  Metadata Space Available: 2.144GB
  Thin Pool Minimum Free Space: 10.74GB
  Deferred Removal Enabled: true
  Deferred Deletion Enabled: true
  Deferred Deleted Device Count: 0
  Library Version: 1.02.164-RHEL7 (2019-08-27)
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 8fba4e9a7d01810a393d5d25a3621dc101981175
 runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 3.10.0-123.9.2.el7.x86_64
 Operating System: CentOS Linux 7 (Core)
 OSType: linux
 Architecture: x86_64
 CPUs: 2
 Total Memory: 3.704GiB
 Name: redacted.novalocal
 ID: redacted
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
WARNING: the devicemapper storage-driver is deprecated, and will be removed in a future release.
WARNING: devicemapper: usage of loopback devices is strongly discouraged for production use.
         Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
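
Note the two bridge-nf-call warnings above: those sysctls control whether bridged traffic is passed through iptables, and the keys only exist when bridge netfilter is available in the kernel. On this host, checking them directly would presumably fail (a hedged example):

$ sysctl net.bridge.bridge-nf-call-iptables
sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory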

Host information:

$ docker --version
Docker version 19.03.13, build 4484c46d9d

$ docker-compose --version
docker-compose version 1.27.4, build 40524192

$ cat /etc/redhat-release
CentOS Linux release 7.8.2003 (Core)

$ getenforce
Disabled

$ free -h
              total        used        free      shared  buff/cache   available
Mem:           3.7G        1.1G        826M        109M        1.8G        2.2G
Swap:          1.0G        292M        731M

$ nproc
2

$ uptime
 10:39:49 up 3 days, 19:56,  1 user,  load average: 0.00, 0.01, 0.05

$ iptables-save
# Generated by iptables-save v1.4.21 on Mon Sep 28 10:41:22 2020
*filter
:INPUT ACCEPT [17098775:29231856941]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [15623889:13475217196]
:DOCKER - [0:0]
:DOCKER-ISOLATION-STAGE-1 - [0:0]
:DOCKER-ISOLATION-STAGE-2 - [0:0]
:DOCKER-USER - [0:0]
-A FORWARD -j DOCKER-USER
-A FORWARD -j DOCKER-ISOLATION-STAGE-1
-A FORWARD -o br-d21cb7ba8ee4 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o br-d21cb7ba8ee4 -j DOCKER
-A FORWARD -i br-d21cb7ba8ee4 ! -o br-d21cb7ba8ee4 -j ACCEPT
-A FORWARD -i br-d21cb7ba8ee4 -o br-d21cb7ba8ee4 -j ACCEPT
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A DOCKER -d 192.168.0.2/32 ! -i br-d21cb7ba8ee4 -o br-d21cb7ba8ee4 -p tcp -m tcp --dport 80 -j ACCEPT
-A DOCKER-ISOLATION-STAGE-1 -i br-d21cb7ba8ee4 ! -o br-d21cb7ba8ee4 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
-A DOCKER-ISOLATION-STAGE-2 -o br-d21cb7ba8ee4 -j DROP
-A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
-A DOCKER-ISOLATION-STAGE-2 -j RETURN
-A DOCKER-USER -j RETURN
COMMIT
# Completed on Mon Sep 28 10:41:22 2020
# Generated by iptables-save v1.4.21 on Mon Sep 28 10:41:22 2020
*nat
:PREROUTING ACCEPT [408634:24674574]
:INPUT ACCEPT [380413:22825327]
:OUTPUT ACCEPT [520596:31263683]
:POSTROUTING ACCEPT [711734:42731963]
:DOCKER - [0:0]
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -s 192.168.0.0/24 ! -o br-d21cb7ba8ee4 -j MASQUERADE
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
-A POSTROUTING -s 192.168.0.2/32 -d 192.168.0.2/32 -p tcp -m tcp --dport 80 -j MASQUERADE
-A DOCKER -i br-d21cb7ba8ee4 -j RETURN
-A DOCKER -i docker0 -j RETURN
-A DOCKER ! -i br-d21cb7ba8ee4 -p tcp -m tcp --dport 8080 -j DNAT --to-destination 192.168.0.2:80
COMMIT
# Completed on Mon Sep 28 10:41:22 2020
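
The FORWARD chain policy is ACCEPT and the MASQUERADE rules for 192.168.0.0/24 are present, so plain iptables filtering is unlikely to be the culprit. For completeness, IP forwarding can be ruled out as well (a hedged example of the expected healthy output):

$ sysctl net.ipv4.ip_forward
net.ipv4.ip_forward = 1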

Answer 1

Thanks to a comment from @AB, I found the solution.

I believe the main problem was that the br_netfilter module was not loaded:

$ lsmod | grep br_netfilter
$

On another CentOS 7 Docker host (which does not have this problem), the module is loaded:

$ lsmod | grep br_netfilter
br_netfilter           22256  0
bridge                146976  1 br_netfilter
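
On hosts where the module exists, it can be loaded immediately and persisted across reboots; a minimal sketch, assuming a systemd-based system such as CentOS 7:

$ modprobe br_netfilter
$ echo br_netfilter > /etc/modules-load.d/br_netfilter.conf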

Manually loading the module did not work for me:

$ modprobe br_netfilter
modprobe: FATAL: Module br_netfilter not found.

I read here that before kernel version 3.18, br_netfilter was built into the bridge module rather than shipped separately; from 3.18 on, it is a standalone module.

I then discovered that the kernel had been excluded from updates (I did not set up this server, so this was news to me):

$ grep exclude /etc/yum.conf
exclude=kernel*

Because of this exclusion, my earlier yum updates never touched the kernel. My guess is that the split-out br_netfilter module was never backported to the kernel we were running.

After running an update without the kernel exclusion (yum --disableexcludes=all update kernel) and rebooting, everything started working!
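
After the reboot, the earlier checks can be repeated to confirm the fix; a hedged sketch of what success looks like (module sizes are illustrative):

$ lsmod | grep br_netfilter
br_netfilter           22256  0
bridge                146976  1 br_netfilter
$ sysctl net.bridge.bridge-nf-call-iptables
net.bridge.bridge-nf-call-iptables = 1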

The kernel update took me from 3.10.0-123.9.2.el7.x86_64 to 3.10.0-1127.19.1.el7.
