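I've run into a strange problem: connections from one virtual machine (A) to another virtual machine (B) systematically time out, in both directions. Symptoms:
1. ping B from A hangs (as does anything else I tried, e.g. ssh, http, netcat, etc.)
2. ping A from B crashes with: ping: Do you want to ping broadcast? Then -b. If not, check your local firewall rules
3. ping -b A from B hangs, just like in step 1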
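A and B are Ubuntu VMs running on the same Proxmox node. Both can reach other hosts over the internet without problems. Apart from the static IP (the last number differs), their network configuration looks identical to me.
Now, this is what puzzles me. I have a third Ubuntu VM (C) on the same Proxmox node, and:
- Pinging from C to A and from A to C works fine.
- Pinging from C to B and from B to C works fine.
The network configuration of each VM is shown below:
Server A config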
# This is the network config written by 'subiquity'
network:
  ethernets:
    ens18:
      dhcp4: no
      addresses: [xx.xx.xx.224/28]
      routes:
        - to: default
          via: xx.xx.xx.254
          on-link: true
      nameservers:
        addresses: [1.1.1.1]
        search: []
  version: 2

agate@server-A:~$ ifconfig
ens18: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet xx.xx.xx.224 netmask 255.255.255.240 broadcast xx.xx.xx.239
        inet6 fe80::ff:feca:9d50 prefixlen 64 scopeid 0x20<link>
        ether 02:00:00:ca:9d:50 txqueuelen 1000 (Ethernet)
        RX packets 44792791 bytes 147553446480 (147.5 GB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 42577292 bytes 41365334680 (41.3 GB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
Server B config
agate@server-B:~$ cat /etc/netplan/00-installer-config.yaml
# This is the network config written by 'subiquity'
network:
  ethernets:
    ens18:
      dhcp4: no
      addresses: [xx.xx.xx.226/28]
      routes:
        - to: default
          via: xx.xx.xx.254
          on-link: true
      nameservers:
        addresses: [1.1.1.1]
        search: []
  version: 2

agate@server-B:~$ ifconfig
ens18: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet xx.xx.xx.226 netmask 255.255.255.240 broadcast xx.xx.xx.239
        inet6 fe80::ff:fe4f:96ab prefixlen 64 scopeid 0x20<link>
        ether 02:00:00:4f:96:ab txqueuelen 1000 (Ethernet)
        RX packets 102583 bytes 8681631 (8.6 MB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 219135 bytes 218634003 (218.6 MB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
Server C config
agate@server-C:~$ cat /etc/netplan/00-installer-config.yaml
# This is the network config written by 'subiquity'
network:
  ethernets:
    ens18:
      dhcp4: no
      addresses: [xx.xx.xx.225/28]
      routes:
        - to: default
          via: xx.xx.xx.254
          on-link: true
      nameservers:
        addresses: [1.1.1.1]
        search: []
  version: 2

agate@server-C:~$ ifconfig
ens18: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet xx.xx.xx.225 netmask 255.255.255.240 broadcast xx.xx.xx.239
        inet6 fe80::ff:fec3:f9e5 prefixlen 64 scopeid 0x20<link>
        ether 02:00:00:c3:f9:e5 txqueuelen 1000 (Ethernet)
        RX packets 103612 bytes 19973364 (19.9 MB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 35103 bytes 159749390 (159.7 MB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
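
Since the three configs differ only in the last octet, it may be worth checking where each address falls inside the /28. The following is a minimal sketch using Python's standard ipaddress module. The 192.0.2 prefix is a placeholder for the redacted xx.xx.xx octets (an assumption; for a /28 the result depends only on the last octet), and describe is a hypothetical helper name.

```python
import ipaddress

# Placeholder prefix: the real first three octets are redacted in the post,
# so 192.0.2 (a documentation range) stands in for xx.xx.xx.
PREFIX = "192.0.2"


def describe(last_octet: int, prefix_len: int = 28) -> dict:
    """Report where one address sits inside its subnet."""
    iface = ipaddress.ip_interface(f"{PREFIX}.{last_octet}/{prefix_len}")
    net = iface.network
    return {
        "address": str(iface.ip),
        "network": str(net.network_address),
        "broadcast": str(net.broadcast_address),
        "is_network_address": iface.ip == net.network_address,
        "is_broadcast_address": iface.ip == net.broadcast_address,
        # The configured gateway ends in .254; check whether it is inside the /28.
        "gateway_in_subnet": ipaddress.ip_address(f"{PREFIX}.254") in net,
    }


for name, octet in (("A", 224), ("C", 225), ("B", 226)):
    print(name, describe(octet))
```

With these inputs the /28 spans .224 to .239, so .224 (server A) is reported as the subnet's network address, while .225 and .226 are ordinary host addresses. Some network stacks treat the network address like a broadcast address, which may be related to the ping message seen on B, though that is only a lead to verify and not a confirmed cause.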
ping and tcpdump results
I tried to narrow the problem down, but I'm not good at networking. If I run tcpdump on B and try to ping it from A, I get the following:
agate@server-A:~$ ping xx.xx.xx.226
PING xx.xx.xx.226 56(84) bytes of data.
# hangs
--- xx.xx.xx.226 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2038ms
# xx.xx.xx.224 is server A ip address
agate@server-B:~$ sudo tcpdump -n "src xx.xx.xx.224"
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens18, link-type EN10MB (Ethernet), capture size 262144 bytes
03:40:20.261054 IP xx.xx.xx.224 > xx.xx.xx.226: ICMP echo request, id 11, seq 1, length 64
03:40:21.279229 IP xx.xx.xx.224 > xx.xx.xx.226: ICMP echo request, id 11, seq 2, length 64
03:40:22.303190 IP xx.xx.xx.224 > xx.xx.xx.226: ICMP echo request, id 11, seq 3, length 64
So it looks like the packets from A do reach B. However, doing the same thing in reverse (tcpdump on A, ping from B) gives a different result:
agate@server-B:~$ ping xx.xx.xx.224
ping: Do you want to ping broadcast? Then -b. If not, check your local firewall rules
Using ping -b xx.xx.xx.224 produces the following tcpdump output on A (and hangs on B, with 100% packet loss):
# xx.xx.xx.226 is server B ip address
agate@apps:~$ sudo tcpdump -n "src xx.xx.xx.226"
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on ens18, link-type EN10MB (Ethernet), snapshot length 262144 bytes
03:46:31.369889 IP xx.xx.xx.226 > xx.xx.xx.224: ICMP echo request, id 5, seq 1, length 64
03:46:32.376541 IP xx.xx.xx.226 > xx.xx.xx.224: ICMP echo request, id 5, seq 2, length 64
03:46:33.450877 IP xx.xx.xx.226 > xx.xx.xx.224: ICMP echo request, id 5, seq 3, length 64
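
Both captures above filter on the source address only, so an echo reply leaving the capturing host would never show up in them. As a rough sketch for checking whether replies are sent at all, the Python helper below pairs echo requests with echo replies in tcpdump -n text output. It assumes a capture taken with a filter that covers both directions, for example sudo tcpdump -n "icmp and host xx.xx.xx.224", and the function name unanswered_requests is hypothetical.

```python
import re

# Matches tcpdump -n lines such as:
# 03:40:20.261054 IP 10.0.0.1 > 10.0.0.2: ICMP echo request, id 11, seq 1, length 64
LINE_RE = re.compile(
    r"IP (?P<src>\S+) > (?P<dst>\S+): ICMP echo (?P<kind>request|reply), "
    r"id (?P<id>\d+), seq (?P<seq>\d+)"
)


def unanswered_requests(lines):
    """Return (src, dst, id, seq) for every echo request without a matching reply."""
    requests = {}   # insertion-ordered record of the requests seen
    replied = set()
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # not an ICMP echo line
        ident, seq = int(m["id"]), int(m["seq"])
        if m["kind"] == "request":
            requests[(m["src"], m["dst"], ident, seq)] = True
        else:
            # A reply travels in the opposite direction to its request.
            replied.add((m["dst"], m["src"], ident, seq))
    return [key for key in requests if key not in replied]


if __name__ == "__main__":
    # Lines copied from the capture taken on B while pinging from A.
    capture = [
        "03:40:20.261054 IP xx.xx.xx.224 > xx.xx.xx.226: ICMP echo request, id 11, seq 1, length 64",
        "03:40:21.279229 IP xx.xx.xx.224 > xx.xx.xx.226: ICMP echo request, id 11, seq 2, length 64",
    ]
    for missing in unanswered_requests(capture):
        print("no reply seen for", missing)
```

If requests show up as unanswered in a capture that includes both directions, the receiving host is probably dropping or ignoring them; if the replies are present on the receiver but never arrive at the sender, the loss would be somewhere in between.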
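Since pinging from B to A produces a firewall-related message that does not appear in the opposite direction, my current guess is that some firewall configuration on B is breaking connectivity between A and B.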
I tried disabling ufw on both A and B (via sudo ufw disable) to see if anything changed. Nothing did. What can I do to pinpoint the problem further?