Kubernetes通过Flannel网络组件搭建集群后,出现无法通过Flannel网段互相访问的情况,master节点ping 10.244.2.0(node1节点Flannel网段)超时。
kubernetes 版本 1.24.2
master Flannel虚拟网卡信息
[root@master1 k8s]# ifconfig flannel.1
flannel.1 Link encap:Ethernet HWaddr 56:94:4B:2E:EA:5D
inet addr:10.244.0.0 Bcast:0.0.0.0 Mask:255.255.255.255
inet6 addr: fe80::5494:4bff:fe2e:ea5d/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:988 errors:0 dropped:8 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:58486 (57.1 KiB)
tcpdump捕获的请求日志如下
[root@master1 k8s]# tcpdump -i eth0 port 30000
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
20:47:33.186133 IP 172.16.29.34.60629 > master1.ndmps: Flags [S], seq 902841225, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
20:47:33.203308 IP 172.16.29.34.60630 > master1.ndmps: Flags [S], seq 2938533821, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
20:47:33.438264 IP 172.16.29.34.60633 > master1.ndmps: Flags [S], seq 2972658478, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
20:47:34.200834 IP 172.16.29.34.60629 > master1.ndmps: Flags [S], seq 902841225, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
20:47:34.215896 IP 172.16.29.34.60630 > master1.ndmps: Flags [S], seq 2938533821, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
20:47:34.447110 IP 172.16.29.34.60633 > master1.ndmps: Flags [S], seq 2972658478, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
20:47:36.214314 IP 172.16.29.34.60629 > master1.ndmps: Flags [S], seq 902841225, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
[root@master1 ~]# tcpdump -i flannel.1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on flannel.1, link-type EN10MB (Ethernet), capture size 262144 bytes
20:47:33.186218 IP master1.63168 > 10.244.2.3.pcsync-https: Flags [S], seq 902841225, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
20:47:33.203355 IP master1.47411 > 10.244.2.3.pcsync-https: Flags [S], seq 2938533821, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
20:47:33.438341 IP master1.37028 > 10.244.2.3.pcsync-https: Flags [S], seq 2972658478, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
20:47:34.200885 IP master1.63168 > 10.244.2.3.pcsync-https: Flags [S], seq 902841225, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
20:47:34.215930 IP master1.47411 > 10.244.2.3.pcsync-https: Flags [S], seq 2938533821, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
20:47:34.447159 IP master1.37028 > 10.244.2.3.pcsync-https: Flags [S], seq 2972658478, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
路由信息
[root@master1 k8s]# ip route
default via 172.16.103.254 dev eth0 proto static metric 100
10.244.1.0/24 via 10.244.1.0 dev flannel.1 onlink
10.244.2.0/24 via 10.244.2.0 dev flannel.1 onlink
172.16.103.0/24 dev eth0 proto kernel scope link src 172.16.103.66 metric 100
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
ARP 和网桥信息
[root@master1 k8s]# arp -n
? (10.244.2.0) at 3e:18:6e:82:70:79 [ether] PERM on flannel.1
? (172.16.103.253) at 08:4f:0a:70:6d:99 [ether] on eth0
? (172.16.103.251) at 14:96:2d:4d:3d:0f [ether] on eth0
? (10.244.1.0) at 0a:31:9f:e4:b3:14 [ether] PERM on flannel.1
? (172.16.103.68) at fe:fc:fe:ff:13:65 [ether] on eth0
? (172.16.103.67) at fe:fc:fe:ff:ee:0c [ether] on eth0
? (172.16.103.254) at 34:00:a3:3f:26:f9 [ether] on eth0
[root@master1 k8s]# bridge fdb | grep 3e:18:6e:82:70:79
3e:18:6e:82:70:79 dev flannel.1 dst 172.16.103.67 self permanent
并且 master 节点可以访问 node 节点的真实 IP 地址(172.16.103.67)(应该可以排除安全组和防火墙问题),kube channel、kube proxy 和 coredns 均无错误信息。
我们应该如何解决或找到原因