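I set up a Kubernetes cluster with 1 master node and 1 worker node using kubeadm in a Fedora Linux KVM virtualization environment, with a pod CIDR range of 10.244.0.0/16. The cluster uses flannel as the pod network add-on.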
Master node: hostname - fedkubemaster, IP address - 192.168.122.161
Worker node: hostname - fedkubenode, IP address - 192.168.122.27
(Note: my host FQDNs are not resolvable through DNS.)
$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
fedkubemaster Ready control-plane,master 2d20h v1.23.3 192.168.122.161 <none> Fedora Linux 35 (Workstation Edition) 5.15.16-200.fc35.x86_64 docker://20.10.12
fedkubenode Ready <none> 2d6h v1.23.3 192.168.122.27 <none> Fedora Linux 35 (Workstation Edition) 5.15.16-200.fc35.x86_64 docker://20.10.12
Here are the routing tables on the master node and the worker node:
[admin@fedkubemaster ~]$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 192.168.122.1 0.0.0.0 UG 100 0 0 enp1s0
10.244.0.0 0.0.0.0 255.255.255.0 U 0 0 0 cni0
10.244.1.0 10.244.1.0 255.255.255.0 UG 0 0 0 flannel.1
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
172.18.0.0 0.0.0.0 255.255.0.0 U 0 0 0 br-25b1faebd814
192.168.122.0 0.0.0.0 255.255.255.0 U 100 0 0 enp1s0
[admin@fedkubenode ~]$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 192.168.122.1 0.0.0.0 UG 100 0 0 enp1s0
10.244.0.0 10.244.0.0 255.255.255.0 UG 0 0 0 flannel.1
10.244.1.0 0.0.0.0 255.255.255.0 U 0 0 0 cni0
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
192.168.122.0 0.0.0.0 255.255.255.0 U 100 0 0 enp1s0
I am using this dnsutils pod YAML definition to test connectivity to my hosts:
apiVersion: v1
kind: Pod
metadata:
  name: dnsutils
  namespace: default
spec:
  containers:
  - name: dnsutils
    image: k8s.gcr.io/e2e-test-images/jessie-dnsutils:1.3
    command:
      - sleep
      - "3600"
    imagePullPolicy: IfNotPresent
  restartPolicy: Always
Here is the ip addr and ip route show output from inside my dnsutils pod:
root@dnsutils:/# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: eth0@if6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
link/ether 7a:50:37:bc:4b:45 brd ff:ff:ff:ff:ff:ff
inet 10.244.1.2/24 brd 10.244.1.255 scope global eth0
valid_lft forever preferred_lft forever
root@dnsutils:/#
root@dnsutils:/# ip route show
default via 10.244.1.1 dev eth0
10.244.0.0/16 via 10.244.1.1 dev eth0
10.244.1.0/24 dev eth0 proto kernel scope link src 10.244.1.2
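I tried nslookup and ping against the host FQDNs, but they could not be resolved. I then tried pinging the hosts by their IP addresses: the ping to the master node returns "Packet filtered", while the worker node responds normally.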
root@dnsutils:/# nslookup fedkubemaster
;; connection timed out; no servers could be reached
root@dnsutils:/# nslookup fedkubenode
;; connection timed out; no servers could be reached
root@dnsutils:/# ping fedkubemaster
ping: unknown host fedkubemaster
root@dnsutils:/# ping fedkubenode
ping: unknown host fedkubenode
root@dnsutils:/# ping 192.168.122.161
PING 192.168.122.161 (192.168.122.161) 56(84) bytes of data.
From 10.244.1.1 icmp_seq=1 Packet filtered
From 10.244.1.1 icmp_seq=2 Packet filtered
^C
--- 192.168.122.161 ping statistics ---
2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 1013ms
root@dnsutils:/# ping 192.168.122.27
PING 192.168.122.27 (192.168.122.27) 56(84) bytes of data.
64 bytes from 192.168.122.27: icmp_seq=1 ttl=64 time=0.286 ms
64 bytes from 192.168.122.27: icmp_seq=2 ttl=64 time=0.145 ms
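The problem is that I want my host FQDNs to be resolvable from inside pods, but I don't understand how to fix this. There seems to be no route for resolving host FQDNs from inside a pod, which is also reflected in the coredns logs. Here is the error: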
[admin@fedkubemaster networkutils]$ kubectl logs -f coredns-64897985d-8skq2 -n kube-system
.:53
[INFO] plugin/reload: Running configuration MD5 = db32ca3650231d74073ff4cf814959a7
CoreDNS-1.8.6
linux/amd64, go1.17.1, 13a9191
[ERROR] plugin/errors: 2 2603559064493035223.1593267795798361043. HINFO: read udp 10.244.0.2:38440->192.168.122.1:53: read: no route to host
[ERROR] plugin/errors: 2 2603559064493035223.1593267795798361043. HINFO: read udp 10.244.0.2:34275->192.168.122.1:53: read: no route to host
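I am trying to figure out whether there is any way to add a route to pods by default, but I am not familiar enough with this to fix it.
Please suggest. Let me know if any other details are needed.
Thanks, Sudhir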
Answer 1
I was able to resolve my issue by temporarily disabling the firewalld service on both the master and the worker node.
[admin@fedkubemaster ~]$ sudo systemctl stop firewalld.service
[admin@fedkubemaster ~]$ sudo systemctl disable firewalld.service
[admin@fedkubenode ~]$ sudo systemctl stop firewalld.service
[admin@fedkubenode ~]$ sudo systemctl disable firewalld.service
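What I still need to understand is why enabling firewalld causes this issue even though the required ports were opened as per the Kubernetes documentation.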
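A possible explanation, not verified on this cluster: the ports listed in the Kubernetes documentation cover the control-plane and kubelet services, but not flannel's own VXLAN traffic (UDP 8472 with the default backend), and firewalld may also reject forwarded pod traffic. The "Packet filtered" reply from 10.244.1.1 is consistent with a firewall reject rule. Below is a minimal sketch of keeping firewalld enabled instead of disabling it, assuming the default flannel VXLAN backend and the 10.244.0.0/16 pod CIDR from the question. The `run` helper and the `open_flannel_firewall` function are illustrative names only, and the script just prints the commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Sketch only: open what flannel needs instead of disabling firewalld.
# Assumptions (not confirmed for this cluster): flannel uses its default
# VXLAN backend on UDP 8472, and the pod CIDR is 10.244.0.0/16.
POD_CIDR="10.244.0.0/16"
FLANNEL_VXLAN_PORT="8472/udp"

# Print each command by default; set APPLY=1 (as root) to really run it.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

open_flannel_firewall() {
    # Allow flannel's VXLAN encapsulated traffic between nodes.
    run firewall-cmd --permanent --add-port="$FLANNEL_VXLAN_PORT"
    # Let pod traffic leaving the node be masqueraded.
    run firewall-cmd --permanent --add-masquerade
    # Treat traffic sourced from the pod network as trusted.
    run firewall-cmd --permanent --zone=trusted --add-source="$POD_CIDR"
    # Load the permanent rules into the running firewall.
    run firewall-cmd --reload
}

open_flannel_firewall
```

Run it on both fedkubemaster and fedkubenode, review the printed commands, then re-run with APPLY=1 under sudo. If the ping to 192.168.122.161 and the nslookup from the dnsutils pod still fail afterwards, the cause is probably somewhere else and these rules can be removed again.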