I have a Kubernetes cluster inside a VPN, made up of one master node and three worker nodes, all of which show Ready status. It was built with kubeadm and flannel. The VPN network range is 192.168.1.0/16.
$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8-master Ready master 144d v1.17.0 192.168.1.132 <none> Ubuntu 18.04.3 LTS 4.15.0-72-generic docker://18.9.7
k8-n1 Ready <none> 144d v1.17.0 192.168.1.133 <none> Ubuntu 18.04.3 LTS 4.15.0-72-generic docker://18.9.7
k8-n2 Ready <none> 144d v1.17.0 192.168.1.134 <none> Ubuntu 18.04.3 LTS 4.15.0-72-generic docker://18.9.7
k8-n3 Ready <none> 144d v1.17.0 192.168.1.135 <none> Ubuntu 18.04.3 LTS 4.15.0-72-generic docker://18.9.7
I can reach the nodes:
$ ping 192.168.1.133
PING 192.168.1.133 (192.168.1.133) 56(84) bytes of data.
64 bytes from 192.168.1.133: icmp_seq=1 ttl=64 time=0.219 ms
64 bytes from 192.168.1.133: icmp_seq=2 ttl=64 time=0.246 ms
64 bytes from 192.168.1.133: icmp_seq=3 ttl=64 time=0.199 ms
64 bytes from 192.168.1.133: icmp_seq=4 ttl=64 time=0.209 ms
^X^C
--- 192.168.1.133 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3071ms
rtt min/avg/max/mdev = 0.199/0.218/0.246/0.020 ms
$ ping 192.168.1.134
PING 192.168.1.134 (192.168.1.134) 56(84) bytes of data.
64 bytes from 192.168.1.134: icmp_seq=1 ttl=64 time=0.288 ms
64 bytes from 192.168.1.134: icmp_seq=2 ttl=64 time=0.272 ms
64 bytes from 192.168.1.134: icmp_seq=3 ttl=64 time=0.268 ms
^C
--- 192.168.1.134 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2032ms
rtt min/avg/max/mdev = 0.268/0.276/0.288/0.008 ms
$ ping 192.168.1.135
PING 192.168.1.135 (192.168.1.135) 56(84) bytes of data.
64 bytes from 192.168.1.135: icmp_seq=1 ttl=64 time=0.278 ms
64 bytes from 192.168.1.135: icmp_seq=2 ttl=64 time=0.221 ms
64 bytes from 192.168.1.135: icmp_seq=3 ttl=64 time=0.181 ms
^C
--- 192.168.1.135 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2030ms
Then I set up an nginx deployment with 2 pods to test whether it works:
nginx-deployment-574b87c764-2gz8t 1/1 Running 0 25m 192.168.2.12 k8-n2 <none> <none>
nginx-deployment-574b87c764-rst8x 1/1 Running 0 25m 192.168.1.17 k8-n1 <none> <none>
$ kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 3d17h
nginx-deployment NodePort 10.96.211.211 <none> 80:31577/TCP 13s
But I cannot connect to it:
$ curl k8-n1:31577
curl: (7) Failed to connect to k8-n1 port 31577: Connection refused
$ curl k8-n2:31577
curl: (7) Failed to connect to k8-n2 port 31577: Connection refused
$ curl k8-n3:31577
curl: (7) Failed to connect to k8-n3 port 31577: Connection refused
$ curl 10.96.211.211:80
curl: (7) Failed to connect to 10.96.211.211 port 80: Connection refused
$ curl 192.168.1.17:80
curl: (7) Failed to connect to 192.168.1.17 port 80: No route to host
$ curl 192.168.1.17:31577
curl: (7) Failed to connect to 192.168.1.17 port 31577: No route to host
$ curl 192.168.1.133:31577
curl: (7) Failed to connect to 192.168.1.133 port 31577: Connection refused
$ curl 192.168.1.133:6443
curl: (7) Failed to connect to 192.168.1.133 port 6443: Connection refused
What I changed:
sudo kubeadm init --pod-network-cidr=192.168.1.0/16 --apiserver-advertise-address=192.168.1.132
I also changed the flannel.yaml network to 192.168.1.0/16:
kubectl edit cm -n kube-system kube-flannel-cfg
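For reference, the field that edit targets is the net-conf.json key in the kube-flannel-cfg ConfigMap; after the change it would look roughly like this (a sketch, assuming the default vxlan backend):

net-conf.json: |
  {
    "Network": "192.168.1.0/16",
    "Backend": {
      "Type": "vxlan"
    }
  }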
After a restart, the describe output for the coredns pod shows:
Normal Scheduled 109s default-scheduler Successfully assigned kube-system/coredns-6955765f44-vwqgm to k8-n1
Normal Pulled 106s kubelet, k8-n1 Container image "k8s.gcr.io/coredns:1.6.5" already present on machine
Normal Created 105s kubelet, k8-n1 Created container coredns
Normal Started 105s kubelet, k8-n1 Started container coredns
Warning Unhealthy 3s (x11 over 103s) kubelet, k8-n1 Readiness probe failed: Get http://192.168.1.19:8181/ready: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 1s (x5 over 41s) kubelet, k8-n1 Liveness probe failed: Get http://192.168.1.19:8080/health: dial tcp 192.168.1.19:8080: connect: no route to host
Normal Killing 1s kubelet, k8-n1 Container coredns failed liveness probe, will be restarted
I would greatly appreciate any help; feel free to ask me for more information.
Answer 1
While examining the issue, I noticed that the OP initialized the cluster with the pod CIDR 192.168.1.0/16. That CIDR overlaps with the node IP addresses: the nodes sit at 192.168.1.132-135, while pods were being assigned IPs such as 192.168.1.17 and 192.168.1.19 from the very same range, so pod traffic and host traffic could not be routed apart. That is what broke the coreDNS pods' readiness and liveness probes.

Re-initializing the cluster with a new, non-overlapping pod CIDR solved the issue.
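As a minimal sketch of that fix, assuming flannel's default pod CIDR of 10.244.0.0/16 (any private range that does not overlap the 192.168.0.0/16 node network would do) and the kube-flannel.yml manifest location from flannel's coreos-era repository; note that kubeadm reset wipes the existing cluster state:

# on the master and on every worker: tear down the old cluster state
sudo kubeadm reset -f

# on the master: re-initialize with a non-overlapping pod CIDR
sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.1.132

# re-apply flannel; its default net-conf.json Network is already
# 10.244.0.0/16, so no edit to kube-flannel-cfg is needed
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

After rejoining the workers with the kubeadm join command printed by init, the pods should receive IPs from 10.244.0.0/16, which can be confirmed with kubectl get pods -A -o wide.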