Cannot reach cluster virtual IPs from pods, but can from worker nodes

2024-6-1

kubernetes

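I have a problem where pods cannot "talk" to cluster IPs (the virtual IPs that front pods) in my Kubernetes cluster.

I have been following Kelsey Hightower's "Kubernetes The Hard Way", but I converted all of it to run the infrastructure on AWS.

Almost everything works. The one remaining problem is that my pods cannot communicate with the clusterIP virtual IPs.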

The service-cluster-ip-range is: 10.32.0.0/24
The pod CIDR for the worker nodes is: 10.200.0.0/16
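
As a quick sanity check on the two ranges above, the following sketch tests whether an address falls inside a CIDR block, for example to confirm that the DNS service IP 10.32.0.10 belongs to the service range and not to the pod range. The helper names `ip_to_int` and `ip_in_cidr` are hypothetical, and the sketch assumes IPv4 addresses without leading zeros and a shell with 64-bit arithmetic.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed (exit 0) if address $1 lies inside CIDR block $2, e.g. 10.32.0.0/24.
ip_in_cidr() {
    net=${2%/*}          # network part, before the slash
    bits=${2#*/}         # prefix length, after the slash
    mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
    addr_int=$(ip_to_int "$1")
    net_int=$(ip_to_int "$net")
    [ $(( $addr_int & $mask )) -eq $(( $net_int & $mask )) ]
}
```

For example, `ip_in_cidr 10.32.0.10 10.32.0.0/24 && echo "service IP"` prints the message, while the same address checked against 10.200.0.0/16 does not.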

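I initially tried both CoreDNS and kube-dns, thinking the problem might be at that level. I have since narrowed it down: I cannot reach the service cluster IPs from pods, but I can reach the cluster IPs from the worker nodes themselves.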

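I have verified that kube-proxy is working as expected. I am running it in iptables mode and can see it correctly writing the iptables rules on the worker nodes. I even tried switching to ipvs mode, and it wrote the rules correctly in that mode as well.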
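
One way to repeat that check for a single service is to filter a dump of the NAT table by cluster IP. This is a minimal sketch: `rules_for_cluster_ip` is a hypothetical helper name, and it assumes kube-proxy is in iptables mode, where service rules match on `-d <clusterIP>/32`.

```shell
#!/bin/sh
# Read iptables-save style output on stdin and print only the rules whose
# destination is the given cluster IP. Exits non-zero when nothing matches.
rules_for_cluster_ip() {
    grep -F -e "-d $1/32"
}
```

On a worker node this could be run as `sudo iptables-save -t nat | rules_for_cluster_ip 10.32.0.10`. An empty result would suggest kube-proxy has not programmed that service; matching rules mean the problem lies elsewhere in the packet path.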

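If I run nslookup inside a test pod (e.g. busybox 1.28) and let it use the default nameserver setting, which points at my CoreDNS installation, it cannot resolve google.com or the cluster name kubernetes.default. If I instead tell nslookup to use the pod IP address of the CoreDNS pod, it works fine.

Example

This does not work: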

kubectl exec -it busybox -- nslookup google.com               
Server:    10.32.0.10
Address 1: 10.32.0.10

nslookup: can't resolve 'google.com'
command terminated with exit code 1

This works (pointing nslookup at the CoreDNS pod IP address instead of the cluster IP):

kubectl exec -it busybox -- nslookup google.com 10.200.2.2                   
Server:    10.200.2.2
Address 1: 10.200.2.2 kube-dns-67d45fcb87-2h2dz

Name:      google.com
Address 1: 2607:f8b0:4004:810::200e iad23s63-in-x0e.1e100.net
Address 2: 172.217.164.142 iad30s24-in-f14.1e100.net

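To clarify, I tried both CoreDNS and kube-dns, with the same result in both cases. This looks like a networking problem at a higher level.

My AWS EC2 instances have source/destination checking disabled. All of my configuration and setup is forked from the official kubernetes-the-hard-way repository, updated where needed to run on AWS. The source code containing all my configuration and setup is here

Edit: adding the /etc/resolv.conf that my pods receive from kube-dns / CoreDNS (it looks perfectly fine, though):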

# cat /etc/resolv.conf
search kube-system.svc.cluster.local svc.cluster.local cluster.local ec2.internal
nameserver 10.32.0.10
options ndots:5

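I can ping the kube-dns pod IP directly from a pod, but the kube-dns cluster IP does not respond to ping or to anything else. (The same is true of other services that have a cluster IP.) For example: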

me@mine ~/Git/kubernetes-the-hard-way/test kubectl get pods -n kube-system -o wide
NAME                                READY   STATUS    RESTARTS   AGE    IP            NODE             NOMINATED NODE   READINESS GATES
hello-node1-55cc74b4b8-2hh4w        1/1     Running   2          3d1h   10.200.2.14   ip-10-240-0-22   <none>           <none>
hello-node2-66b5494599-cw8hx        1/1     Running   2          3d1h   10.200.2.12   ip-10-240-0-22   <none>           <none>
kube-dns-67d45fcb87-2h2dz           3/3     Running   6          3d1h   10.200.2.11   ip-10-240-0-22   <none>           <none>

 me@mine ~/Git/kubernetes-the-hard-way/test kubectl exec -it hello-node1-55cc74b4b8-2hh4w sh
Error from server (NotFound): pods "hello-node1-55cc74b4b8-2hh4w" not found
 me@mine ~/Git/kubernetes-the-hard-way/test kubectl -n kube-system exec -it hello-node1-55cc74b4b8-2hh4w sh
# ping 10.200.2.11
PING 10.200.2.11 (10.200.2.11) 56(84) bytes of data.
64 bytes from 10.200.2.11: icmp_seq=1 ttl=64 time=0.080 ms
64 bytes from 10.200.2.11: icmp_seq=2 ttl=64 time=0.044 ms
64 bytes from 10.200.2.11: icmp_seq=3 ttl=64 time=0.045 ms
^C
--- 10.200.2.11 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.044/0.056/0.080/0.017 ms

# ip route get 10.32.0.10
10.32.0.10 via 10.200.2.1 dev eth0  src 10.200.2.14
    cache
#

Am I missing something obvious here?

Answer 1

I ran into exactly the same problem. The fix was:

modprobe br_netfilter
sysctl net.bridge.bridge-nf-call-iptables=1

Answer 2

Try adding the following to the kube-dns ConfigMap:

data:
  upstreamNameservers: |
    ["8.8.8.8", "8.8.4.4"]
