kube-proxy with iptables does wrong address translation for Kubernetes UDP services

For learning purposes, I created a Kubernetes cluster following the Kubernetes The Hard Way guide - except that I did it on a set of my own local VMs instead of using Google Cloud.

Everything seemed to work fine until I noticed some network communication problems. Specifically, DNS resolution does not always seem to work.

I have installed CoreDNS using its Helm chart:

helm repo add coredns https://coredns.github.io/helm
helm install -n kube-system coredns coredns/coredns \
  --set service.clusterIP=10.32.0.10,replicaCount=2

Here is a view of my cluster:

$ kubectl get nodes -o wide
NAME      STATUS   ROLES    AGE    VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
worker0   Ready    <none>   5d9h   v1.27.3   192.168.64.20   <none>        Ubuntu 22.04.2 LTS   5.15.0-76-generic   containerd://1.7.2
worker1   Ready    <none>   5d9h   v1.27.3   192.168.64.21   <none>        Ubuntu 22.04.2 LTS   5.15.0-76-generic   containerd://1.7.2
worker2   Ready    <none>   5d9h   v1.27.3   192.168.64.22   <none>        Ubuntu 22.04.2 LTS   5.15.0-76-generic   containerd://1.7.2

$ kubectl get pod -A -o wide
NAMESPACE         NAME                                                              READY   STATUS    RESTARTS      AGE     IP            NODE      NOMINATED NODE   READINESS GATES
default           debu                                                              1/1     Running   0             13h     10.200.2.17   worker2   <none>           <none>
kube-system       coredns-coredns-7bbdc98b98-v6qtk                                  1/1     Running   0             42s     10.200.2.18   worker2   <none>           <none>
kube-system       coredns-coredns-7bbdc98b98-wj2f6                                  1/1     Running   0             5d7h    10.200.0.3    worker0   <none>           <none>

The DNS service:

$ kubectl get svc -n kube-system
NAME              TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
coredns-coredns   ClusterIP   10.32.0.10   <none>        53/UDP,53/TCP   5d7h

Now, when I do some DNS resolution from the debu pod, it sometimes works:

$ kubectl exec -it debu -- nslookup -type=a kubernetes.default.svc.cluster.local.
Server:     10.32.0.10
Address:    10.32.0.10#53

Name:   kubernetes.default.svc.cluster.local
Address: 10.32.0.1

But sometimes it does not (keti is my shell alias for kubectl exec -ti):

$ keti debu -- nslookup -type=a kubernetes.default.svc.cluster.local.
;; communications error to 10.32.0.10#53: timed out
;; communications error to 10.32.0.10#53: timed out

I dug further and found that the problem seems to depend on which coredns pod kube-proxy picks:

  • if kube-proxy forwards the DNS request to 10.200.0.3 (a different node than my debu pod), then resolution works
  • if kube-proxy forwards the DNS request to 10.200.2.18 (the same node as my debu pod), then resolution does not work
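
A quick way to see the roughly 50/50 behavior is to repeat the lookup in a loop (a throwaway sketch; it assumes the debu image ships nslookup, as the outputs above suggest):

# repeat the query; with two endpoints kube-proxy picks each with ~50%
# probability per new UDP flow, so about half of the queries should time out
for i in $(seq 1 10); do
  kubectl exec debu -- nslookup -type=a kubernetes.default.svc.cluster.local. \
    >/dev/null 2>&1 && echo ok || echo timeout
done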

So I dug deeper and captured some traffic:

$ kubectl exec -it debu -- tcpdump -vn udp port 53
tcpdump: listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes

# here kube-proxy chose the "remote" coredns pod
08:39:34.299059 IP (tos 0x0, ttl 64, id 5764, offset 0, flags [none], proto UDP (17), length 82)
    10.200.2.17.35002 > 10.32.0.10.53: 25915+ A? kubernetes.default.svc.cluster.local. (54)
08:39:34.299782 IP (tos 0x0, ttl 62, id 48854, offset 0, flags [DF], proto UDP (17), length 134)
    10.32.0.10.53 > 10.200.2.17.35002: 25915*- 1/0/0 kubernetes.default.svc.cluster.local. A 10.32.0.1 (106)

# here kube-proxy chose the "local" coredns pod
08:39:36.588485 IP (tos 0x0, ttl 64, id 31594, offset 0, flags [none], proto UDP (17), length 82)
    10.200.2.17.45242 > 10.32.0.10.53: 33921+ A? kubernetes.default.svc.cluster.local. (54)
08:39:36.588670 IP (tos 0x0, ttl 64, id 17121, offset 0, flags [DF], proto UDP (17), length 134)
    10.200.2.18.53 > 10.200.2.17.45242: 33921*- 1/0/0 kubernetes.default.svc.cluster.local. A 10.32.0.1 (106)

Note the source address of the DNS replies. When talking to the remote coredns pod, the response comes from 10.32.0.10 (the service address), but when talking to the local coredns pod, the response comes from 10.200.2.18 (the pod address). That does not match the destination address of the request (the service IP), and most likely causes the DNS client to ignore the response altogether.
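
One way to confirm this on worker2 is to inspect the conntrack table (a sketch; it assumes the conntrack CLI from conntrack-tools is installed on the node). I would expect the broken flows to stay [UNREPLIED], because the reply coming straight from 10.200.2.18 never traverses NAT and thus never gets rewritten back to the service IP:

$ sudo conntrack -L -p udp --dst 10.32.0.10 --dport 53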

As far as I know, the component responsible for this is kube-proxy, via the iptables rules it sets up. Why does it not do the address translation correctly?

Here is a dump of the iptables rules that kube-proxy set up on worker2:

$ sudo iptables-save
# Generated by iptables-save v1.8.7 on Thu Jul 20 08:50:55 2023
*mangle
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:KUBE-IPTABLES-HINT - [0:0]
:KUBE-KUBELET-CANARY - [0:0]
:KUBE-PROXY-CANARY - [0:0]
COMMIT
# Completed on Thu Jul 20 08:50:55 2023
# Generated by iptables-save v1.8.7 on Thu Jul 20 08:50:55 2023
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:KUBE-EXTERNAL-SERVICES - [0:0]
:KUBE-FIREWALL - [0:0]
:KUBE-FORWARD - [0:0]
:KUBE-KUBELET-CANARY - [0:0]
:KUBE-NODEPORTS - [0:0]
:KUBE-PROXY-CANARY - [0:0]
:KUBE-PROXY-FIREWALL - [0:0]
:KUBE-SERVICES - [0:0]
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes load balancer firewall" -j KUBE-PROXY-FIREWALL
-A INPUT -m comment --comment "kubernetes health check service ports" -j KUBE-NODEPORTS
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
-A INPUT -j KUBE-FIREWALL
-A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes load balancer firewall" -j KUBE-PROXY-FIREWALL
-A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
-A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
-A OUTPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes load balancer firewall" -j KUBE-PROXY-FIREWALL
-A OUTPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -j KUBE-FIREWALL
-A KUBE-FIREWALL ! -s 127.0.0.0/8 -d 127.0.0.0/8 -m comment --comment "block incoming localnet connections" -m conntrack ! --ctstate RELATED,ESTABLISHED,DNAT -j DROP
-A KUBE-FORWARD -m conntrack --ctstate INVALID -j DROP
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding rules" -m mark --mark 0x4000/0x4000 -j ACCEPT
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding conntrack rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
COMMIT
# Completed on Thu Jul 20 08:50:55 2023
# Generated by iptables-save v1.8.7 on Thu Jul 20 08:50:55 2023
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:CNI-6c816826e5cedfcfc87d8961 - [0:0]
:CNI-a465ef0ed6a0180a9e27d1cb - [0:0]
:KUBE-KUBELET-CANARY - [0:0]
:KUBE-MARK-MASQ - [0:0]
:KUBE-NODEPORTS - [0:0]
:KUBE-POSTROUTING - [0:0]
:KUBE-PROXY-CANARY - [0:0]
:KUBE-SEP-2PLY6KCKFADRAI56 - [0:0]
:KUBE-SEP-EA76CHRYFR6YRKN6 - [0:0]
:KUBE-SEP-EVM2BZXZKR6FG27U - [0:0]
:KUBE-SEP-IGILC3MHHXCPPD2V - [0:0]
:KUBE-SEP-TVM3X65DZPREBP7U - [0:0]
:KUBE-SEP-VVBZLDDCGYIIOLML - [0:0]
:KUBE-SEP-ZY5Q3ULAQ5ZYZJLS - [0:0]
:KUBE-SERVICES - [0:0]
:KUBE-SVC-3MN7Q5WEBLVAXORV - [0:0]
:KUBE-SVC-NPX46M4PTMTKRN6Y - [0:0]
:KUBE-SVC-S3NG4EFDCNWS3YQS - [0:0]
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A POSTROUTING -s 10.200.2.17/32 -m comment --comment "name: \"bridge\" id: \"49fabffb4b0f772ce5a80a41b7062980872561d679a2f32dcae016058125e1eb\"" -j CNI-6c816826e5cedfcfc87d8961
-A POSTROUTING -s 10.200.2.18/32 -m comment --comment "name: \"bridge\" id: \"d7ffc93685956ccff585923788c11383bb3d10f7d8633249a966f51fede7c1ef\"" -j CNI-a465ef0ed6a0180a9e27d1cb
-A CNI-6c816826e5cedfcfc87d8961 -d 10.200.2.0/24 -m comment --comment "name: \"bridge\" id: \"49fabffb4b0f772ce5a80a41b7062980872561d679a2f32dcae016058125e1eb\"" -j ACCEPT
-A CNI-6c816826e5cedfcfc87d8961 ! -d 224.0.0.0/4 -m comment --comment "name: \"bridge\" id: \"49fabffb4b0f772ce5a80a41b7062980872561d679a2f32dcae016058125e1eb\"" -j MASQUERADE
-A CNI-a465ef0ed6a0180a9e27d1cb -d 10.200.2.0/24 -m comment --comment "name: \"bridge\" id: \"d7ffc93685956ccff585923788c11383bb3d10f7d8633249a966f51fede7c1ef\"" -j ACCEPT
-A CNI-a465ef0ed6a0180a9e27d1cb ! -d 224.0.0.0/4 -m comment --comment "name: \"bridge\" id: \"d7ffc93685956ccff585923788c11383bb3d10f7d8633249a966f51fede7c1ef\"" -j MASQUERADE
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
-A KUBE-POSTROUTING -m mark ! --mark 0x4000/0x4000 -j RETURN
-A KUBE-POSTROUTING -j MARK --set-xmark 0x4000/0x0
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -j MASQUERADE --random-fully
-A KUBE-SEP-2PLY6KCKFADRAI56 -s 192.168.64.11/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-2PLY6KCKFADRAI56 -p tcp -m comment --comment "default/kubernetes:https" -m tcp -j DNAT --to-destination 192.168.64.11:6443
-A KUBE-SEP-EA76CHRYFR6YRKN6 -s 10.200.0.3/32 -m comment --comment "kube-system/coredns-coredns:tcp-53" -j KUBE-MARK-MASQ
-A KUBE-SEP-EA76CHRYFR6YRKN6 -p tcp -m comment --comment "kube-system/coredns-coredns:tcp-53" -m tcp -j DNAT --to-destination 10.200.0.3:53
-A KUBE-SEP-EVM2BZXZKR6FG27U -s 10.200.2.18/32 -m comment --comment "kube-system/coredns-coredns:tcp-53" -j KUBE-MARK-MASQ
-A KUBE-SEP-EVM2BZXZKR6FG27U -p tcp -m comment --comment "kube-system/coredns-coredns:tcp-53" -m tcp -j DNAT --to-destination 10.200.2.18:53
-A KUBE-SEP-IGILC3MHHXCPPD2V -s 10.200.2.18/32 -m comment --comment "kube-system/coredns-coredns:udp-53" -j KUBE-MARK-MASQ
-A KUBE-SEP-IGILC3MHHXCPPD2V -p udp -m comment --comment "kube-system/coredns-coredns:udp-53" -m udp -j DNAT --to-destination 10.200.2.18:53
-A KUBE-SEP-TVM3X65DZPREBP7U -s 192.168.64.12/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-TVM3X65DZPREBP7U -p tcp -m comment --comment "default/kubernetes:https" -m tcp -j DNAT --to-destination 192.168.64.12:6443
-A KUBE-SEP-VVBZLDDCGYIIOLML -s 192.168.64.10/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-VVBZLDDCGYIIOLML -p tcp -m comment --comment "default/kubernetes:https" -m tcp -j DNAT --to-destination 192.168.64.10:6443
-A KUBE-SEP-ZY5Q3ULAQ5ZYZJLS -s 10.200.0.3/32 -m comment --comment "kube-system/coredns-coredns:udp-53" -j KUBE-MARK-MASQ
-A KUBE-SEP-ZY5Q3ULAQ5ZYZJLS -p udp -m comment --comment "kube-system/coredns-coredns:udp-53" -m udp -j DNAT --to-destination 10.200.0.3:53
-A KUBE-SERVICES -d 10.32.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SERVICES -d 10.32.0.10/32 -p udp -m comment --comment "kube-system/coredns-coredns:udp-53 cluster IP" -m udp --dport 53 -j KUBE-SVC-3MN7Q5WEBLVAXORV
-A KUBE-SERVICES -d 10.32.0.10/32 -p tcp -m comment --comment "kube-system/coredns-coredns:tcp-53 cluster IP" -m tcp --dport 53 -j KUBE-SVC-S3NG4EFDCNWS3YQS
-A KUBE-SERVICES -m comment --comment "kubernetes service nodeports; NOTE: this must be the last rule in this chain" -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
-A KUBE-SVC-3MN7Q5WEBLVAXORV ! -s 10.200.0.0/16 -d 10.32.0.10/32 -p udp -m comment --comment "kube-system/coredns-coredns:udp-53 cluster IP" -m udp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SVC-3MN7Q5WEBLVAXORV -m comment --comment "kube-system/coredns-coredns:udp-53 -> 10.200.0.3:53" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-ZY5Q3ULAQ5ZYZJLS
-A KUBE-SVC-3MN7Q5WEBLVAXORV -m comment --comment "kube-system/coredns-coredns:udp-53 -> 10.200.2.18:53" -j KUBE-SEP-IGILC3MHHXCPPD2V
-A KUBE-SVC-NPX46M4PTMTKRN6Y ! -s 10.200.0.0/16 -d 10.32.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https -> 192.168.64.10:6443" -m statistic --mode random --probability 0.33333333349 -j KUBE-SEP-VVBZLDDCGYIIOLML
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https -> 192.168.64.11:6443" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-2PLY6KCKFADRAI56
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https -> 192.168.64.12:6443" -j KUBE-SEP-TVM3X65DZPREBP7U
-A KUBE-SVC-S3NG4EFDCNWS3YQS ! -s 10.200.0.0/16 -d 10.32.0.10/32 -p tcp -m comment --comment "kube-system/coredns-coredns:tcp-53 cluster IP" -m tcp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SVC-S3NG4EFDCNWS3YQS -m comment --comment "kube-system/coredns-coredns:tcp-53 -> 10.200.0.3:53" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-EA76CHRYFR6YRKN6
-A KUBE-SVC-S3NG4EFDCNWS3YQS -m comment --comment "kube-system/coredns-coredns:tcp-53 -> 10.200.2.18:53" -j KUBE-SEP-EVM2BZXZKR6FG27U
COMMIT
# Completed on Thu Jul 20 08:50:55 2023
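
For reading the dump: the UDP side of the DNS service is handled by the KUBE-SVC-3MN7Q5WEBLVAXORV chain, which picks one of the two endpoint chains using the statistic match (the first rule fires with probability 0.5, the second catches whatever falls through):

# the two rules implementing the 50/50 endpoint choice (copied from the dump)
-A KUBE-SVC-3MN7Q5WEBLVAXORV ... --probability 0.50000000000 -j KUBE-SEP-ZY5Q3ULAQ5ZYZJLS   # DNAT to 10.200.0.3:53
-A KUBE-SVC-3MN7Q5WEBLVAXORV ... -j KUBE-SEP-IGILC3MHHXCPPD2V                                # DNAT to 10.200.2.18:53

Note that there is no explicit rule for the reverse direction: rewriting the reply's source address back to 10.32.0.10 is done by conntrack when the reply packet traverses netfilter.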

Answer 1

The solution was to run:

modprobe br_netfilter

on all worker nodes.
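
To make the fix persist across reboots, something along these lines should work (standard modules-load.d and sysctl.d drop-ins; the file names are my own choice):

$ echo br_netfilter | sudo tee /etc/modules-load.d/br_netfilter.conf
$ echo 'net.bridge.bridge-nf-call-iptables = 1' | sudo tee /etc/sysctl.d/99-bridge-nf.conf
$ sudo sysctl --system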

Why? Pods on the same node are attached to the same bridge interface, which means they are connected at layer 2. Consequently, any traffic between two pods on the same node does not pass through iptables at all.

This behavior makes sense in general: if the pods are connected at layer 2, there is no reason to route the traffic through layer 3. However, it is apparently not always the default behavior in Linux, and it is commonly assumed otherwise. Loading the br_netfilter module forces Linux to pass all bridged traffic through iptables. (More precisely, br_netfilter exposes the net.bridge.bridge-nf-call-iptables sysctl, which defaults to 1 and makes bridged IPv4 packets traverse the iptables chains, including the nat table that rewrites the reply's source address.)
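
To verify the effect on a worker node (the sysctl only exists while the module is loaded):

$ lsmod | grep br_netfilter
$ sysctl net.bridge.bridge-nf-call-iptables
net.bridge.bridge-nf-call-iptables = 1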

More information here.
