Microk8s 集群上的 DNS 解析问题

Microk8s 集群上的 DNS 解析问题

我的 microk8s 集群在 centos8 vm 上运行,我的 pod 的 DNS 解析存在一些问题。名称服务器位于,x.x.x.101并且x.x.x.100都可以从 pod 内部 ping 通,我也可以 ping8.8.8.8

pod 内部的 nslookup 如下所示:

root@debug-7857894f66-mnklp:/# nslookup kubernetes.default
Server:         10.152.183.10
Address:        10.152.183.10#53

Name:   kubernetes.default.svc.cluster.local
Address: 10.152.183.1

coredns 配置如下所示:

apiVersion: v1
data:
  Corefile: ".:53 {\n    errors\n    health {\n      lameduck 5s\n    }\n    ready\n
    \   log . {\n      class error\n    }\n    kubernetes cluster.local in-addr.arpa
    ip6.arpa {\n      pods insecure\n      fallthrough in-addr.arpa ip6.arpa\n    }\n
    \   prometheus :9153\n    forward .  x.x.x.101 x.x.x.100 \n    cache 30\n
    \   loop\n    reload\n    loadbalance\n}\n"
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"Corefile":".:53 {\n    errors\n    health {\n      lameduck 5s\n    }\n    ready\n    log . {\n      class error\n    }\n    kubernetes cluster.local in-addr.arpa ip6.arpa {\n      pods insecure\n      fallthrough in-addr.arpa ip6.arpa\n    }\n    prometheus :9153\n    forward .  x.x.x.101 x.x.x.100 \n    cache 30\n    loop\n    reload\n    loadbalance\n}\n"},"kind":"ConfigMap","metadata":{"annotations":{},"labels":{"addonmanager.kubernetes.io/mode":"EnsureExists","k8s-app":"kube-dns"},"name":"coredns","namespace":"kube-system"}}
  creationTimestamp: "2021-08-31T08:57:27Z"
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
    k8s-app: kube-dns
  name: coredns
  namespace: kube-system
  resourceVersion: "2420090"
  selfLink: /api/v1/namespaces/kube-system/configmaps/coredns
  uid: 471b258a-253d-4b51-aaf7-7e934ab300d1

我的豆荚/etc/resolv.conf看起来像这样:

search default.svc.cluster.local svc.cluster.local cluster.local xxx.xxxxx
nameserver 10.152.183.10
options ndots:5

当我查看 kube-dns 的日志时,$ microk8s kubectl logs --namespace=kube-system -l k8s-app=kube-dns我收到以下响应:

[INFO] 10.1.107.105:47549 - 5288 "AAAA IN www.google.com. udp 36 false 512" NOERROR - 0 0.000256103s
[ERROR] plugin/errors: 2 www.google.com. AAAA: read udp 10.1.107.127:51486->x.x.x.101:53: read: no route to host

DNS 服务已启动:

$ microk8s kubectl get svc --namespace=kube-system
NAME                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                  AGE
[...]
kube-dns                    ClusterIP   10.152.183.10    <none>        53/UDP,53/TCP,9153/TCP   21d

DNS 端点暴露:

$ microk8s kubectl get endpoints kube-dns --namespace=kube-system
NAME       ENDPOINTS                                           AGE
kube-dns   10.1.107.127:53,10.1.107.127:53,10.1.107.127:9153   21d

集群运行的虚拟机使用相同的名称服务器,没有任何问题。这是我需要打扰管理员的事情吗,还是我遗漏了某些错误配置的内容?

编辑:

好吧,我又尝试了一下,发现有些奇怪的行为,我认为这与我错误地创建coredns配置文件有关。我完全重置了集群并再次尝试,当我使用 8.8.8.8 或 8.8.4.4 作为 DNS 服务器时仍然出现同样的错误。但是,当我运行以下命令时:$ microk8s enable dns:x.x.x.101,x.x.x.100它终于起作用了。然后我尝试使用 进行配置,$ microk8s kubectl -n kube-system edit configmap/coredns并将两个 DNS 添加到配置中,它停止工作,我再次收到错误:[ERROR] plugin/errors: 2 www.google.com. AAAA: read udp 10.1.107.127:51486->x.x.x.101:53: read: no route to host那么我的配置有什么问题,它是使用 自动正确设置的$ microk8s enable dns:x.x.x.101,x.x.x.100

第二次编辑

我尝试在这里使用 dig,即使我指定了 DNS 服务器,它也无法正常工作。ping 可以通过,但 DNS 被系统阻止,这有什么原因吗?它只发生在microk8s cluster主机系统运行 fin docker 运行良好...以下是打印输出:这是来自 pod 内部的:

root@debug-865cb7fb4-wfhw4:/# dig www.google.com

; <<>> DiG 9.11.5-P4-5.1+deb10u5-Debian <<>> www.google.com
;; global options: +cmd
;; connection timed out; no servers could be reached
root@debug-865cb7fb4-wfhw4:/# dig @x.x.x.101 www.google.com

; <<>> DiG 9.11.5-P4-5.1+deb10u5-Debian <<>> @x.x.x.101 www.google.com
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached
root@debug-865cb7fb4-wfhw4:/# dig @8.8.8.8 www.google.com

; <<>> DiG 9.11.5-P4-5.1+deb10u5-Debian <<>> @8.8.8.8 www.google.com
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached
root@debug-865cb7fb4-wfhw4:/# dig @x.x.x.100 www.google.com

; <<>> DiG 9.11.5-P4-5.1+deb10u5-Debian <<>> @x.x.x.100 www.google.com
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached

这是来自主机系统的信息:

$ dig www.google.com

; <<>> DiG 9.11.20-RedHat-9.11.20-5.el8_3.1 <<>> www.google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 25735
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4000
;; QUESTION SECTION:
;www.google.com.                        IN      A

;; ANSWER SECTION:
www.google.com.         113     IN      A       142.250.185.228

;; Query time: 0 msec
;; SERVER: x.x.x.101#53(x.x.x.101)
;; WHEN: Fri Oct 08 15:10:21 CEST 2021
;; MSG SIZE  rcvd: 59



$ dig @8.8.8.8 www.google.com

; <<>> DiG 9.11.20-RedHat-9.11.20-5.el8_3.1 <<>> @8.8.8.8 www.google.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 3924
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;www.google.com.                        IN      A

;; ANSWER SECTION:
www.google.com.         300     IN      A       142.250.185.228

;; Query time: 34 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Fri Oct 08 15:10:49 CEST 2021
;; MSG SIZE  rcvd: 59

$ dig @x.x.x.101 www.google.com

; <<>> DiG 9.11.20-RedHat-9.11.20-5.el8_3.1 <<>> @x.x.x.101 www.google.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 60305
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4000
;; QUESTION SECTION:
;www.google.com.                        IN      A

;; ANSWER SECTION:
www.google.com.         70      IN      A       142.250.185.228

;; Query time: 0 msec
;; SERVER: x.x.x.101#53(x.x.x.101)
;; WHEN: Fri Oct 08 15:11:04 CEST 2021
;; MSG SIZE  rcvd: 59

我不知道发生了什么事......

答案1

为了提高可见性,我发布了一个社区维基答案。塔戈尔在评论中提到,问题已解决并且与外部 DNS 有关:

我在完全由我控制的、DNS正常工作的基础设施中重建了集群。

尝试禁用主机上的 iptables 和防火墙,看看是否可以通过 coredns 配置来执行此操作。

有关 DNS 的更多信息,请参阅官方文档

相关内容