ping: Name or service not known


Problem

ping: service.sys-dev.company.com: Name or service not known

host, nslookup, and dig all work:

# host service.sys-dev.company.com
traefik-proxy.ingresscontrollers.svc.cluster.local has address 10.0.60.226

Details

CoreDNS zone

sys-dev.company.com:53 {
errors
log
ready
health
rewrite name regex (.*)\.sys-dev\.company\.com traefik-proxy.ingresscontrollers.svc.cluster.local
kubernetes cluster.local in-addr.arpa ip6.arpa {
  pods insecure
  fallthrough in-addr.arpa ip6.arpa
}
prometheus :9153
forward . /etc/resolv.conf
cache 30
loop
reload
loadbalance
}

# dig @10.0.0.10 service.sys-dev.company.com

; <<>> DiG 9.18.19-1~deb12u1-Debian <<>> @10.0.0.10 service.sys-dev.company.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 27146
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: 9ea30e2a15ac33b7 (echoed)
;; QUESTION SECTION:
;service.sys-dev.company.com. IN        A

;; ANSWER SECTION:
traefik-proxy.ingresscontrollers.svc.cluster.local. 5 IN A 10.0.60.226

;; Query time: 3 msec
;; SERVER: 10.0.0.10#53(10.0.0.10) (UDP)
;; WHEN: Mon Jan 08 14:15:48 UTC 2024
;; MSG SIZE  rcvd: 139
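
The dig output above shows the effect of the rewrite rule: the question still names service.sys-dev.company.com, but the answer record is owned by the rewrite target. Below is a minimal Python sketch of that behavior. It is a simplified model, not CoreDNS's implementation; rewrite_query, answer_owner, and the restore_original flag are hypothetical names used only for illustration.

```python
import re

# Pattern and target copied from the Corefile rewrite rule above.
PATTERN = re.compile(r"(.*)\.sys-dev\.company\.com")
TARGET = "traefik-proxy.ingresscontrollers.svc.cluster.local"


def rewrite_query(qname):
    """Return the name that is actually looked up after the rewrite."""
    return TARGET if PATTERN.search(qname) else qname


def answer_owner(qname, restore_original):
    """Return the owner name the client sees on the answer record.

    restore_original=False models a request-only rewrite (answer keeps
    the rewritten name); True models an additional response rewrite
    that puts the original query name back on the answer.
    """
    rewritten = rewrite_query(qname)
    if restore_original and rewritten != qname:
        return qname
    return rewritten
```

With restore_original=False the function returns the traefik-proxy name for service.sys-dev.company.com, which corresponds to the ANSWER SECTION shown by dig.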

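I found a difference in the strace output that I hope someone can explain. I have two containers, and this command behaves differently in each of them.

  • The settings in /etc/nsswitch.conf and /etc/resolv.conf are identical in both containers.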

/etc/nsswitch.conf

passwd:         files
group:          files
shadow:         files
gshadow:        files

hosts:          files dns
networks:       files

protocols:      db files
services:       db files
ethers:         db files
rpc:            db files

netgroup:       nis

/etc/resolv.conf

search xxx.svc.cluster.local svc.cluster.local cluster.local m3ked1pts4futor0jcy3sg2y4d.ax.internal.cloudapp.net
nameserver 10.0.0.10
options ndots:5
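
As a side note on the resolv.conf above: with ndots:5, a name that contains fewer than five dots is first tried with each search suffix appended and only then as written. Below is a minimal Python sketch of that ordering, assuming glibc-style resolver behavior; candidate_names is a hypothetical helper, not a library function.

```python
def candidate_names(name, search, ndots):
    """List the fully qualified names a stub resolver would try, in order.

    Assumption: glibc-style handling of 'search' and 'options ndots'.
    """
    if name.endswith("."):
        # A trailing dot marks the name as absolute; no search list is used.
        return [name]
    absolute = name + "."
    expanded = [f"{name}.{domain}." for domain in search]
    if name.count(".") >= ndots:
        # Enough dots: try the name as written first, then the search list.
        return [absolute] + expanded
    # Too few dots: try the search list first, then the name as written.
    return expanded + [absolute]
```

For service.sys-dev.company.com (three dots) and the search list above, the name as written is the last candidate to be queried.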

OK

LSB: Ubuntu 22.04.3 LTS

/etc/hosts

127.0.0.1   localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
10.142.82.26    dotnet8

Full trace

Here you can see that the last connect call already contains the resolved IP:

3095  connect(5, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.0.0.10")}, 16) = 0
3095  poll([{fd=5, events=POLLOUT}], 1, 0) = 1 ([{fd=5, revents=POLLOUT}])
3095  poll([{fd=5, events=POLLIN}], 1, 5000) = 1 ([{fd=5, revents=POLLIN}])
3095  recvfrom(5, "\273y\205\0\0\1\0\0\0\1\0\0\vservice\tsys-dev\6company\3com\0\0\34\0\1\7cluster\5local\0\0\6\0\1\0\0\0\5\0D\2ns\3dns\7cluster\5local\0\nhostmaster\7cluster\5local\0e\234\0\30\0\0\34 \0\0\7\10\0\1Q\200\0\0\0\5", 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.0.0.10")}, [28 => 16]) = 143
3095  poll([{fd=5, events=POLLIN}], 1, 4997) = 1 ([{fd=5, revents=POLLIN}])
3095  recvfrom(5, "o\7\205\0\0\1\0\1\0\0\0\0\vservice\tsys-dev\6company\3com\0\0\1\0\1\rtraefik-proxy\22ingresscontrollers\3svc\7cluster\5local\0\0\1\0\1\0\0\0\5\0\4\n\0<\342", 65536, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.0.0.10")}, [28 => 16]) = 116
3095  connect(5, {sa_family=AF_INET, sin_port=htons(1025), sin_addr=inet_addr("10.0.60.226")}, 16) = 0
PING traefik-proxy.ingresscontrollers.svc.cluster.local (10.0.60.226) 56(84) bytes of data.

NOK

LSB: Debian GNU/Linux 12

/etc/hosts

127.0.0.1   localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
10.142.83.141   dotnet8-test

Full trace

1153  connect(5, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.0.0.10")}, 16) = 0
1153  poll([{fd=5, events=POLLOUT}], 1, 0) = 1 ([{fd=5, revents=POLLOUT}])
1153  poll([{fd=5, events=POLLIN}], 1, 5000) = 1 ([{fd=5, revents=POLLIN}])
1153  recvfrom(5, "\211\226\205\0\0\1\0\0\0\1\0\0\vservice\tsys-dev\6company\3com\0\0\34\0\1\7cluster\5local\0\0\6\0\1\0\0\0\5\0D\2ns\3dns\7cluster\5local\0\nhostmaster\7cluster\5local\0e\233\377\316\0\0\34 \0\0\7\10\0\1Q\200\0\0\0\5", 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.0.0.10")}, [28 => 16]) = 143
1153  poll([{fd=5, events=POLLIN}], 1, 4997) = 1 ([{fd=5, revents=POLLIN}])
1153  recvfrom(5, "|\227\205\0\0\1\0\1\0\0\0\0\vservice\tsys-dev\6company\3com\0\0\1\0\1\rtraefik-proxy\22ingresscontrollers\3svc\7cluster\5local\0\0\1\0\1\0\0\0\5\0\4\n\0<\342", 65536, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.0.0.10")}, [28 => 16]) = 116

After the last recvfrom call there is only:

1186  close(5)                          = 0
1186  write(2, "ping: ", 6)             = 6
1186  write(2, "service.sys-dev.company.com: Name or service not known", 59) = 59
1186  write(2, "\n", 1)                 = 1
1186  close(1)                          = 0
1186  close(2)                          = 0
1186  exit_group(2)                     = ?
1186  +++ exited with 2 +++
ping: service.sys-dev.company.com: Name or service not known
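
ping resolves names through the C library's getaddrinfo rather than by querying the DNS server directly the way dig does, and "Name or service not known" is the message for its EAI_NONAME error. Below is a minimal Python sketch for reproducing the lookup inside each container without ping. resolve_ipv4 is a hypothetical helper, and its resolver parameter exists only so that the function can be exercised without a network.

```python
import socket


def resolve_ipv4(name, resolver=socket.getaddrinfo):
    """Resolve name through the system resolver (NSS, resolv.conf).

    Returns (addresses, None) on success or (None, error_text) when the
    lookup fails, for example with EAI_NONAME.
    """
    try:
        infos = resolver(name, None, socket.AF_INET, socket.SOCK_DGRAM)
    except socket.gaierror as exc:
        return None, str(exc)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IPv4 address as a string.
    return sorted({info[4][0] for info in infos}), None
```

Calling resolve_ipv4("service.sys-dev.company.com") in both containers should show whether the failure is specific to ping or comes from the resolver library itself.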

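I am very interested in the reason for the difference in behavior and would be glad if someone could clarify it or point me in the right direction. Thank you for your time.

Answer 1

The problem is caused by the mismatch between the ANSWER and QUESTION sections. With CoreDNS, the following modification is needed: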

https://github.com/coredns/coredns/discussions/6460

rewrite stop {
    name regex service.sys-dev.company.com traefik-proxy.ingresscontrollers.svc.cluster.local
    answer auto
}

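Explanation: https://github.com/coredns/coredns/tree/master/plugin/rewrite#response-rewrites

When rewriting the name of an incoming DNS request (field name), CoreDNS rewrites the QUESTION SECTION of the request. It may also be necessary to rewrite the ANSWER SECTION of the response, because some DNS resolvers treat a mismatch between the QUESTION SECTION and the ANSWER SECTION as a man-in-the-middle (MITM) attack.

Testing showed that for curl/dotnet on Debian/Alpine, the two sections must match for resolution to work. When the ANSWER and QUESTION sections differ, the resolver evaluates the response as negative, that is, as a nonexistent record.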
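
The check described above can be sketched in Python. This is a simplified illustration, not glibc's or musl's actual code: it builds a response like the one in the traces (flags qr aa rd, one A record) and tests whether any answer record is owned by the queried name. Real resolvers also follow CNAME chains, which this sketch ignores; encode_name, read_name, build_response, and answer_matches_question are hypothetical helpers.

```python
import struct


def encode_name(name):
    """Encode a dotted name as uncompressed DNS labels."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def read_name(msg, off):
    """Decode the name at offset off; return (name, offset after it)."""
    labels, end = [], None
    while True:
        length = msg[off]
        if length & 0xC0 == 0xC0:  # compression pointer
            if end is None:
                end = off + 2
            off = struct.unpack_from("!H", msg, off)[0] & 0x3FFF
            continue
        off += 1
        if length == 0:
            break
        labels.append(msg[off:off + length].decode("ascii"))
        off += length
    return ".".join(labels).lower(), (off if end is None else end)


def build_response(qname, owner, ip):
    """Build a response with one question and one A record."""
    header = struct.pack("!HHHHHH", 0x1234, 0x8500, 1, 1, 0, 0)
    question = encode_name(qname) + struct.pack("!HH", 1, 1)
    rdata = bytes(int(part) for part in ip.split("."))
    answer = encode_name(owner) + struct.pack("!HHIH", 1, 1, 5, len(rdata)) + rdata
    return header + question + answer


def answer_matches_question(msg):
    """Return True if at least one answer record is owned by the queried name."""
    ancount = struct.unpack_from("!H", msg, 6)[0]
    qname, off = read_name(msg, 12)
    off += 4  # skip QTYPE and QCLASS
    for _ in range(ancount):
        owner, off = read_name(msg, off)
        rdlen = struct.unpack_from("!HHIH", msg, off)[3]
        off += 10 + rdlen
        if owner == qname:
            return True
    return False
```

A response built with the traefik-proxy name as the answer owner fails this check, while one whose answer owner equals the queried name passes. That is the difference the answer rewrite is meant to remove.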
