Problem
ping: service.sys-dev.company.com: Name or service not known
host, nslookup and dig work:
# host service.sys-dev.company.com
traefik-proxy.ingresscontrollers.svc.cluster.local has address 10.0.60.226
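One thing worth keeping in mind when comparing these tools: ping resolves names through the C library (getaddrinfo, driven by /etc/nsswitch.conf), while host, nslookup and dig carry their own DNS client code and simply print what the server returned. A minimal Python sketch for reproducing the libc path in each container is below. Python's socket.getaddrinfo wraps the same libc call; the helper name resolve_via_libc is made up for illustration.

```python
import socket


def resolve_via_libc(hostname):
    """Return the sorted unique addresses getaddrinfo yields, or None on failure.

    resolve_via_libc is a hypothetical helper name, not part of any library.
    """
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # The same class of failure that ping reports as
        # "Name or service not known".
        return None
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual address.
    return sorted({info[4][0] for info in infos})
```

Running resolve_via_libc("service.sys-dev.company.com") in both containers should show whether the failure is specific to ping or shared by everything that resolves through libc.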
Details
CoreDNS zone
sys-dev.company.com:53 {
    errors
    log
    ready
    health
    rewrite name regex (.*)\.sys-dev\.company\.com traefik-proxy.ingresscontrollers.svc.cluster.local
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    forward . /etc/resolv.conf
    cache 30
    loop
    reload
    loadbalance
}
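To make the effect of the rewrite line concrete, here is a rough Python sketch of the name mapping it performs. It only models the name substitution, under the assumption that the pattern is searched for anywhere in the query name; it is not CoreDNS code, and rewrite_name is a hypothetical helper.

```python
import re

# Pattern and target copied from the rewrite line in the zone above.
PATTERN = re.compile(r"(.*)\.sys-dev\.company\.com")
TARGET = "traefik-proxy.ingresscontrollers.svc.cluster.local"


def rewrite_name(qname):
    """Return the name that would be looked up for a given query name.

    Assumption: unanchored search semantics; trailing dot and case are
    normalised first to keep the illustration simple.
    """
    name = qname.rstrip(".").lower()
    if PATTERN.search(name):
        return TARGET
    return name
```

With this rule alone, the record that comes back is owned by the rewritten name rather than by the name that was asked for.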
# dig @10.0.0.10 service.sys-dev.company.com
; <<>> DiG 9.18.19-1~deb12u1-Debian <<>> @10.0.0.10 service.sys-dev.company.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 27146
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: 9ea30e2a15ac33b7 (echoed)
;; QUESTION SECTION:
;service.sys-dev.company.com. IN A
;; ANSWER SECTION:
traefik-proxy.ingresscontrollers.svc.cluster.local. 5 IN A 10.0.60.226
;; Query time: 3 msec
;; SERVER: 10.0.0.10#53(10.0.0.10) (UDP)
;; WHEN: Mon Jan 08 14:15:48 UTC 2024
;; MSG SIZE rcvd: 139
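I found a difference in the strace output that I hope someone can explain. I have two containers, and this command behaves differently in them. The /etc/nsswitch.conf and resolv.conf settings are identical in both.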
/etc/nsswitch.conf
passwd: files
group: files
shadow: files
gshadow: files
hosts: files dns
networks: files
protocols: db files
services: db files
ethers: db files
rpc: db files
netgroup: nis
/etc/resolv.conf
search xxx.svc.cluster.local svc.cluster.local cluster.local m3ked1pts4futor0jcy3sg2y4d.ax.internal.cloudapp.net
nameserver 10.0.0.10
options ndots:5
OK (working container)
LSB: Ubuntu 22.04.3 LTS
/etc/hosts
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
10.142.82.26 dotnet8
Here you can see that the last connect call already contains the resolved IP:
3095 connect(5, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.0.0.10")}, 16) = 0
3095 poll([{fd=5, events=POLLOUT}], 1, 0) = 1 ([{fd=5, revents=POLLOUT}])
3095 poll([{fd=5, events=POLLIN}], 1, 5000) = 1 ([{fd=5, revents=POLLIN}])
3095 recvfrom(5, "\273y\205\0\0\1\0\0\0\1\0\0\vservice\tsys-dev\6company\3com\0\0\34\0\1\7cluster\5local\0\0\6\0\1\0\0\0\5\0D\2ns\3dns\7cluster\5local\0\nhostmaster\7cluster\5local\0e\234\0\30\0\0\34 \0\0\7\10\0\1Q\200\0\0\0\5", 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.0.0.10")}, [28 => 16]) = 143
3095 poll([{fd=5, events=POLLIN}], 1, 4997) = 1 ([{fd=5, revents=POLLIN}])
3095 recvfrom(5, "o\7\205\0\0\1\0\1\0\0\0\0\vservice\tsys-dev\6company\3com\0\0\1\0\1\rtraefik-proxy\22ingresscontrollers\3svc\7cluster\5local\0\0\1\0\1\0\0\0\5\0\4\n\0<\342", 65536, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.0.0.10")}, [28 => 16]) = 116
3095 connect(5, {sa_family=AF_INET, sin_port=htons(1025), sin_addr=inet_addr("10.0.60.226")}, 16) = 0
PING traefik-proxy.ingresscontrollers.svc.cluster.local (10.0.60.226) 56(84) bytes of data.
NOK (failing container)
LSB: Debian GNU/Linux 12
/etc/hosts
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
10.142.83.141 dotnet8-test
1153 connect(5, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.0.0.10")}, 16) = 0
1153 poll([{fd=5, events=POLLOUT}], 1, 0) = 1 ([{fd=5, revents=POLLOUT}])
1153 poll([{fd=5, events=POLLIN}], 1, 5000) = 1 ([{fd=5, revents=POLLIN}])
1153 recvfrom(5, "\211\226\205\0\0\1\0\0\0\1\0\0\vservice\tsys-dev\6company\3com\0\0\34\0\1\7cluster\5local\0\0\6\0\1\0\0\0\5\0D\2ns\3dns\7cluster\5local\0\nhostmaster\7cluster\5local\0e\233\377\316\0\0\34 \0\0\7\10\0\1Q\200\0\0\0\5", 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.0.0.10")}, [28 => 16]) = 143
1153 poll([{fd=5, events=POLLIN}], 1, 4997) = 1 ([{fd=5, revents=POLLIN}])
1153 recvfrom(5, "|\227\205\0\0\1\0\1\0\0\0\0\vservice\tsys-dev\6company\3com\0\0\1\0\1\rtraefik-proxy\22ingresscontrollers\3svc\7cluster\5local\0\0\1\0\1\0\0\0\5\0\4\n\0<\342", 65536, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.0.0.10")}, [28 => 16]) = 116
After the last recvfrom call, there is only:
1186 close(5) = 0
1186 write(2, "ping: ", 6) = 6
1186 write(2, "service.sys-dev.company.com: Name or service not known", 59) = 59
1186 write(2, "\n", 1) = 1
1186 close(1) = 0
1186 close(2) = 0
1186 exit_group(2) = ?
1186 +++ exited with 2 +++
ping: service.sys-dev.company.com: Name or service not known
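I am very interested in the reason for this difference in behaviour and would be glad if someone could clarify it or point me in the right direction. Thank you for your time.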
Answer 1
The issue is caused by the mismatch between the ANSWER and QUESTION sections. The following modification is needed when using CoreDNS:
https://github.com/coredns/coredns/discussions/6460
rewrite stop {
    name regex service.sys-dev.company.com traefik-proxy.ingresscontrollers.svc.cluster.local
    answer auto
}
Explanation: https://github.com/coredns/coredns/tree/master/plugin/rewrite#response-rewrites
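When rewriting the name of an incoming DNS request (the name field), CoreDNS rewrites the QUESTION SECTION of the request. It may also be necessary to rewrite the ANSWER SECTION of the response, because some DNS resolvers treat a mismatch between the question section and the answer section as a man-in-the-middle (MITM) attack.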
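Testing shows that curl/dotnet on Debian/Alpine need these sections to match in order to work. When the ANSWER and QUESTION sections differ, the resolver treats the response as negative, i.e. as a non-existent record.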
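The kind of check described above can be sketched in Python. This is a simplified illustration, not the actual glibc or musl logic: it builds an uncompressed DNS response by hand and tests whether any answer record is owned by the name in the question. All helper names are hypothetical, and real resolvers also handle name compression and CNAME chains, which this sketch ignores.

```python
import struct


def encode_name(name):
    """Encode a domain name as length-prefixed labels (no compression)."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def decode_name(msg, pos):
    """Decode an uncompressed name at pos; return (lowercased name, next pos)."""
    labels = []
    while True:
        length = msg[pos]
        pos += 1
        if length == 0:
            break
        labels.append(msg[pos:pos + length].decode("ascii"))
        pos += length
    return ".".join(labels).lower(), pos


def build_response(qname, owner, ip):
    """Build a response with one question and one A record owned by `owner`."""
    # Flags 0x8500 = QR + AA + RD, matching "flags: qr aa rd" in the dig output.
    header = struct.pack("!HHHHHH", 0x1234, 0x8500, 1, 1, 0, 0)
    question = encode_name(qname) + struct.pack("!HH", 1, 1)  # type A, class IN
    rdata = bytes(int(part) for part in ip.split("."))
    answer = encode_name(owner) + struct.pack("!HHIH", 1, 1, 5, len(rdata)) + rdata
    return header + question + answer


def answer_matches_question(msg):
    """Return True if at least one answer record is owned by the question name."""
    _qdcount, ancount = struct.unpack("!HH", msg[4:8])
    qname, pos = decode_name(msg, 12)
    pos += 4  # skip QTYPE and QCLASS
    for _ in range(ancount):
        owner, pos = decode_name(msg, pos)
        _rtype, _rclass, _ttl, rdlen = struct.unpack("!HHIH", msg[pos:pos + 10])
        pos += 10 + rdlen
        if owner == qname:
            return True
    return False
```

A response shaped like the one from the original zone (question for service.sys-dev.company.com, answer owned by traefik-proxy.ingresscontrollers.svc.cluster.local) fails this check, while a response whose answer carries the original name, which is what answer auto is meant to produce, passes it.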