What is the magic incantation to force DNS queries over TCP on an Amazon/AMI image?

It looks as though the use-vc option in resolv.conf is being ignored on the Amazon AMI (latest version, 2016.09). Consider the following:

[hadoop@ip-172-20-40-202 ~]$ cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local ec2.internal
options use-vc ndots:5 timeout:2 attempts:5
nameserver 172.20.53.184
nameserver 172.20.0.2
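
To double-check that use-vc really is present on the options line (rather than lost to a typo or a stray comment), the file can be parsed programmatically. The following is a minimal sketch in Python; the function name parse_resolv_conf and the SAMPLE string are illustrative only, and the parser is deliberately simplified: it treats only lines beginning with # or ; as comments and does not model every keyword that resolv.conf supports.

```python
def parse_resolv_conf(text):
    """Collect nameservers, search domains, and options from resolv.conf text."""
    config = {"nameserver": [], "search": [], "options": {}}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines (starting with '#' or ';')
        if not line or line[0] in "#;":
            continue
        keyword, *values = line.split()
        if keyword == "nameserver":
            config["nameserver"].extend(values[:1])
        elif keyword == "search":
            config["search"] = values  # a later search line replaces an earlier one
        elif keyword == "options":
            for opt in values:
                # Options are either bare flags (use-vc) or name:value pairs (ndots:5)
                name, _, value = opt.partition(":")
                config["options"][name] = int(value) if value.isdigit() else (value or True)
    return config


# Hypothetical sample mirroring the file shown above
SAMPLE = """\
search default.svc.cluster.local svc.cluster.local cluster.local ec2.internal
options use-vc ndots:5 timeout:2 attempts:5
nameserver 172.20.53.184
nameserver 172.20.0.2
"""

if __name__ == "__main__":
    config = parse_resolv_conf(SAMPLE)
    print("use-vc set:", config["options"].get("use-vc", False))
    print("options:", config["options"])
```

On the instance itself, the same function could be pointed at the real file with parse_resolv_conf(open("/etc/resolv.conf").read()).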

If I use nslookup interactively and force TCP with set vc, the query works exactly as expected:

[hadoop@ip-172-20-40-202 ~]$ nslookup
> set vc
> kafka.default.svc.cluster.local
;; Got recursion not available from 172.20.53.184, trying next server
;; Got recursion not available from 172.20.53.184, trying next server
;; Got recursion not available from 172.20.53.184, trying next server
Server:     172.20.53.184
Address:    172.20.53.184#53

Name:   kafka.default.svc.cluster.local
Address: 100.96.14.2
Name:   kafka.default.svc.cluster.local
Address: 100.96.7.2
Name:   kafka.default.svc.cluster.local
Address: 100.96.13.2
> kafka
Server:     172.20.53.184
Address:    172.20.53.184#53

Name:   kafka.default.svc.cluster.local
Address: 100.96.14.2
Name:   kafka.default.svc.cluster.local
Address: 100.96.7.2
Name:   kafka.default.svc.cluster.local
Address: 100.96.13.2
> exit

However, left to its own devices, nslookup fails:

[hadoop@ip-172-20-40-202 ~]$ nslookup kafka.default.svc.cluster.local
Server:     172.20.0.2
Address:    172.20.0.2#53

** server can't find kafka.default.svc.cluster.local: NXDOMAIN

The same goes for dig. Forcing TCP works as expected:

[hadoop@ip-172-20-40-202 ~]$ dig +vc kafka.default.svc.cluster.local

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.47.rc1.52.amzn1 <<>> +vc kafka.default.svc.cluster.local
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 55634
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;kafka.default.svc.cluster.local. IN    A

;; ANSWER SECTION:
kafka.default.svc.cluster.local. 30 IN  A   100.96.13.2
kafka.default.svc.cluster.local. 30 IN  A   100.96.14.2
kafka.default.svc.cluster.local. 30 IN  A   100.96.7.2

;; Query time: 2 msec
;; SERVER: 172.20.53.184#53(172.20.53.184)
;; WHEN: Thu Mar 16 20:45:06 2017
;; MSG SIZE  rcvd: 97

It fails when TCP is not forced:

[hadoop@ip-172-20-40-202 ~]$ dig kafka.default.svc.cluster.local

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.47.rc1.52.amzn1 <<>> kafka.default.svc.cluster.local
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 9580
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;kafka.default.svc.cluster.local. IN    A

;; AUTHORITY SECTION:
.           52  IN  SOA a.root-servers.net. nstld.verisign-grs.com. 2017031602 1800 900 604800 86400

;; Query time: 0 msec
;; SERVER: 172.20.0.2#53(172.20.0.2)
;; WHEN: Thu Mar 16 20:44:58 2017
;; MSG SIZE  rcvd: 124
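
The two dig invocations differ only in the +vc flag, which switches the transport from UDP to TCP. On the wire, the DNS message itself is the same; over TCP it is simply preceded by a two-byte length field (RFC 1035, section 4.2.2). A rough Python sketch of such a query follows. All function names here are hypothetical, the query is hard-coded to an A record with recursion desired, and the reply is returned unparsed, so treat it as an illustration rather than a replacement for dig.

```python
import socket
import struct


def build_query(name, query_id=0x1234):
    """Build a DNS query message for an A record (RFC 1035 wire format)."""
    # Header: id, flags (0x0100 = recursion desired), 1 question, 0 other records
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A, QCLASS 1 = IN
    return header + qname + struct.pack("!HH", 1, 1)


def frame_for_tcp(message):
    """Prefix the message with its two-byte length, as DNS over TCP requires."""
    return struct.pack("!H", len(message)) + message


def recv_exact(sock, count):
    """Read exactly count bytes from a stream socket."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("connection closed before the full reply arrived")
        data += chunk
    return data


def query_over_tcp(server, name, port=53, timeout=2.0):
    """Send one A query over TCP and return the raw reply message."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(frame_for_tcp(build_query(name)))
        (length,) = struct.unpack("!H", recv_exact(sock, 2))
        return recv_exact(sock, length)
```

Run from the instance, query_over_tcp("172.20.53.184", "kafka.default.svc.cluster.local") would return the raw reply bytes; the low four bits of the flags field at reply[2:4] carry the response code (0 for NOERROR, 3 for NXDOMAIN).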

It looks as though use-vc in the line options use-vc ndots:5 timeout:2 attempts:5 is being ignored.

How do I configure this correctly to force TCP for all DNS queries? man resolv.conf says it should work!

Answer 1

It looks like the diagnostic tools nslookup and dig misled me.

When I used getent, I found that the name is indeed resolved correctly and that the use-vc option in /etc/resolv.conf is honored:

[hadoop@ip-172-20-40-202 ~]$ getent ahosts kafka.default.svc.cluster.local
100.96.13.2     STREAM kafka.default.svc.cluster.local
100.96.13.2     DGRAM
100.96.13.2     RAW
100.96.14.2     STREAM
100.96.14.2     DGRAM
100.96.14.2     RAW
100.96.7.2      STREAM
100.96.7.2      DGRAM
100.96.7.2      RAW
[hadoop@ip-172-20-40-202 ~]$ getent hosts kafka.default.svc.cluster.local
100.96.13.2     kafka.default.svc.cluster.local
100.96.14.2     kafka.default.svc.cluster.local
100.96.7.2      kafka.default.svc.cluster.local

If I remove the use-vc option from /etc/resolv.conf, getent fails as expected.
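
A likely explanation is that nslookup and dig ship with BIND and do their own lookups, whereas getent goes through the C library's resolver, which is also the path most applications take. To test what an application would actually see, a lookup through that same path can be scripted. The sketch below uses Python's socket.getaddrinfo, which calls the C library's getaddrinfo; the helper name resolve_like_getent is made up for this example, and it only asks for IPv4 addresses.

```python
import socket


def resolve_like_getent(hostname):
    """Resolve hostname through the system resolver and return unique IPv4 addresses."""
    results = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in results:
        # sockaddr is (address, port) for AF_INET; keep each address once
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses
```

Calling resolve_like_getent("kafka.default.svc.cluster.local") on the instance should succeed or raise socket.gaierror in step with getent, depending on whether use-vc is present in /etc/resolv.conf.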

Who knew, right?
