It looks as if the resolv.conf option use-vc is being ignored on the Amazon AMI (latest version, 2016.09). Consider the following:
[hadoop@ip-172-20-40-202 ~]$ cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local ec2.internal
options use-vc ndots:5 timeout:2 attempts:5
nameserver 172.20.53.184
nameserver 172.20.0.2
If I use nslookup interactively and force TCP with set vc, the queries work exactly as expected:
[hadoop@ip-172-20-40-202 ~]$ nslookup
> set vc
> kafka.default.svc.cluster.local
;; Got recursion not available from 172.20.53.184, trying next server
;; Got recursion not available from 172.20.53.184, trying next server
;; Got recursion not available from 172.20.53.184, trying next server
Server: 172.20.53.184
Address: 172.20.53.184#53
Name: kafka.default.svc.cluster.local
Address: 100.96.14.2
Name: kafka.default.svc.cluster.local
Address: 100.96.7.2
Name: kafka.default.svc.cluster.local
Address: 100.96.13.2
> kafka
Server: 172.20.53.184
Address: 172.20.53.184#53
Name: kafka.default.svc.cluster.local
Address: 100.96.14.2
Name: kafka.default.svc.cluster.local
Address: 100.96.7.2
Name: kafka.default.svc.cluster.local
Address: 100.96.13.2
> exit
However, left to its own devices, nslookup fails:
[hadoop@ip-172-20-40-202 ~]$ nslookup kafka.default.svc.cluster.local
Server: 172.20.0.2
Address: 172.20.0.2#53
** server can't find kafka.default.svc.cluster.local: NXDOMAIN
The same goes for dig. Forcing TCP works as expected:
[hadoop@ip-172-20-40-202 ~]$ dig +vc kafka.default.svc.cluster.local
; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.47.rc1.52.amzn1 <<>> +vc kafka.default.svc.cluster.local
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 55634
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;kafka.default.svc.cluster.local. IN A
;; ANSWER SECTION:
kafka.default.svc.cluster.local. 30 IN A 100.96.13.2
kafka.default.svc.cluster.local. 30 IN A 100.96.14.2
kafka.default.svc.cluster.local. 30 IN A 100.96.7.2
;; Query time: 2 msec
;; SERVER: 172.20.53.184#53(172.20.53.184)
;; WHEN: Thu Mar 16 20:45:06 2017
;; MSG SIZE rcvd: 97
Without forcing TCP, it fails:
[hadoop@ip-172-20-40-202 ~]$ dig kafka.default.svc.cluster.local
; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.47.rc1.52.amzn1 <<>> kafka.default.svc.cluster.local
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 9580
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
;; QUESTION SECTION:
;kafka.default.svc.cluster.local. IN A
;; AUTHORITY SECTION:
. 52 IN SOA a.root-servers.net. nstld.verisign-grs.com. 2017031602 1800 900 604800 86400
;; Query time: 0 msec
;; SERVER: 172.20.0.2#53(172.20.0.2)
;; WHEN: Thu Mar 16 20:44:58 2017
;; MSG SIZE rcvd: 124
It looks as if use-vc in the line options use-vc ndots:5 timeout:2 attempts:5 is being ignored.
How do I configure this correctly to force TCP for all DNS queries? man resolv.conf says it should work!
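One thing worth ruling out first is a typo or misplaced keyword in the file itself. Here is a minimal sketch that parses resolv.conf-style text and reports which options are set. The function name parse_resolv_conf and the SAMPLE string are made up for illustration, and the parsing is a simplification that only approximates what the real resolver does.

```python
def parse_resolv_conf(text):
    """Parse resolv.conf-style text into nameservers, search domains and options.

    Simplified: this is an illustrative approximation, not the resolver's parser.
    """
    conf = {"nameserver": [], "search": [], "options": {}}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines (starting with '#' or ';').
        if not line or line[0] in "#;":
            continue
        keyword, *values = line.split()
        if keyword == "nameserver" and values:
            conf["nameserver"].append(values[0])
        elif keyword == "search":
            conf["search"] = values  # a later search line replaces an earlier one
        elif keyword == "options":
            for opt in values:
                # "ndots:5" -> ("ndots", 5); a bare flag like "use-vc" -> True
                name, _, value = opt.partition(":")
                conf["options"][name] = int(value) if value.isdigit() else True
    return conf


# The file contents shown in the question.
SAMPLE = """\
search default.svc.cluster.local svc.cluster.local cluster.local ec2.internal
options use-vc ndots:5 timeout:2 attempts:5
nameserver 172.20.53.184
nameserver 172.20.0.2
"""

conf = parse_resolv_conf(SAMPLE)
print("use-vc set:", conf["options"].get("use-vc", False))
print("nameservers:", conf["nameserver"])
```

For the file above this reports use-vc as set, so the option is at least spelled and placed correctly.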
Answer 1
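It looks like the diagnostic tools nslookup and dig misled me. Both send their queries with their own resolver code instead of going through the C library's resolver, so they are not a reliable test of how resolv.conf options apply to ordinary applications.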
When I use getent, I find that names are in fact resolved correctly and that the use-vc option in /etc/resolv.conf is honored:
[hadoop@ip-172-20-40-202 ~]$ getent ahosts kafka.default.svc.cluster.local
100.96.13.2 STREAM kafka.default.svc.cluster.local
100.96.13.2 DGRAM
100.96.13.2 RAW
100.96.14.2 STREAM
100.96.14.2 DGRAM
100.96.14.2 RAW
100.96.7.2 STREAM
100.96.7.2 DGRAM
100.96.7.2 RAW
[hadoop@ip-172-20-40-202 ~]$ getent hosts kafka.default.svc.cluster.local
100.96.13.2 kafka.default.svc.cluster.local
100.96.14.2 kafka.default.svc.cluster.local
100.96.7.2 kafka.default.svc.cluster.local
If I remove the use-vc option from /etc/resolv.conf, getent fails as expected.
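The same check can be made from application code. A minimal sketch, assuming CPython on a glibc-based system: socket.getaddrinfo calls the C library's getaddrinfo(), which is also what getent ahosts uses, so it should follow the same resolver path (and the same resolv.conf options) rather than dig's own code. The helper name resolve_via_libc is hypothetical.

```python
import socket


def resolve_via_libc(hostname):
    """Return the unique IP addresses that getaddrinfo() reports for hostname.

    Raises socket.gaierror if the name cannot be resolved.
    """
    addresses = []
    # getaddrinfo returns one entry per (family, socket type) combination,
    # so the same address usually appears several times; keep first occurrences.
    for _family, _socktype, _proto, _canonname, sockaddr in socket.getaddrinfo(hostname, None):
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses


# Example call (the result depends on the host's resolver configuration):
#   resolve_via_libc("kafka.default.svc.cluster.local")
```

If this raises socket.gaierror with use-vc removed and returns addresses with it present, that would match what getent shows.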
Who knew, right?