排除 DNS 解析、丢包和 dig/nslookup 故障

排除 DNS 解析、丢包和 dig/nslookup 故障

我一直在尝试解决使用 curl 查找命令有时需要很长时间而有时又很快的问题。

TL;DR 问题:我如何进一步排除故障。

直接连接到 IP 地址没有问题。

我尝试了两个不同的位置,多台机器(RPi,debian vm,Windows10)。

我尝试将名称服务器设置为 1.1.1.1 和 8.8.8.8,结果相同

我已经更改了 /etc/resolv.conf 以便它包含(是的,我已对其进行了写保护)

options timeout:1

因此现在 dns_resolution 成功不再需要 5 秒,而是 1.00xx 秒(有时),有时是 0.00xxx 秒。

对我来说,它首先看起来像是防火墙丢包问题或 IPv6 尝试问题。但是,我已经在机器上禁用了 IPv6,但没有成功。

我使用这个命令进行查找:

while :; do curl --max-time 10 -w "time_namelookup: %{time_namelookup}, time_connect: %{time_connect}, time_appconnect: %{time_appconnect}, time_pretransfer: %{time_pretransfer}, time_redirect: %{time_redirect}, time_starttransfer: %{time_starttransfer}, time_total: %{time_total}\n" -o /dev/null -s "https://www.google.com"; sleep 1; done

以下是一些示例结果:

time_namelookup: 1.059151, time_connect: 1.083971, time_appconnect: 1.129219, time_pretransfer: 1.129803, time_redirect: 0.000000, time_starttransfer: 1.191138, time_total: 1.198772
time_namelookup: 1.058956, time_connect: 1.084363, time_appconnect: 1.130112, time_pretransfer: 1.131392, time_redirect: 0.000000, time_starttransfer: 1.291702, time_total: 1.300107
time_namelookup: 1.057719, time_connect: 1.083355, time_appconnect: 1.128083, time_pretransfer: 1.128726, time_redirect: 0.000000, time_starttransfer: 1.190859, time_total: 1.198635
time_namelookup: 0.018382, time_connect: 0.043042, time_appconnect: 0.077228, time_pretransfer: 0.077517, time_redirect: 0.000000, time_starttransfer: 0.142618, time_total: 0.150624
time_namelookup: 1.058276, time_connect: 1.082821, time_appconnect: 1.127496, time_pretransfer: 1.128083, time_redirect: 0.000000, time_starttransfer: 1.188930, time_total: 1.197614
time_namelookup: 0.023630, time_connect: 0.047877, time_appconnect: 0.079810, time_pretransfer: 0.080053, time_redirect: 0.000000, time_starttransfer: 0.136007, time_total: 0.141614
time_namelookup: 0.018607, time_connect: 0.043160, time_appconnect: 0.079620, time_pretransfer: 0.079966, time_redirect: 0.000000, time_starttransfer: 0.139617, time_total: 0.145728
time_namelookup: 1.059191, time_connect: 1.084436, time_appconnect: 1.127247, time_pretransfer: 1.127825, time_redirect: 0.000000, time_starttransfer: 1.189962, time_total: 1.199934
time_namelookup: 1.058040, time_connect: 1.083303, time_appconnect: 1.127172, time_pretransfer: 1.128314, time_redirect: 0.000000, time_starttransfer: 1.194481, time_total: 1.202276
time_namelookup: 0.018539, time_connect: 0.043664, time_appconnect: 0.079772, time_pretransfer: 0.080074, time_redirect: 0.000000, time_starttransfer: 0.141406, time_total: 0.147358
time_namelookup: 1.060180, time_connect: 1.084996, time_appconnect: 1.130302, time_pretransfer: 1.131054, time_redirect: 0.000000, time_starttransfer: 1.197778, time_total: 1.205363
time_namelookup: 0.031891, time_connect: 0.058458, time_appconnect: 0.097752, time_pretransfer: 0.098055, time_redirect: 0.000000, time_starttransfer: 0.157533, time_total: 0.188634
time_namelookup: 1.059059, time_connect: 1.083472, time_appconnect: 1.126565, time_pretransfer: 1.127196, time_redirect: 0.000000, time_starttransfer: 1.189149, time_total: 1.195730
time_namelookup: 0.023647, time_connect: 0.048197, time_appconnect: 0.082277, time_pretransfer: 0.082772, time_redirect: 0.000000, time_starttransfer: 0.143120, time_total: 0.149757
time_namelookup: 0.017073, time_connect: 0.041658, time_appconnect: 0.081927, time_pretransfer: 0.082310, time_redirect: 0.000000, time_starttransfer: 0.144873, time_total: 0.152739
time_namelookup: 0.017251, time_connect: 0.042056, time_appconnect: 0.079905, time_pretransfer: 0.080475, time_redirect: 0.000000, time_starttransfer: 0.141332, time_total: 0.147886

然而执行 dig 命令非常快:

 $ dig @dns.google google.com

; <<>> DiG 9.11.5-P4-5.1+deb10u8-Raspbian <<>> @dns.google google.com
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 13497
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;google.com.                    IN      A

;; ANSWER SECTION:
google.com.             221     IN      A       172.217.19.78

;; Query time: 25 msec
;; SERVER: 8.8.4.4#53(8.8.4.4)
;; WHEN: Thu Feb 16 10:44:24 UTC 2023
;; MSG SIZE  rcvd: 55



$ dig @1.1.1.1 google.com

; <<>> DiG 9.11.5-P4-5.1+deb10u8-Raspbian <<>> @1.1.1.1 google.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 54885
;; flags: qr rd ra; QUERY: 1, ANSWER: 6, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;google.com.                    IN      A

;; ANSWER SECTION:
google.com.             65      IN      A       142.250.147.101
google.com.             65      IN      A       142.250.147.113
google.com.             65      IN      A       142.250.147.138
google.com.             65      IN      A       142.250.147.139
google.com.             65      IN      A       142.250.147.100
google.com.             65      IN      A       142.250.147.102

;; Query time: 13 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Thu Feb 16 10:44:33 UTC 2023
;; MSG SIZE  rcvd: 135

这与来自丹麦的多个互联网连接类似

我还发现有些地方根本没有问题。一台设备上的 DNS 解析速度很快,而另一台运行相同 ASN 号但不同子 IP 范围的设备却出现了问题。

MTR/traceroute 没有显示任何问题的迹象。我只是不确定为什么它会转到 100.xxx IP 地址,但肯定有某种路由。其他连接上不会发生这种情况

                         My traceroute  [v0.92]
devicename (xx.x.x.xxx)                          2023-02-16T11:50:47+0000
Keys:  Help   Display mode   Restart statistics   Order of fields   quit
                               Packets               Pings
 Host                        Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. xx.x.x.x                  0.0%   211    3.0   2.2   1.4   8.3   0.8
 2. 31.3.xx.xxx               0.0%   211    1.7   2.1   1.4   7.3   0.7
 3. 100.64.x.xxx              0.0%   211    6.8   8.1   6.1  37.8   3.5
 4. 94.127.48.17              1.4%   211    7.4   8.3   6.1  48.0   6.3
 5. xe-2-1-0-311.alb2nqe10.d  0.0%   211    6.9   9.4   6.3  46.4   6.3
 6. ae0-0.stkm2nqp7.se.ip.td  0.0%   211   15.7  16.4  14.8  35.1   2.8
 7. 72.14.214.162             0.0%   211   17.4  17.1  15.3  33.4   2.7
 8. 209.85.254.3              0.0%   211   16.1  17.0  15.4  48.7   3.8
 9. 108.170.253.181           0.9%   211   18.4  17.8  15.2  60.1   6.2
10. 108.170.227.249          10.0%   211   20.6  17.8  15.4  41.2   4.2
    209.85.246.61
11. 142.251.51.215            2.8%   211   26.7  26.4  24.0  54.7   4.6
    209.85.248.7
12. 142.250.214.184           1.4%   210   44.6  25.9  23.7  48.6   4.1
    142.251.239.219
13. 172.253.70.203            0.5%   210   25.1  24.8  23.9  28.2   0.6
    142.250.46.23
14. ???
15. ???
16. ???
17. ???
18. ???
19. ???
20. ???
21. ???
22. ???
23. rd-in-f139.1e100.net     38.3%   210   24.8  25.0  24.1  30.7   0.7

Traceroute 显示它转到一个实际的 IP 地址,然后转到一个虚假的 (100.xxx) IP 地址。

traceroute to google.com (142.250.147.102), 30 hops max, 60 byte packets
 1  xx.x.x.x (xx.x.x.x)  2.436 ms  2.624 ms  2.510 ms
 2  31.3.xx.xxx (31.3.xx.xxx)  2.392 ms  2.941 ms  2.834 ms
 3  100.64.x.xxx (100.64.x.xxx)  6.451 ms  6.629 ms  6.519 ms
 4  94.127.48.17 (94.127.48.17)  6.403 ms  6.600 ms  6.762 ms
 5  xe-2-1-0-311.alb2nqe10.dk.ip.tdc.net (87.48.169.225)  6.653 ms  8.382 ms  8.074 ms
 6  ae0-0.stkm2nqp7.se.ip.tdc.net (83.88.2.128)  16.956 ms  16.121 ms  15.401 ms
 7  72.14.214.162 (72.14.214.162)  16.827 ms  16.741 ms  16.650 ms
 8  * * *
 9  108.170.254.33 (108.170.254.33)  17.564 ms 209.85.242.10 (209.85.242.10)  16.346 ms 108.170.253.161 (108.170.253.161)  17.412 ms
10  108.170.254.38 (108.170.254.38)  17.255 ms 108.170.253.182 (108.170.253.182)  17.105 ms 108.170.254.54 (108.170.254.54)  16.645 ms
11  209.85.246.61 (209.85.246.61)  16.484 ms 72.14.234.107 (72.14.234.107)  17.352 ms 142.250.235.228 (142.250.235.228)  17.045 ms
12  108.170.236.41 (108.170.236.41)  26.269 ms  26.088 ms 142.251.51.217 (142.251.51.217)  26.110 ms
13  142.251.239.219 (142.251.239.219)  24.352 ms 142.250.214.198 (142.250.214.198)  26.021 ms 142.250.214.178 (142.250.214.178)  26.295 ms
14  142.250.214.229 (142.250.214.229)  25.056 ms 142.250.215.9 (142.250.215.9)  25.494 ms 142.250.229.93 (142.250.229.93)  25.310 ms
15  * * *
16  * * *
17  * * *
18  * * *
19  * * *
20  * * *
21  * * *
22  * * *
23  * rd-in-f102.1e100.net (142.250.147.102)  25.956 ms *

答案1

假设输出“dig”仅发出一个 A 请求(即一个 dns 请求),并且 mtr 处于 ipv4 模式,并具有其 RTT 时间。假设在 curl 的情况下,有两个 dns 请求:针对 ipv4 和 ipv6(A 和 AAAA)。您是否尝试过使用“-4”选项运行该 curl 示例以比较其仅针对 ipv4 解析的“time_namelookup”?

相关内容