Why are TCP packets retransmitted 7 times when sysctl tcp_retries1 is set to 3?

Ubuntu 12.04

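I am trying to better understand how many times TCP will try to retransmit a packet when it receives no acknowledgment that the destination got it. From reading the tcp man page, it seemed clear that this is controlled by the sysctl tcp_retries1: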

tcp_retries1 (integer; default: 3)
           The number of times TCP will attempt to retransmit a  packet  on
           an  established connection normally, without the extra effort of
           getting the network layers involved.  Once we exceed this number
           of retransmits, we first have the network layer update the route
           if possible before each new retransmit.  The default is the  RFC
           specified minimum of 3.

My system is set to the default value of 3:

# cat /proc/sys/net/ipv4/tcp_retries1 
3

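To test this, I connected via ssh from system A (172.16.249.138) to system B (172.16.249.137) and started a simple print loop on the console. Then, while that traffic was flowing, I abruptly disconnected B from the network.

In another terminal on system A, I was running "tcpdump host 172.16.249.137". Below are the relevant lines of the output (line numbers added for clarity).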

00: ...
01: 13:29:46.994715 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [.], ack 5989441, win 80, options [nop,nop,TS val 1957286 ecr 4294962520], length 0
02: 13:29:46.995084 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [.], ack 5989441, win 186, options [nop,nop,TS val 1957286 ecr 4294962520], length 0    
03: 13:29:47.040360 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 186, options [nop,nop,TS val 1957298 ecr 4294962520], length 48
04: 13:29:47.086552 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [.], ack 5989441, win 376, options [nop,nop,TS val 1957309 ecr 4294962520], length 0
05: 13:29:47.680608 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 376, options [nop,nop,TS val 1957458 ecr 4294962520], length 48
06: 13:29:48.963721 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 376, options [nop,nop,TS val 1957779 ecr 4294962520], length 48
07: 13:29:51.528564 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 376, options [nop,nop,TS val 1958420 ecr 4294962520], length 48
08: 13:29:56.664384 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 376, options [nop,nop,TS val 1959704 ecr 4294962520], length 48
09: 13:30:06.936480 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 376, options [nop,nop,TS val 1962272 ecr 4294962520], length 48
10: 13:30:27.480381 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 376, options [nop,nop,TS val 1967408 ecr 4294962520], length 48
11: 13:31:08.504033 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 376, options [nop,nop,TS val 1977664 ecr 4294962520], length 48
12: 13:31:13.512437 ARP, Request who-has 172.16.249.137 tell 172.16.249.138, length 28
13: 13:31:14.512336 ARP, Request who-has 172.16.249.137 tell 172.16.249.138, length 28
14: 13:31:15.512241 ARP, Request who-has 172.16.249.137 tell 172.16.249.138, length 28

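If I understand this correctly (and I may not), the packet on line 3 is never acknowledged by system B. A then retries sending this packet 7 times (lines 5-11), increasing its retransmission timer each time (roughly doubling it).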
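The doubling can be checked directly from the capture. The following is a minimal sketch that takes the timestamps of line 03 and lines 05-11 from the tcpdump output above and prints the delay before each retransmission; the function name `gaps_in_seconds` is only illustrative.

```python
from datetime import datetime

# Timestamps of the original send (line 03) and the seven retransmissions
# (lines 05-11), copied from the tcpdump output above.
timestamps = [
    "13:29:47.040360",
    "13:29:47.680608",
    "13:29:48.963721",
    "13:29:51.528564",
    "13:29:56.664384",
    "13:30:06.936480",
    "13:30:27.480381",
    "13:31:08.504033",
]


def gaps_in_seconds(stamps):
    """Return the delay in seconds between each consecutive pair of timestamps."""
    times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


gaps = gaps_in_seconds(timestamps)
for number, gap in enumerate(gaps, start=1):
    if number == 1:
        print(f"retransmission {number}: {gap:.3f} s after the original send")
    else:
        # Ratio to the previous delay; a value near 2 indicates doubling.
        ratio = gap / gaps[number - 2]
        print(f"retransmission {number}: {gap:.3f} s after the previous one "
              f"(x{ratio:.2f})")
```

The delays come out at roughly 0.64, 1.28, 2.56, 5.14, 10.27, 20.54 and 41.02 seconds, which matches the "roughly doubling" observation.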

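Why is the packet retransmitted 7 times instead of 3?

Note: I ran this formal test after noticing 6-7 retransmissions on HTTP connections in several pcap files, so the number of retransmissions does not appear to be specific to SSH.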

Answer 1

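I believe you created an orphaned socket by killing the connection on the .137 server. The kernel parameter in effect would therefore be tcp_orphan_retries, whose generic Linux default is 7.

A description of the conditions you created and the resulting behavior is available here: http://www.linuxinsight.com/proc_sys_net_ipv4_tcp_orphan_retries.html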
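To compare the two settings discussed here on a given machine, both values can be read from /proc. This is only a sketch: the helper names `read_sysctl` and `retry_settings` are made up for illustration, and it assumes a Linux system that exposes these files under /proc/sys/net/ipv4 (a missing or unreadable file is reported as None).

```python
from pathlib import Path

# Directory where Linux exposes the IPv4 TCP sysctls (assumed to exist).
SYSCTL_DIR = Path("/proc/sys/net/ipv4")


def read_sysctl(name, base=SYSCTL_DIR):
    """Return the integer value of one sysctl, or None if it cannot be read."""
    try:
        return int((base / name).read_text().strip())
    except (OSError, ValueError):
        return None


def retry_settings(base=SYSCTL_DIR):
    """Collect the two retry-related sysctls mentioned in this thread."""
    names = ("tcp_retries1", "tcp_orphan_retries")
    return {name: read_sysctl(name, base) for name in names}


if __name__ == "__main__":
    for name, value in retry_settings().items():
        print(f"{name} = {value}")
```

The printed values can then be set against the seven retransmissions seen in the capture.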
