I am running CentOS 8.1 with kernel 4.18.0-147.8.1.el8_1.x86_64. Lately we have been seeing frequent TCP packet loss. When I run dropwatch, I get the following output:
dropwatch> start
Enabling monitoring...
Kernel monitoring activated.
Issue Ctrl-C to stop monitoring
2 drops at tcp_v4_rcv+48 (0xffffffff9e7892f8)
4 drops at tcp_v4_do_rcv+6b (0xffffffff9e7882ab)
2 drops at unix_stream_connect+4fa (0xffffffff9e7dd9aa)
1 drops at sk_stream_kill_queues+48 (0xffffffff9e6e34b8)
2 drops at unix_stream_connect+4fa (0xffffffff9e7dd9aa)
6 drops at unix_stream_connect+4fa (0xffffffff9e7dd9aa)
2 drops at tcp_v4_rcv+48 (0xffffffff9e7892f8)
4 drops at tcp_v4_do_rcv+6b (0xffffffff9e7882ab)
4 drops at unix_stream_connect+4fa (0xffffffff9e7dd9aa)
2 drops at unix_stream_connect+4fa (0xffffffff9e7dd9aa)
1 drops at tcp_v4_rcv+48 (0xffffffff9e7892f8)
3 drops at tcp_v4_do_rcv+6b (0xffffffff9e7882ab)
1 drops at tcp_v4_rcv+48 (0xffffffff9e7892f8)
4 drops at tcp_v4_do_rcv+6b (0xffffffff9e7882ab)
4 drops at unix_stream_connect+4fa (0xffffffff9e7dd9aa)
2 drops at unix_stream_connect+4fa (0xffffffff9e7dd9aa)
2 drops at unix_stream_connect+4fa (0xffffffff9e7dd9aa)
4 drops at unix_stream_connect+4fa (0xffffffff9e7dd9aa)
2 drops at tcp_v4_rcv+48 (0xffffffff9e7892f8)
4 drops at tcp_v4_do_rcv+6b (0xffffffff9e7882ab)
1 drops at sk_stream_kill_queues+48 (0xffffffff9e6e34b8)
2 drops at unix_stream_connect+4fa (0xffffffff9e7dd9aa)
2 drops at unix_stream_connect+4fa (0xffffffff9e7dd9aa)
4 drops at unix_stream_connect+4fa (0xffffffff9e7dd9aa)
2 drops at unix_stream_connect+4fa (0xffffffff9e7dd9aa)
4 drops at unix_stream_connect+4fa (0xffffffff9e7dd9aa)
3 drops at sk_stream_kill_queues+48 (0xffffffff9e6e34b8)
1 drops at sk_stream_kill_queues+48 (0xffffffff9e6e34b8)
2 drops at sk_stream_kill_queues+48 (0xffffffff9e6e34b8)
1 drops at tcp_v4_rcv+48 (0xffffffff9e7892f8)
4 drops at unix_stream_connect+4fa (0xffffffff9e7dd9aa)
2 drops at tcp_v4_rcv+48 (0xffffffff9e7892f8)
4 drops at tcp_v4_do_rcv+6b (0xffffffff9e7882ab)
3 drops at sk_stream_kill_queues+48 (0xffffffff9e6e34b8)
1 drops at tcp_v4_rcv+48 (0xffffffff9e7892f8)
1 drops at sk_stream_kill_queues+48 (0xffffffff9e6e34b8)
2 drops at tcp_v4_rcv+48 (0xffffffff9e7892f8)
2 drops at tcp_v4_rcv+48 (0xffffffff9e7892f8)
4 drops at tcp_v4_do_rcv+6b (0xffffffff9e7882ab)
2 drops at unix_stream_connect+4fa (0xffffffff9e7dd9aa)
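To make the output above easier to read, here is a rough sketch that sums the drop counts per location. It assumes every line follows the `N drops at symbol+offset (address)` shape seen above; the names `tally_drops`, `DROP_LINE`, and `SAMPLE` are made up for this illustration, and `SAMPLE` holds only the first four lines of the capture.

```python
import re
from collections import Counter

# Matches lines such as "2 drops at tcp_v4_rcv+48 (0xffffffff9e7892f8)".
DROP_LINE = re.compile(
    r"^\s*(\d+) drops at (\w+)\+([0-9a-f]+) \((0x[0-9a-f]+)\)\s*$"
)


def tally_drops(text):
    """Sum the drop counts per 'symbol+offset' location."""
    totals = Counter()
    for line in text.splitlines():
        match = DROP_LINE.match(line)
        if match is None:
            continue  # skip prompts and status messages
        count = int(match.group(1))
        location = f"{match.group(2)}+{match.group(3)}"
        totals[location] += count
    return totals


# First four lines of the capture above, used only as sample input.
SAMPLE = """\
2 drops at tcp_v4_rcv+48 (0xffffffff9e7892f8)
4 drops at tcp_v4_do_rcv+6b (0xffffffff9e7882ab)
2 drops at unix_stream_connect+4fa (0xffffffff9e7dd9aa)
1 drops at sk_stream_kill_queues+48 (0xffffffff9e6e34b8)
"""

if __name__ == "__main__":
    # Print locations from most to fewest drops.
    for location, total in tally_drops(SAMPLE).most_common():
        print(f"{total:5d}  {location}")
```

Pasting the full capture into `SAMPLE` (or reading it from a file) gives the per-location totals for the whole session.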
My sysctl.conf settings are:
vm.swappiness = 0
kernel.sysrq = 1
net.ipv4.neigh.default.gc_stale_time = 120
# see details in https://help.aliyun.com/knowledge_detail/39428.html
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.default.rp_filter = 0
net.ipv4.conf.default.arp_announce = 2
net.ipv4.conf.lo.arp_announce = 2
net.ipv4.conf.all.arp_announce = 2
# see details in https://help.aliyun.com/knowledge_detail/41334.html
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.tcp_synack_retries = 2
# Enable BBR
net.core.default_qdisc=fq
net.ipv4.tcp_congestion_control=bbr
net.ipv4.tcp_notsent_lowat = 16384
net.ipv4.tcp_low_latency=1
# Increase the read-buffer space allocatable
net.ipv4.tcp_rmem = 33554432 33554432 33554432
net.ipv4.udp_rmem_min = 16384
net.core.rmem_max = 33554432
# Increase the write-buffer-space allocatable
net.ipv4.tcp_wmem = 33554432 33554432 33554432
net.ipv4.udp_wmem_min = 16384
net.core.wmem_max = 33554432
net.core.netdev_max_backlog = 4096
net.core.dev_weight = 64
net.core.optmem_max = 65535
net.ipv4.tcp_slow_start_after_idle = 0
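In case it helps to rule out the possibility that some of these settings were never applied, below is a minimal sketch that compares a sysctl.conf-style file against the values exposed under /proc/sys. It assumes the usual mapping of dots in a key to path separators, which holds for the keys listed here; the helper names `parse_sysctl`, `live_value`, and `mismatches` are hypothetical, and the `/etc/sysctl.conf` path is an assumption about where the file lives.

```python
from pathlib import Path


def parse_sysctl(text):
    """Parse sysctl.conf-style text into {key: value}, skipping comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            # Collapse runs of whitespace so "a  b" and "a\tb" compare equal.
            settings[key.strip()] = " ".join(value.split())
    return settings


def live_value(key, root="/proc/sys"):
    """Return the running kernel's value for key, or None if unavailable."""
    path = Path(root) / key.replace(".", "/")
    try:
        return " ".join(path.read_text().split())
    except OSError:
        return None


def mismatches(conf_text, root="/proc/sys"):
    """Map each key whose live value differs to (configured, live)."""
    result = {}
    for key, wanted in parse_sysctl(conf_text).items():
        actual = live_value(key, root)
        if actual != wanted:
            result[key] = (wanted, actual)
    return result


if __name__ == "__main__":
    # Assumed location of the configuration file.
    conf = Path("/etc/sysctl.conf").read_text()
    for key, (wanted, actual) in mismatches(conf).items():
        print(f"{key}: configured {wanted!r}, live {actual!r}")
```

A key reported with a live value of `None` is one the running kernel does not expose at that path, or one the current user cannot read.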