I have a Dell R610 with a dual-port 10G Broadcom NIC (BCM5719, using the bnx2x driver); it serves ~1000 PPPoE users, with an average throughput of 1-1.5 Gbps and peaks of up to 3 Gbps.
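I noticed that the gateway was losing packets whenever traffic exceeded 1 Gbps. I checked the interface statistics and, instead of the drops I expected, found only rx errors (on both ports, but mostly on the WAN port) and overruns. My first reaction was to swap the SFP modules between the ports, then to test new SFP modules, and then I replaced the machine with an identical cold spare that I had available, but nothing improved.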
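After some reading, I started by increasing the ring buffers to the maximum value ( ethtool -G enp5s0f0 rx 4078 ) and experimenting with the offload settings ( ethtool -K enp5s0f0 gro off gso off lro off ). I suspect the second command did the trick, but I can't be 100% sure, because I also upgraded the Linux distribution (Debian) to the latest release, along with the bnx2x driver and its non-free firmware.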
I also reduced nf_conntrack_tcp_timeout_established from its default of 432000 seconds (5 days) to 10 minutes (the table never filled up, but it has now shrunk from ~110k to ~15k established connections).
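The timeout arithmetic above can be double-checked with a few lines of Python. This is only a sketch: the helper name `humanize_seconds` is made up for illustration, and the 600-second value is simply the 10 minutes from the post expressed in seconds, which is the unit this sysctl uses.

```python
from datetime import timedelta


def humanize_seconds(seconds: int) -> str:
    """Render a timeout given in seconds as days, hours and minutes."""
    td = timedelta(seconds=seconds)
    hours, remainder = divmod(td.seconds, 3600)
    minutes = remainder // 60
    return f"{td.days}d {hours}h {minutes}m"


DEFAULT_ESTABLISHED = 432000   # default quoted in the post
REDUCED_ESTABLISHED = 10 * 60  # the 10 minutes from the post, in seconds

print("default:", humanize_seconds(DEFAULT_ESTABLISHED))
print("reduced:", humanize_seconds(REDUCED_ESTABLISHED))
```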
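I'm not sure whether this or the ethtool settings above made the difference, but I have got rid of the packet loss.
The current situation: there is no more packet loss and traffic no longer seems to be affected, but the rx error counter keeps increasing (faster when throughput exceeds 1.5 Gbps).
I have no drops, neither on the interface counters nor because of a full conntrack table; the number of overruns, however, is quite large.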
The output below shows the error counters after 5 days of uptime (taken from ifconfig) and the interface statistics.
I would be very grateful for an explanation of this behaviour and, if possible, for a way to mitigate it.
[root@gw01]:~# ethtool -S enp5s0f0
NIC statistics:
[0]: rx_bytes: 5754726086899
[0]: rx_ucast_packets: 4555048326
[0]: rx_mcast_packets: 0
[0]: rx_bcast_packets: 0
[0]: rx_discards: 0
[0]: rx_phy_ip_err_discards: 0
[0]: rx_skb_alloc_discard: 0
[0]: rx_csum_offload_errors: 0
[0]: tx_exhaustion_events: 0
[0]: tx_bytes: 6070322710813
[0]: tx_ucast_packets: 15474233735
[0]: tx_mcast_packets: 9
[0]: tx_bcast_packets: 0
[0]: tpa_aggregations: 0
[0]: tpa_aggregated_frames: 0
[0]: tpa_bytes: 0
[0]: driver_filtered_tx_pkt: 0
[1]: rx_bytes: 6111893987703
[1]: rx_ucast_packets: 4778498023
[1]: rx_mcast_packets: 0
[1]: rx_bcast_packets: 0
[1]: rx_discards: 0
[1]: rx_phy_ip_err_discards: 0
[1]: rx_skb_alloc_discard: 0
[1]: rx_csum_offload_errors: 53
[1]: tx_exhaustion_events: 0
[1]: tx_bytes: 16548597730
[1]: tx_ucast_packets: 60592175
[1]: tx_mcast_packets: 0
[1]: tx_bcast_packets: 0
[1]: tpa_aggregations: 0
[1]: tpa_aggregated_frames: 0
[1]: tpa_bytes: 0
[1]: driver_filtered_tx_pkt: 0
[2]: rx_bytes: 5807120321999
[2]: rx_ucast_packets: 4580626375
[2]: rx_mcast_packets: 0
[2]: rx_bcast_packets: 0
[2]: rx_discards: 0
[2]: rx_phy_ip_err_discards: 0
[2]: rx_skb_alloc_discard: 0
[2]: rx_csum_offload_errors: 25
[2]: tx_exhaustion_events: 0
[2]: tx_bytes: 17307821972
[2]: tx_ucast_packets: 64955978
[2]: tx_mcast_packets: 143
[2]: tx_bcast_packets: 1
[2]: tpa_aggregations: 0
[2]: tpa_aggregated_frames: 0
[2]: tpa_bytes: 0
[2]: driver_filtered_tx_pkt: 0
[3]: rx_bytes: 7585668862381
[3]: rx_ucast_packets: 6050090451
[3]: rx_mcast_packets: 0
[3]: rx_bcast_packets: 0
[3]: rx_discards: 0
[3]: rx_phy_ip_err_discards: 0
[3]: rx_skb_alloc_discard: 0
[3]: rx_csum_offload_errors: 0
[3]: tx_exhaustion_events: 0
[3]: tx_bytes: 15857583114
[3]: tx_ucast_packets: 62050481
[3]: tx_mcast_packets: 2
[3]: tx_bcast_packets: 0
[3]: tpa_aggregations: 0
[3]: tpa_aggregated_frames: 0
[3]: tpa_bytes: 0
[3]: driver_filtered_tx_pkt: 0
[4]: rx_bytes: 5507320349168
[4]: rx_ucast_packets: 4323030058
[4]: rx_mcast_packets: 0
[4]: rx_bcast_packets: 0
[4]: rx_discards: 0
[4]: rx_phy_ip_err_discards: 0
[4]: rx_skb_alloc_discard: 0
[4]: rx_csum_offload_errors: 0
[4]: tx_exhaustion_events: 0
[4]: tx_bytes: 18875914644
[4]: tx_ucast_packets: 61667057
[4]: tx_mcast_packets: 0
[4]: tx_bcast_packets: 1
[4]: tpa_aggregations: 0
[4]: tpa_aggregated_frames: 0
[4]: tpa_bytes: 0
[4]: driver_filtered_tx_pkt: 0
[5]: rx_bytes: 7597606902068
[5]: rx_ucast_packets: 5984661040
[5]: rx_mcast_packets: 0
[5]: rx_bcast_packets: 0
[5]: rx_discards: 0
[5]: rx_phy_ip_err_discards: 0
[5]: rx_skb_alloc_discard: 0
[5]: rx_csum_offload_errors: 1
[5]: tx_exhaustion_events: 0
[5]: tx_bytes: 15257461291
[5]: tx_ucast_packets: 61970141
[5]: tx_mcast_packets: 0
[5]: tx_bcast_packets: 0
[5]: tpa_aggregations: 0
[5]: tpa_aggregated_frames: 0
[5]: tpa_bytes: 0
[5]: driver_filtered_tx_pkt: 0
[6]: rx_bytes: 6104830059179
[6]: rx_ucast_packets: 4796493913
[6]: rx_mcast_packets: 0
[6]: rx_bcast_packets: 0
[6]: rx_discards: 910
[6]: rx_phy_ip_err_discards: 0
[6]: rx_skb_alloc_discard: 0
[6]: rx_csum_offload_errors: 1
[6]: tx_exhaustion_events: 0
[6]: tx_bytes: 17389300423
[6]: tx_ucast_packets: 64203382
[6]: tx_mcast_packets: 0
[6]: tx_bcast_packets: 0
[6]: tpa_aggregations: 0
[6]: tpa_aggregated_frames: 0
[6]: tpa_bytes: 0
[6]: driver_filtered_tx_pkt: 0
[7]: rx_bytes: 6185384387977
[7]: rx_ucast_packets: 4817905006
[7]: rx_mcast_packets: 0
[7]: rx_bcast_packets: 0
[7]: rx_discards: 0
[7]: rx_phy_ip_err_discards: 0
[7]: rx_skb_alloc_discard: 0
[7]: rx_csum_offload_errors: 0
[7]: tx_exhaustion_events: 0
[7]: tx_bytes: 16405943750
[7]: tx_ucast_packets: 60554882
[7]: tx_mcast_packets: 0
[7]: tx_bcast_packets: 0
[7]: tpa_aggregations: 0
[7]: tpa_aggregated_frames: 0
[7]: tpa_bytes: 0
[7]: driver_filtered_tx_pkt: 0
rx_bytes: 50654550957374
rx_error_bytes: 0
rx_ucast_packets: 39886353192
rx_mcast_packets: 0
rx_bcast_packets: 0
rx_crc_errors: 0
rx_align_errors: 0
rx_undersize_packets: 0
rx_oversize_packets: 0
rx_fragments: 0
rx_jabbers: 0
rx_discards: 910
rx_filtered_packets: 49130
rx_mf_tag_discard: 0
pfc_frames_received: 0
pfc_frames_sent: 0
rx_brb_discard: 27405292
rx_brb_truncate: 1329322
rx_pause_frames: 0
rx_mac_ctrl_frames: 0
rx_constant_pause_events: 0
rx_phy_ip_err_discards: 0
rx_skb_alloc_discard: 0
rx_csum_offload_errors: 80
tx_exhaustion_events: 0
tx_bytes: 6187965333737
tx_error_bytes: 0
tx_ucast_packets: 15910227831
tx_mcast_packets: 154
tx_bcast_packets: 2
tx_mac_errors: 0
tx_carrier_errors: 0
tx_single_collisions: 0
tx_multi_collisions: 0
tx_deferred: 0
tx_excess_collisions: 0
tx_late_collisions: 0
tx_total_collisions: 0
tx_64_byte_packets: 2006785053
tx_65_to_127_byte_packets: 8659794036
tx_128_to_255_byte_packets: 1069905700
tx_256_to_511_byte_packets: 348433834
tx_512_to_1023_byte_packets: 446363890
tx_1024_to_1522_byte_packets: 3378961793
tx_1523_to_9022_byte_packets: 0
tx_pause_frames: 7912342
tpa_aggregations: 0
tpa_aggregated_frames: 0
tpa_bytes: 0
recoverable_errors: 0
unrecoverable_errors: 0
driver_filtered_tx_pkt: 0
Tx LPI entry count: 0
ptp_skipped_tx_tstamp: 0
[root@gw01]:~# ethtool -S enp5s0f1
NIC statistics:
[0]: rx_bytes: 6254611311587
[0]: rx_ucast_packets: 15509699659
[0]: rx_mcast_packets: 1380
[0]: rx_bcast_packets: 2526946
[0]: rx_discards: 15041
[0]: rx_phy_ip_err_discards: 0
[0]: rx_skb_alloc_discard: 0
[0]: rx_csum_offload_errors: 0
[0]: tx_exhaustion_events: 0
[0]: tx_bytes: 6265388531698
[0]: tx_ucast_packets: 4863734758
[0]: tx_mcast_packets: 1361
[0]: tx_bcast_packets: 0
[0]: tpa_aggregations: 0
[0]: tpa_aggregated_frames: 0
[0]: tpa_bytes: 0
[0]: driver_filtered_tx_pkt: 0
[1]: rx_bytes: 4399727408
[1]: rx_ucast_packets: 8829318
[1]: rx_mcast_packets: 473
[1]: rx_bcast_packets: 35434
[1]: rx_discards: 0
[1]: rx_phy_ip_err_discards: 0
[1]: rx_skb_alloc_discard: 0
[1]: rx_csum_offload_errors: 0
[1]: tx_exhaustion_events: 0
[1]: tx_bytes: 6513873746426
[1]: tx_ucast_packets: 5033111560
[1]: tx_mcast_packets: 1207
[1]: tx_bcast_packets: 0
[1]: tpa_aggregations: 0
[1]: tpa_aggregated_frames: 0
[1]: tpa_bytes: 0
[1]: driver_filtered_tx_pkt: 0
[2]: rx_bytes: 5185615964
[2]: rx_ucast_packets: 8755258
[2]: rx_mcast_packets: 514
[2]: rx_bcast_packets: 2329
[2]: rx_discards: 0
[2]: rx_phy_ip_err_discards: 0
[2]: rx_skb_alloc_discard: 0
[2]: rx_csum_offload_errors: 0
[2]: tx_exhaustion_events: 0
[2]: tx_bytes: 6367820159824
[2]: tx_ucast_packets: 4904543961
[2]: tx_mcast_packets: 1622
[2]: tx_bcast_packets: 4268
[2]: tpa_aggregations: 0
[2]: tpa_aggregated_frames: 0
[2]: tpa_bytes: 0
[2]: driver_filtered_tx_pkt: 0
[3]: rx_bytes: 2903903736
[3]: rx_ucast_packets: 8047737
[3]: rx_mcast_packets: 544
[3]: rx_bcast_packets: 7248
[3]: rx_discards: 0
[3]: rx_phy_ip_err_discards: 0
[3]: rx_skb_alloc_discard: 0
[3]: rx_csum_offload_errors: 0
[3]: tx_exhaustion_events: 0
[3]: tx_bytes: 6592027526021
[3]: tx_ucast_packets: 5129457175
[3]: tx_mcast_packets: 1034
[3]: tx_bcast_packets: 0
[3]: tpa_aggregations: 0
[3]: tpa_aggregated_frames: 0
[3]: tpa_bytes: 0
[3]: driver_filtered_tx_pkt: 0
[4]: rx_bytes: 6931882922
[4]: rx_ucast_packets: 10206552
[4]: rx_mcast_packets: 1719
[4]: rx_bcast_packets: 4319
[4]: rx_discards: 0
[4]: rx_phy_ip_err_discards: 0
[4]: rx_skb_alloc_discard: 0
[4]: rx_csum_offload_errors: 0
[4]: tx_exhaustion_events: 0
[4]: tx_bytes: 6448967965669
[4]: tx_ucast_packets: 5022492659
[4]: tx_mcast_packets: 971
[4]: tx_bcast_packets: 0
[4]: tpa_aggregations: 0
[4]: tpa_aggregated_frames: 0
[4]: tpa_bytes: 0
[4]: driver_filtered_tx_pkt: 0
[5]: rx_bytes: 3800756009
[5]: rx_ucast_packets: 8248957
[5]: rx_mcast_packets: 477
[5]: rx_bcast_packets: 3161083
[5]: rx_discards: 0
[5]: rx_phy_ip_err_discards: 0
[5]: rx_skb_alloc_discard: 0
[5]: rx_csum_offload_errors: 0
[5]: tx_exhaustion_events: 0
[5]: tx_bytes: 6111594864886
[5]: tx_ucast_packets: 4809171334
[5]: tx_mcast_packets: 2055
[5]: tx_bcast_packets: 0
[5]: tpa_aggregations: 0
[5]: tpa_aggregated_frames: 0
[5]: tpa_bytes: 0
[5]: driver_filtered_tx_pkt: 0
[6]: rx_bytes: 5162315054
[6]: rx_ucast_packets: 9801307
[6]: rx_mcast_packets: 376
[6]: rx_bcast_packets: 5682
[6]: rx_discards: 0
[6]: rx_phy_ip_err_discards: 0
[6]: rx_skb_alloc_discard: 0
[6]: rx_csum_offload_errors: 0
[6]: tx_exhaustion_events: 0
[6]: tx_bytes: 6186413580299
[6]: tx_ucast_packets: 4819348468
[6]: tx_mcast_packets: 1755
[6]: tx_bcast_packets: 27814
[6]: tpa_aggregations: 0
[6]: tpa_aggregated_frames: 0
[6]: tpa_bytes: 0
[6]: driver_filtered_tx_pkt: 0
[7]: rx_bytes: 3957400032
[7]: rx_ucast_packets: 7706880
[7]: rx_mcast_packets: 439
[7]: rx_bcast_packets: 2078
[7]: rx_discards: 0
[7]: rx_phy_ip_err_discards: 0
[7]: rx_skb_alloc_discard: 0
[7]: rx_csum_offload_errors: 0
[7]: tx_exhaustion_events: 0
[7]: tx_bytes: 6382270562870
[7]: tx_ucast_packets: 4927082505
[7]: tx_mcast_packets: 1207
[7]: tx_bcast_packets: 0
[7]: tpa_aggregations: 0
[7]: tpa_aggregated_frames: 0
[7]: tpa_bytes: 0
[7]: driver_filtered_tx_pkt: 0
rx_bytes: 6286954089263
rx_error_bytes: 1176551
rx_ucast_packets: 15571295668
rx_mcast_packets: 5922
rx_bcast_packets: 5745119
rx_crc_errors: 0
rx_align_errors: 0
rx_undersize_packets: 0
rx_oversize_packets: 709
rx_fragments: 0
rx_jabbers: 0
rx_discards: 15041
rx_filtered_packets: 34473441
rx_mf_tag_discard: 0
pfc_frames_received: 0
pfc_frames_sent: 0
rx_brb_discard: 50179
rx_brb_truncate: 2570
rx_pause_frames: 0
rx_mac_ctrl_frames: 0
rx_constant_pause_events: 0
rx_phy_ip_err_discards: 0
rx_skb_alloc_discard: 0
rx_csum_offload_errors: 0
tx_exhaustion_events: 0
tx_bytes: 50868356937693
tx_error_bytes: 0
tx_ucast_packets: 39508942420
tx_mcast_packets: 11212
tx_bcast_packets: 32082
tx_mac_errors: 0
tx_carrier_errors: 0
tx_single_collisions: 0
tx_multi_collisions: 0
tx_deferred: 0
tx_excess_collisions: 0
tx_late_collisions: 0
tx_total_collisions: 0
tx_64_byte_packets: 112055708
tx_65_to_127_byte_packets: 2743786458
tx_128_to_255_byte_packets: 1274063534
tx_256_to_511_byte_packets: 675953588
tx_512_to_1023_byte_packets: 832925023
tx_1024_to_1522_byte_packets: 33870217710
tx_1523_to_9022_byte_packets: 0
tx_pause_frames: 8837
tpa_aggregations: 0
tpa_aggregated_frames: 0
tpa_bytes: 0
recoverable_errors: 0
unrecoverable_errors: 0
driver_filtered_tx_pkt: 0
Tx LPI entry count: 0
ptp_skipped_tx_tstamp: 0
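Because most of the counters above are zero, it can help to filter the `ethtool -S` output down to the non-zero ones that look like problems. The following Python sketch does that under a few assumptions: the function names and the keyword list (`err`, `discard`, `truncate`, `pause`) are my own choices for illustration, and the sample text is a short excerpt of the enp5s0f0 numbers above rather than the full output.

```python
import re
from collections import defaultdict

# Matches "name: 123" and "[6]: name: 123"; the queue index is optional.
STAT_LINE = re.compile(r"^\s*(?:\[(\d+)\]:\s*)?([^:]+):\s*(\d+)\s*$")

# Hypothetical keyword list for "counters worth a second look".
KEYWORDS = ("err", "discard", "truncate", "pause")


def parse_ethtool_stats(text):
    """Return (global_counters, per_queue_counters) from `ethtool -S` text."""
    global_counters = {}
    per_queue = defaultdict(dict)
    for line in text.splitlines():
        match = STAT_LINE.match(line)
        if match is None:
            continue  # shell prompt, "NIC statistics:" header, blank lines
        queue, name, value = match.groups()
        if queue is None:
            global_counters[name.strip()] = int(value)
        else:
            per_queue[int(queue)][name.strip()] = int(value)
    return global_counters, dict(per_queue)


def nonzero_problem_counters(counters):
    """Keep only non-zero counters whose name contains one of KEYWORDS."""
    return {
        name: value
        for name, value in counters.items()
        if value and any(key in name for key in KEYWORDS)
    }


# Excerpt of the enp5s0f0 output above (queue 6 plus a few global counters).
SAMPLE = """\
NIC statistics:
[6]: rx_ucast_packets: 4796493913
[6]: rx_discards: 910
[6]: rx_csum_offload_errors: 1
rx_ucast_packets: 39886353192
rx_crc_errors: 0
rx_discards: 910
rx_brb_discard: 27405292
rx_brb_truncate: 1329322
rx_csum_offload_errors: 80
tx_pause_frames: 7912342
"""

global_counters, per_queue = parse_ethtool_stats(SAMPLE)
for name, value in nonzero_problem_counters(global_counters).items():
    print(f"{name:24} {value}")
```

In practice the text would come from running `ethtool -S <interface>` rather than from a hard-coded string; the excerpt is used here only so the sketch runs without the hardware.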
[root@gw01]:~# ifconfig enp5s0f0
enp5s0f0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 1.2.3.1 netmask 255.255.255.252 broadcast 1.2.3.3
inet6 fe80::f6e9:d4ff:fe95:98f0 prefixlen 64 scopeid 0x20<link>
ether f4:e9:d4:95:98:f0 txqueuelen 1000 (Ethernet)
RX packets 39878608868 bytes 50643967437038 (46.0 TiB)
RX errors 28719050 dropped 0 overruns 910 frame 28718140
TX packets 15907741385 bytes 6187264747768 (5.6 TiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device interrupt 36 memory 0xd3000000-d37fffff
[root@gw01]:~# ip -s link show enp5s0f0
8: enp5s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether f4:e9:d4:95:98:f0 brd ff:ff:ff:ff:ff:ff
RX: bytes packets errors dropped overrun mcast
50644374239906 39878906616 28719967 0 28719057 0
TX: bytes packets errors dropped carrier collsns
6187303001480 15907840103 0 0 0 0
[root@gw01]:~# ifconfig enp5s0f1
enp5s0f1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet6 fe80::f6e9:d4ff:fe95:98f2 prefixlen 64 scopeid 0x20<link>
ether f4:e9:d4:95:98:f2 txqueuelen 1000 (Ethernet)
RX packets 15575664787 bytes 6286503267405 (5.7 TiB)
RX errors 68499 dropped 0 overruns 15041 frame 53458
TX packets 39504799770 bytes 50862595134132 (46.2 TiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device interrupt 48 memory 0xd4000000-d47fffff
[root@gw01]:~# ip -s link show enp5s0f1
11: enp5s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether f4:e9:d4:95:98:f2 brd ff:ff:ff:ff:ff:ff
RX: bytes packets errors dropped overrun mcast
6286530470849 15575793330 68499 0 52749 5920
TX: bytes packets errors dropped carrier collsns
50863147969521 39505197930 0 0 0 0
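The counters pasted above can be cross-checked against each other. For enp5s0f1 the numbers happen to line up exactly: rx_brb_discard plus rx_brb_truncate equals the overrun column of ip -s link, and adding rx_discards and rx_oversize_packets gives the errors column. The Python sketch below just redoes that arithmetic with values copied from the output. It is an observation about this one snapshot, not a statement of how the driver defines its counters, and the variable names are mine. The enp5s0f0 outputs were evidently captured at slightly different moments (the packet counts differ between tools), so there the sums only agree approximately.

```python
# Values copied from the enp5s0f1 output above.
ethtool_f1 = {
    "rx_brb_discard": 50179,
    "rx_brb_truncate": 2570,
    "rx_discards": 15041,
    "rx_oversize_packets": 709,
}
ip_link_f1 = {"packets": 15575793330, "errors": 68499, "overrun": 52749}

# Values copied from the `ip -s link show enp5s0f0` output above.
ip_link_f0 = {"packets": 39878906616, "errors": 28719967, "overrun": 28719057}

brb_total = ethtool_f1["rx_brb_discard"] + ethtool_f1["rx_brb_truncate"]
all_rx_problems = (
    brb_total
    + ethtool_f1["rx_discards"]
    + ethtool_f1["rx_oversize_packets"]
)

print("BRB discard + truncate matches overrun:", brb_total == ip_link_f1["overrun"])
print("sum of rx problem counters matches errors:", all_rx_problems == ip_link_f1["errors"])

# Share of received packets that were counted as errors on each port.
for name, stats in (("enp5s0f0", ip_link_f0), ("enp5s0f1", ip_link_f1)):
    ratio = stats["errors"] / stats["packets"]
    print(f"{name}: {ratio:.4%} of received packets counted as errors")
```

If the relationship holds on the WAN port as well, almost all of its rx errors would be rx_brb_discard and rx_brb_truncate events rather than CRC or alignment errors, which are both zero in the output above.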