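I ran into a problem after upgrading to Ubuntu 12.04: my machine started dropping packets, with no sign of system load or high interrupt usage. The server is a network monitoring sensor running Ubuntu LTS 12.04 that passively collects packets from 5 interfaces for network intrusion detection work. Before the upgrade I was capturing 200+ GB of packets a day and writing them to disk with roughly 0% packet loss, depending on the day, with the help of CPU affinity and NIC IRQ-to-CPU binding. Now I'm losing a lot of packets with none of my applications running, and at very low PPS rates that a modern workstation NIC would handle without any trouble.
Specs: x64 Xeon, 4 cores, 3.2 GHz, 16 GB RAM. NICs: 5 Intel Pro NICs using the e1000 driver (NAPI). [1] eth0 and eth1 are integrated NICs (on the motherboard); there are also 2 PCI-X network cards with 2 Ethernet ports each.
Three of the interfaces run at Gigabit Ethernet speed; the others do not, because they are attached to hubs.
Product specs: [2] http://support.dell.com/support/edocs/systems/pe2850/en/ug/t1390aa.htm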
# uptime
17:36:00 up 1:43, 2 users, load average: 0.00, 0.01, 0.05
# uname -a
Linux nms 3.2.0-29-generic #46-Ubuntu SMP Fri Jul 27 17:03:23 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
I also set the CPU governor to performance mode and turned off irqbalance. The problem persists with them turned back on.
# lspci -t -vv
-[0000:00]-+-00.0 Intel Corporation E7520 Memory Controller Hub
+-02.0-[01-03]--+-00.0-[02]----0e.0 Dell PowerEdge Expandable RAID controller 4
| \-00.2-[03]--
+-04.0-[04]--
+-05.0-[05-07]--+-00.0-[06]----07.0 Intel Corporation 82541GI Gigabit Ethernet Controller
| \-00.2-[07]----08.0 Intel Corporation 82541GI Gigabit Ethernet Controller
+-06.0-[08-0a]--+-00.0-[09]--+-04.0 Intel Corporation 82546EB Gigabit Ethernet Controller (Copper)
| | \-04.1 Intel Corporation 82546EB Gigabit Ethernet Controller (Copper)
| \-00.2-[0a]--+-02.0 Digium, Inc. Wildcard TE210P/TE212P dual-span T1/E1/J1 card 3.3V
| +-03.0 Intel Corporation 82546EB Gigabit Ethernet Controller (Copper)
| \-03.1 Intel Corporation 82546EB Gigabit Ethernet Controller (Copper)
+-1d.0 Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #1
+-1d.1 Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #2
+-1d.2 Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #3
+-1d.7 Intel Corporation 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller
+-1e.0-[0b]----0d.0 Advanced Micro Devices [AMD] nee ATI RV100 QY [Radeon 7000/VE]
+-1f.0 Intel Corporation 82801EB/ER (ICH5/ICH5R) LPC Interface Bridge
\-1f.1 Intel Corporation 82801EB/ER (ICH5/ICH5R) IDE Controller
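I don't think the NIC or the NIC driver is dropping the packets, because ethtool reports 0 for rx_missed_errors and rx_no_buffer_count on every interface. On the old system, that is where the drops showed up when it couldn't keep up. I'm dropping packets on multiple interfaces almost every second, usually in small increments of 2-4.
I tried all of these sysctl values; I'm currently using the uncommented ones.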
# cat /etc/sysctl.conf
# high
net.core.netdev_max_backlog = 3000000
net.core.rmem_max = 16000000
net.core.rmem_default = 8000000
# defaults
#net.core.netdev_max_backlog = 1000
#net.core.rmem_max = 131071
#net.core.rmem_default = 163480
# moderate
#net.core.netdev_max_backlog = 10000
#net.core.rmem_max = 33554432
#net.core.rmem_default = 33554432
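Here is an example of the interface statistics reported by ethtool. They all look the same and nothing is out of the ordinary (I think), so I'm only showing one:
# ethtool -S eth2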
NIC statistics:
rx_packets: 7498
tx_packets: 0
rx_bytes: 2722585
tx_bytes: 0
rx_broadcast: 327
tx_broadcast: 0
rx_multicast: 1504
tx_multicast: 0
rx_errors: 0
tx_errors: 0
tx_dropped: 0
multicast: 1504
collisions: 0
rx_length_errors: 0
rx_over_errors: 0
rx_crc_errors: 0
rx_frame_errors: 0
rx_no_buffer_count: 0
rx_missed_errors: 0
tx_aborted_errors: 0
tx_carrier_errors: 0
tx_fifo_errors: 0
tx_heartbeat_errors: 0
tx_window_errors: 0
tx_abort_late_coll: 0
tx_deferred_ok: 0
tx_single_coll_ok: 0
tx_multi_coll_ok: 0
tx_timeout_count: 0
tx_restart_queue: 0
rx_long_length_errors: 0
rx_short_length_errors: 0
rx_align_errors: 0
tx_tcp_seg_good: 0
tx_tcp_seg_failed: 0
rx_flow_control_xon: 0
rx_flow_control_xoff: 0
tx_flow_control_xon: 0
tx_flow_control_xoff: 0
rx_long_byte_count: 2722585
rx_csum_offload_good: 0
rx_csum_offload_errors: 0
alloc_rx_buff_failed: 0
tx_smbus: 0
rx_smbus: 0
dropped_smbus: 0
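Since every interface prints the same long list, it can help to filter the report down to the counters that matter. The following is a minimal sketch, not part of the original setup: `nonzero_error_counters` is a hypothetical helper name, and the pattern it uses to pick "error-like" counters is an assumption about the e1000 counter names shown above.

```shell
#!/bin/sh
# nonzero_error_counters (hypothetical helper): reads `ethtool -S` output on
# stdin and prints only error/drop-style counters whose value is not zero.
nonzero_error_counters() {
    awk '
        NF == 2 && $1 ~ /err|drop|miss|no_buffer|fail/ && $2 + 0 != 0 {
            name = $1
            sub(/:$/, "", name)     # strip the trailing colon
            print name, $2
        }
    '
}

# Demo input: the value 12 is made up for illustration; the report above shows 0.
nonzero_error_counters <<'EOF'
NIC statistics:
     rx_packets: 7498
     rx_no_buffer_count: 0
     rx_missed_errors: 12
EOF
```

On a live system the intended use would be `ethtool -S eth2 | nonzero_error_counters`; empty output means none of the matched counters has moved.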
# ifconfig
eth0 Link encap:Ethernet HWaddr 00:11:43:e0:e2:8c
UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST MTU:1500 Metric:1
RX packets:373348 errors:16 dropped:95 overruns:0 frame:16
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:356830572 (356.8 MB) TX bytes:0 (0.0 B)
eth1 Link encap:Ethernet HWaddr 00:11:43:e0:e2:8d
UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST MTU:1500 Metric:1
RX packets:13616 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:8690528 (8.6 MB) TX bytes:0 (0.0 B)
eth2 Link encap:Ethernet HWaddr 00:04:23:e1:77:6a
UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST MTU:1500 Metric:1
RX packets:7750 errors:0 dropped:471 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2780935 (2.7 MB) TX bytes:0 (0.0 B)
eth3 Link encap:Ethernet HWaddr 00:04:23:e1:77:6b
UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST MTU:1500 Metric:1
RX packets:5112 errors:0 dropped:206 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:639472 (639.4 KB) TX bytes:0 (0.0 B)
eth4 Link encap:Ethernet HWaddr 00:04:23:b6:35:6c
UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST MTU:1500 Metric:1
RX packets:961467 errors:0 dropped:935 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:958561305 (958.5 MB) TX bytes:0 (0.0 B)
eth5 Link encap:Ethernet HWaddr 00:04:23:b6:35:6d
inet addr:192.168.1.6 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:4264 errors:0 dropped:16 overruns:0 frame:0
TX packets:699 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:572228 (572.2 KB) TX bytes:124456 (124.4 KB)
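I tried the default settings and then started experimenting with various settings. I'm not using any flow control, and I had already raised the RxDescriptors count to 4096 before the upgrade without any problems.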
# cat /etc/modprobe.d/e1000.conf
options e1000 XsumRX=0,0,0,0,0 RxDescriptors=4096,4096,4096,4096,4096 FlowControl=0,0,0,0,0 debug=16
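Here is my network configuration file. I turned off checksumming and the various offload mechanisms and set CPU affinity so that heavily used interfaces get a whole CPU and lightly used interfaces share one. I used these settings before the upgrade without problems.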
# cat /etc/network/interfaces
# The loopback network interface
auto lo
iface lo inet loopback
# The primary network interface
auto eth0
iface eth0 inet manual
pre-up /sbin/ethtool -G eth0 rx 4096 tx 0
pre-up /sbin/ethtool -K eth0 gro off gso off rx off
pre-up /sbin/ethtool -A eth0 rx off autoneg off
up ifconfig eth0 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up
post-up echo "4" > /proc/irq/48/smp_affinity
down ifconfig eth0 down
post-down /sbin/ethtool -G eth0 rx 256 tx 256
post-down /sbin/ethtool -K eth0 gro on gso on rx on
post-down /sbin/ethtool -A eth0 rx on autoneg on
auto eth1
iface eth1 inet manual
pre-up /sbin/ethtool -G eth1 rx 4096 tx 0
pre-up /sbin/ethtool -K eth1 gro off gso off rx off
pre-up /sbin/ethtool -A eth1 rx off autoneg off
up ifconfig eth1 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up
post-up echo "4" > /proc/irq/49/smp_affinity
down ifconfig eth1 down
post-down /sbin/ethtool -G eth1 rx 256 tx 256
post-down /sbin/ethtool -K eth1 gro on gso on rx on
post-down /sbin/ethtool -A eth1 rx on autoneg on
auto eth2
iface eth2 inet manual
pre-up /sbin/ethtool -G eth2 rx 4096 tx 0
pre-up /sbin/ethtool -K eth2 gro off gso off rx off
pre-up /sbin/ethtool -A eth2 rx off autoneg off
up ifconfig eth2 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up
post-up echo "1" > /proc/irq/82/smp_affinity
down ifconfig eth2 down
post-down /sbin/ethtool -G eth2 rx 256 tx 256
post-down /sbin/ethtool -K eth2 gro on gso on rx on
post-down /sbin/ethtool -A eth2 rx on autoneg on
auto eth3
iface eth3 inet manual
pre-up /sbin/ethtool -G eth3 rx 4096 tx 0
pre-up /sbin/ethtool -K eth3 gro off gso off rx off
pre-up /sbin/ethtool -A eth3 rx off autoneg off
up ifconfig eth3 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up
post-up echo "2" > /proc/irq/83/smp_affinity
down ifconfig eth3 down
post-down /sbin/ethtool -G eth3 rx 256 tx 256
post-down /sbin/ethtool -K eth3 gro on gso on rx on
post-down /sbin/ethtool -A eth3 rx on autoneg on
auto eth4
iface eth4 inet manual
pre-up /sbin/ethtool -G eth4 rx 4096 tx 0
pre-up /sbin/ethtool -K eth4 gro off gso off rx off
pre-up /sbin/ethtool -A eth4 rx off autoneg off
up ifconfig eth4 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up
post-up echo "4" > /proc/irq/77/smp_affinity
down ifconfig eth4 down
post-down /sbin/ethtool -G eth4 rx 256 tx 256
post-down /sbin/ethtool -K eth4 gro on gso on rx on
post-down /sbin/ethtool -A eth4 rx on autoneg on
auto eth5
iface eth5 inet static
pre-up /etc/fw.conf
address 192.168.1.1
netmask 255.255.255.0
broadcast 192.168.1.255
gateway 192.168.1.1
dns-nameservers 192.168.1.2 192.168.1.3
up ifconfig eth5 up
post-up echo "8" > /proc/irq/77/smp_affinity
down ifconfig eth5 down
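The values echoed into smp_affinity are hexadecimal bitmasks, one bit per CPU. As a quick way to sanity-check which core each mask selects, here is a small sketch. `mask_to_cpus` is a hypothetical helper, and it assumes masks without commas (32 CPUs or fewer), which fits this 4-core box.

```shell
#!/bin/sh
# mask_to_cpus (hypothetical helper): decode a hex smp_affinity mask
# (no commas, so up to 32 CPUs) into the CPU numbers whose bits are set.
mask_to_cpus() {
    mask=$((0x$1))      # hex string -> integer
    cpu=0
    out=""
    while [ "$mask" -gt 0 ]; do
        if [ $((mask % 2)) -eq 1 ]; then
            out="$out${out:+ }$cpu"
        fi
        mask=$((mask / 2))
        cpu=$((cpu + 1))
    done
    echo "$out"
}

# Masks used in the /etc/network/interfaces file above.
for m in 1 2 4 8; do
    echo "mask $m -> CPU $(mask_to_cpus "$m")"
done
```

In the posted file, eth0, eth1, and eth4 all use mask 4, so they share CPU 2, while eth2 (mask 1) and eth3 (mask 2) each get their own core. One thing that may be worth double-checking: the eth5 stanza writes mask 8 to /proc/irq/77, but the dmesg and /proc/interrupts output in this post list IRQ 77 as eth4 and IRQ 78 as eth5.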
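Here are a few samples of the drop counters, run one after another over probably 3 or 4 seconds in total. You can see the drop counts increase between the 1st and 3rd runs. This is an off-peak time with very little traffic.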
# awk '{ print $1,$5 }' /proc/net/dev
Inter-|
face drop
eth3: 225
lo: 0
eth2: 505
eth1: 0
eth5: 17
eth0: 105
eth4: 1034
# awk '{ print $1,$5 }' /proc/net/dev
Inter-|
face drop
eth3: 225
lo: 0
eth2: 507
eth1: 0
eth5: 17
eth0: 105
eth4: 1034
# awk '{ print $1,$5 }' /proc/net/dev
Inter-|
face drop
eth3: 227
lo: 0
eth2: 512
eth1: 0
eth5: 17
eth0: 105
eth4: 1039
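Comparing these snapshots by eye gets tedious, so here is a minimal sketch that prints only the interfaces whose drop counter grew between two snapshots. `drop_delta` is a hypothetical helper; it assumes each snapshot was saved to a file using the same awk command as above.

```shell
#!/bin/sh
# drop_delta BEFORE AFTER  (hypothetical helper)
# Each file holds "iface: drops" lines as printed by
#   awk '{ print $1,$5 }' /proc/net/dev
# Prints only the interfaces whose drop counter grew, with the increase.
drop_delta() {
    awk '
        $2 !~ /^[0-9]+$/ { next }              # skip the two header lines
        NR == FNR { before[$1] = $2; next }    # first file: remember counters
        ($1 in before) {
            d = $2 - before[$1]
            if (d > 0) print $1, d
        }
    ' "$1" "$2"
}

# Demo using the first and third samples shown above.
before=$(mktemp) && after=$(mktemp) || exit 1
cat > "$before" <<'EOF'
Inter-|
face drop
eth3: 225
lo: 0
eth2: 505
eth1: 0
eth5: 17
eth0: 105
eth4: 1034
EOF
cat > "$after" <<'EOF'
Inter-|
face drop
eth3: 227
lo: 0
eth2: 512
eth1: 0
eth5: 17
eth0: 105
eth4: 1039
EOF
drop_delta "$before" "$after"
# prints:
#   eth3: 2
#   eth2: 7
#   eth4: 5
rm -f "$before" "$after"
```

For the samples in this post, only eth2, eth3, and eth4 moved during the interval; eth0, eth1, and eth5 stayed flat.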
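I tried the pci=noacpi option. The results are the same with and without it. Below is what my interrupt statistics looked like before the upgrade. After the upgrade, with ACPI enabled for PCI, multiple NICs were bound to a single interrupt and shared it with other devices such as USB devices, which I didn't like, so I think I'll keep ACPI off, since that makes it easier to give each device its own dedicated interrupt. Is there any advantage to using the default (i.e. ACPI with PCI)?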
# grep CMDLINE /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="ipv6.disable=1 noacpi pci=noacpi"
GRUB_CMDLINE_LINUX=""
# cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3
0: 45 0 0 16 IO-APIC-edge timer
1: 1 0 0 7936 IO-APIC-edge i8042
2: 0 0 0 0 XT-PIC-XT-PIC cascade
6: 0 0 0 3 IO-APIC-edge floppy
8: 0 0 0 1 IO-APIC-edge rtc0
9: 0 0 0 0 IO-APIC-edge acpi
12: 0 0 0 1809 IO-APIC-edge i8042
14: 1 0 0 4498 IO-APIC-edge ata_piix
15: 0 0 0 0 IO-APIC-edge ata_piix
16: 0 0 0 0 IO-APIC-fasteoi uhci_hcd:usb2
18: 0 0 0 1350 IO-APIC-fasteoi uhci_hcd:usb4, radeon
19: 0 0 0 0 IO-APIC-fasteoi uhci_hcd:usb3
23: 0 0 0 4099 IO-APIC-fasteoi ehci_hcd:usb1
38: 0 0 0 61963 IO-APIC-fasteoi megaraid
48: 0 0 1002319 4 IO-APIC-fasteoi eth0
49: 0 0 38772 3 IO-APIC-fasteoi eth1
77: 0 0 130076 432159 IO-APIC-fasteoi eth4
78: 0 0 0 23917 IO-APIC-fasteoi eth5
82: 1329033 0 0 4 IO-APIC-fasteoi eth2
83: 0 4886525 0 6 IO-APIC-fasteoi eth3
NMI: 5 6 4 5 Non-maskable interrupts
LOC: 61409 57076 64257 114764 Local timer interrupts
SPU: 0 0 0 0 Spurious interrupts
IWI: 0 0 0 0 IRQ work interrupts
RES: 17956 25333 13436 14789 Rescheduling interrupts
CAL: 22436 607 539 478 Function call interrupts
TLB: 1525 1458 4600 4151 TLB shootdowns
TRM: 0 0 0 0 Thermal event interrupts
THR: 0 0 0 0 Threshold APIC interrupts
MCE: 0 0 0 0 Machine check exceptions
MCP: 16 16 16 16 Machine check polls
ERR: 0
MIS: 0
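To check whether each NIC's interrupts actually land on the CPU it was pinned to, the table can be reduced to one line per interface. This is a rough sketch: `irq_cpu_summary` is a hypothetical helper, and it assumes each NIC has its own IRQ line ending in the interface name (as in the output above), so shared IRQ lines would be missed.

```shell
#!/bin/sh
# irq_cpu_summary (hypothetical helper): reads /proc/interrupts on stdin and,
# for every line whose last field is ethN, prints the IRQ number and the CPU
# column with the highest interrupt count.
irq_cpu_summary() {
    awk '
        NR == 1 { ncpu = NF; next }     # header row: CPU0 CPU1 ...
        $NF ~ /^eth[0-9]+$/ {
            best = 0; max = -1
            for (c = 0; c < ncpu; c++) {
                v = $(c + 2) + 0        # counts start at field 2
                if (v > max) { max = v; best = c }
            }
            irq = $1
            sub(/:$/, "", irq)
            print $NF, "irq=" irq, "busiest=CPU" best, "count=" max
        }
    '
}

# Demo with two lines copied from the output above.
irq_cpu_summary <<'EOF'
           CPU0       CPU1       CPU2       CPU3
 48:          0          0    1002319          4   IO-APIC-fasteoi   eth0
 77:          0          0     130076     432159   IO-APIC-fasteoi   eth4
EOF
# prints:
#   eth0 irq=48 busiest=CPU2 count=1002319
#   eth4 irq=77 busiest=CPU3 count=432159
```

On a live system the intended use would be `irq_cpu_summary < /proc/interrupts`.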
Here is sample output from vmstat showing the system. It is essentially bare-bones right now.
root@nms:~# vmstat -S m 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 0 14992 192 1029 0 0 56 2 419 29 1 0 99 0
0 0 0 14992 192 1029 0 0 0 0 922 27 0 0 100 0
0 0 0 14991 192 1029 0 0 0 36 763 50 0 0 100 0
0 0 0 14991 192 1029 0 0 0 0 646 35 0 0 100 0
0 0 0 14991 192 1029 0 0 0 0 722 54 0 0 100 0
0 0 0 14991 192 1029 0 0 0 0 793 27 0 0 100 0
^C
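Here is the dmesg output. I can't figure out why my PCI-X slots are negotiated as PCI. All of the NICs are PCI-X except the integrated ones that came with the server. In the output below, eth3 and eth2 appear to negotiate at PCI-X speed rather than PCI:66MHz. Shouldn't they all drop to PCI:66MHz? If the integrated NICs are PCI, as shown below (eth0, eth1), wouldn't every device on the bus drop to the slower bus speed? If not, I still don't understand why only one of my cards (each has two Ethernet ports) is marked as PCI-X in the output below. Does that mean it is actually running at PCI-X speed, or only that it is capable of it?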
# dmesg | grep e1000
[ 3678.349337] e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
[ 3678.349342] e1000: Copyright (c) 1999-2006 Intel Corporation.
[ 3678.349394] e1000 0000:06:07.0: PCI->APIC IRQ transform: INT A -> IRQ 48
[ 3678.409725] e1000 0000:06:07.0: Receive Descriptors set to 4096
[ 3678.409730] e1000 0000:06:07.0: Checksum Offload Disabled
[ 3678.409734] e1000 0000:06:07.0: Flow Control Disabled
[ 3678.586409] e1000 0000:06:07.0: eth0: (PCI:66MHz:32-bit) 00:11:43:e0:e2:8c
[ 3678.586419] e1000 0000:06:07.0: eth0: Intel(R) PRO/1000 Network Connection
[ 3678.586642] e1000 0000:07:08.0: PCI->APIC IRQ transform: INT A -> IRQ 49
[ 3678.649854] e1000 0000:07:08.0: Receive Descriptors set to 4096
[ 3678.649859] e1000 0000:07:08.0: Checksum Offload Disabled
[ 3678.649863] e1000 0000:07:08.0: Flow Control Disabled
[ 3678.826436] e1000 0000:07:08.0: eth1: (PCI:66MHz:32-bit) 00:11:43:e0:e2:8d
[ 3678.826444] e1000 0000:07:08.0: eth1: Intel(R) PRO/1000 Network Connection
[ 3678.826627] e1000 0000:09:04.0: PCI->APIC IRQ transform: INT A -> IRQ 82
[ 3679.093266] e1000 0000:09:04.0: Receive Descriptors set to 4096
[ 3679.093271] e1000 0000:09:04.0: Checksum Offload Disabled
[ 3679.093275] e1000 0000:09:04.0: Flow Control Disabled
[ 3679.130239] e1000 0000:09:04.0: eth2: (PCI-X:133MHz:64-bit) 00:04:23:e1:77:6a
[ 3679.130246] e1000 0000:09:04.0: eth2: Intel(R) PRO/1000 Network Connection
[ 3679.130449] e1000 0000:09:04.1: PCI->APIC IRQ transform: INT B -> IRQ 83
[ 3679.397312] e1000 0000:09:04.1: Receive Descriptors set to 4096
[ 3679.397318] e1000 0000:09:04.1: Checksum Offload Disabled
[ 3679.397321] e1000 0000:09:04.1: Flow Control Disabled
[ 3679.434350] e1000 0000:09:04.1: eth3: (PCI-X:133MHz:64-bit) 00:04:23:e1:77:6b
[ 3679.434360] e1000 0000:09:04.1: eth3: Intel(R) PRO/1000 Network Connection
[ 3679.434553] e1000 0000:0a:03.0: PCI->APIC IRQ transform: INT A -> IRQ 77
[ 3679.704072] e1000 0000:0a:03.0: Receive Descriptors set to 4096
[ 3679.704077] e1000 0000:0a:03.0: Checksum Offload Disabled
[ 3679.704081] e1000 0000:0a:03.0: Flow Control Disabled
[ 3679.738364] e1000 0000:0a:03.0: eth4: (PCI:33MHz:64-bit) 00:04:23:b6:35:6c
[ 3679.738371] e1000 0000:0a:03.0: eth4: Intel(R) PRO/1000 Network Connection
[ 3679.738538] e1000 0000:0a:03.1: PCI->APIC IRQ transform: INT B -> IRQ 78
[ 3680.046060] e1000 0000:0a:03.1: eth5: (PCI:33MHz:64-bit) 00:04:23:b6:35:6d
[ 3680.046067] e1000 0000:0a:03.1: eth5: Intel(R) PRO/1000 Network Connection
[ 3682.132415] e1000: eth0 NIC Link is Up 100 Mbps Half Duplex, Flow Control: None
[ 3682.224423] e1000: eth1 NIC Link is Up 100 Mbps Half Duplex, Flow Control: None
[ 3682.316385] e1000: eth2 NIC Link is Up 100 Mbps Half Duplex, Flow Control: None
[ 3682.408391] e1000: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[ 3682.500396] e1000: eth4 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[ 3682.708401] e1000: eth5 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
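The bus mode and the link speed for each interface are spread across different lines of this log. As a reading aid, here is a minimal sketch that joins them into one line per interface. `bus_and_link` is a hypothetical helper, and the two patterns it matches are assumptions based only on the e1000 message formats visible in this output.

```shell
#!/bin/sh
# bus_and_link (hypothetical helper): reads e1000 dmesg lines on stdin and
# prints one line per interface: name, negotiated bus mode, link speed/duplex.
bus_and_link() {
    awk '
        # Probe line, e.g. "... eth0: (PCI:66MHz:32-bit) 00:11:43:e0:e2:8c"
        match($0, /eth[0-9]+: \([^)]*\)/) {
            s = substr($0, RSTART, RLENGTH)
            split(s, p, ": ")
            if (!(p[1] in bus)) order[++n] = p[1]
            bus[p[1]] = substr(p[2], 2, length(p[2]) - 2)   # strip parentheses
        }
        # Link line, e.g. "... eth0 NIC Link is Up 100 Mbps Half Duplex, ..."
        match($0, /eth[0-9]+ NIC Link is Up [0-9]+ Mbps (Full|Half) Duplex/) {
            s = substr($0, RSTART, RLENGTH)
            split(s, w, " ")
            link[w[1]] = w[6] "Mbps/" w[8]
        }
        END {
            for (i = 1; i <= n; i++)
                print order[i], bus[order[i]], (order[i] in link ? link[order[i]] : "no-link-line")
        }
    '
}

# Demo with four lines copied from the output above.
bus_and_link <<'EOF'
[ 3678.586409] e1000 0000:06:07.0: eth0: (PCI:66MHz:32-bit) 00:11:43:e0:e2:8c
[ 3679.130239] e1000 0000:09:04.0: eth2: (PCI-X:133MHz:64-bit) 00:04:23:e1:77:6a
[ 3682.132415] e1000: eth0 NIC Link is Up 100 Mbps Half Duplex, Flow Control: None
[ 3682.316385] e1000: eth2 NIC Link is Up 100 Mbps Half Duplex, Flow Control: None
EOF
# prints:
#   eth0 PCI:66MHz:32-bit 100Mbps/Half
#   eth2 PCI-X:133MHz:64-bit 100Mbps/Half
```

On a live system the intended use would be `dmesg | grep e1000 | bus_and_link`. Run against the full log above, it would show eth0, eth1, and eth2 at 100 Mbps half duplex and eth3, eth4, and eth5 at 1000 Mbps full duplex.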
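At first I thought it was the NIC driver, but I'm not so sure anymore. I really don't know where else to look at this point.
I'd appreciate any help, as I'm struggling with this. If you need more information, just ask.
Thanks!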
[1] http://www.cs.fsu.edu/~baker/devices/lxr/http/source/linux/Documentation/networking/e1000.txt?v=2.6.11.8
[2] http://support.dell.com/support/edocs/systems/pe2850/en/ug/t1390aa.htm
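Answer 1
I plugged a laptop into each port and sent traffic at 30,000 PPS using hping3 --udp --flood -p 53 $ip. Testing a single NIC at a time, I saw 0% loss. I tried various combinations and narrowed the problem down to one NIC that had a hub attached. The problem persisted after replacing the cable. With the hub disconnected from that interface, everything works fine.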
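For anyone repeating this kind of test, the loss figure can be worked out from the number of packets the sender reports and the change in the receiver's packet counter. The sketch below is only an illustration: `loss_pct` and `rx_packets` are hypothetical helpers, and it assumes the interface under test carries little traffic besides the test stream, since the kernel counter includes everything received.

```shell
#!/bin/sh
# loss_pct SENT RECEIVED (hypothetical helper): percentage of sent packets
# that were not counted on the receiving side.
loss_pct() {
    awk -v s="$1" -v r="$2" 'BEGIN {
        if (s <= 0) { print "n/a"; exit }
        printf "%.2f\n", (s - r) * 100 / s
    }'
}

# rx_packets IFACE (hypothetical helper): current received-packet counter.
rx_packets() {
    cat "/sys/class/net/$1/statistics/rx_packets"
}

# Intended use on the sensor (commented out so the sketch runs anywhere):
#   start=$(rx_packets eth2)
#   ... run the sender, note how many packets it reports as transmitted ...
#   end=$(rx_packets eth2)
#   loss_pct "$sent" $((end - start))

# Arithmetic check with round numbers (not measurements from this system).
loss_pct 30000 30000    # prints 0.00
loss_pct 30000 29850    # prints 0.50
```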
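Also, regarding the PCI question: I removed the old PCI Asterisk card from the system, and the add-in NIC cards now run at PCI-X 133MHz. The two integrated NICs run at PCI 33MHz.
I'm still not sure whether booting the kernel with the following options has any drawbacks from a packet-capture standpoint: noacpi pci=noacpi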
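Answer 2
Sometimes there is noise on the cable company's "node". Check for that. It just happened with our cable company: odd packet-loss problems that appeared unpredictably. If you swapped the computer/NIC and that didn't fix it, and new Cat 5/power cables didn't either, this is quite likely the cause.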