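I have an IBM BladeCenter blade running Fedora 15. Everything works with the 2.6.38 kernel, but when I boot the 2.6.40 kernel, I lose connectivity after a few seconds. Removing and reinserting the tg3 module restores connectivity for about five seconds, after which it is lost again.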
These are dual-Xeon blades. This one has two 2.8GHz Xeons with HT and 2GB of RAM. The blade is an 8832-L1X running BIOS version 1.13. The output of lspci is:
00:00.0 Host bridge: Broadcom CMIC-LE Host Bridge (GC-LE chipset) (rev 33)
00:00.1 Host bridge: Broadcom CMIC-LE Host Bridge (GC-LE chipset)
00:00.2 Host bridge: Broadcom CMIC-LE Host Bridge (GC-LE chipset)
00:01.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
00:0f.0 Host bridge: Broadcom CSB6 South Bridge (rev b0)
00:0f.1 IDE interface: Broadcom CSB6 RAID/IDE Controller (rev b0)
00:0f.2 USB Controller: Broadcom CSB6 OHCI USB Controller (rev 05)
00:0f.3 ISA bridge: Broadcom GCLE-2 Host Bridge
00:10.0 Host bridge: Broadcom CIOB-E I/O Bridge with Gigabit Ethernet (rev 12)
00:10.2 Host bridge: Broadcom CIOB-E I/O Bridge with Gigabit Ethernet (rev 12)
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5704S Gigabit Ethernet (rev 02)
01:00.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5704S Gigabit Ethernet (rev 02)
Here is a dmesg excerpt from kernel-PAE-2.6.38.8-32.fc15.i686 (working):
[11.545123] tg3.c:v3.116 (December 3, 2010)
[11.545152] tg3 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[11.599499] tg3 0000:01:00.0: eth0: Tigon3 [partno(BCM95704A41) rev 2002] (PCIX:133MHz:64-bit) MAC address
[11.599510] tg3 0000:01:00.0: eth0: attached PHY is serdes (1000Base-SX Ethernet) (WireSpeed[0])
[11.599518] tg3 0000:01:00.0: eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[0]
[11.599525] tg3 0000:01:00.0: eth0: dma_rwctrl[769f4000] dma_mask[64-bit]
[11.599577] tg3 0000:01:00.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17
[11.627997] tg3 0000:01:00.1: eth1: Tigon3 [partno(BCM95704A41) rev 2002] (PCIX:133MHz:64-bit) MAC address
[11.628066] tg3 0000:01:00.1: eth1: attached PHY is serdes (1000Base-SX Ethernet) (WireSpeed[0])
[11.628074] tg3 0000:01:00.1: eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
[11.628082] tg3 0000:01:00.1: eth1: dma_rwctrl[769f4000] dma_mask[64-bit]
[22.000286] tg3 0000:01:00.0: eth0: Link is up at 1000 Mbps, full duplex
[22.000294] tg3 0000:01:00.0: eth0: Flow control is off for TX and off for RX
Here is one from kernel-PAE-2.6.40.6-0.fc15.i686 (not working):
[10.262141] tg3.c:v3.119 (May 18, 2011)
[10.262177] tg3 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[10.309325] tg3 0000:01:00.0: eth0: Tigon3 [partno(BCM95704A41) rev 2002] (PCIX:133MHz:64-bit) MAC address
[10.309336] tg3 0000:01:00.0: eth0: attached PHY is serdes (1000Base-SX Ethernet) (WireSpeed[0], EEE[0])
[10.309344] tg3 0000:01:00.0: eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
[10.309351] tg3 0000:01:00.0: eth0: dma_rwctrl[769f4000] dma_mask[64-bit]
[10.309431] tg3 0000:01:00.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17
[10.361613] tg3 0000:01:00.1: eth1: Tigon3 [partno(BCM95704A41) rev 2002] (PCIX:133MHz:64-bit) MAC address
[10.361624] tg3 0000:01:00.1: eth1: attached PHY is serdes (1000Base-SX Ethernet) (WireSpeed[0], EEE[0])
[10.361633] tg3 0000:01:00.1: eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
[10.361640] tg3 0000:01:00.1: eth1: dma_rwctrl[769f4000] dma_mask[64-bit]
[21.054276] tg3 0000:01:00.0: eth0: Link is up at 1000 Mbps, full duplex
[21.054284] tg3 0000:01:00.0: eth0: Flow control is off for TX and off for RX
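To make the two excerpts easier to compare, the timestamps can be stripped before diffing. This is a minimal sketch: the helper names strip_ts and dmesg_diff and the file names in the usage comment are hypothetical, with each file assumed to hold one of the excerpts above. On these excerpts, the lines that would remain different are the driver version (v3.116 vs. v3.119), the new EEE[0] field, and TSOcap on eth0 (0 on 2.6.38, 1 on 2.6.40). Whether any of these relates to the failure is not established here.

```shell
# strip_ts: remove the leading "[seconds.micros] " dmesg timestamp
# from every line read on stdin.
strip_ts() {
    sed 's/^\[[^]]*] *//'
}

# dmesg_diff OLD NEW: diff two saved dmesg excerpts, ignoring timestamps.
dmesg_diff() {
    a=$(mktemp)
    b=$(mktemp)
    strip_ts < "$1" > "$a"
    strip_ts < "$2" > "$b"
    diff "$a" "$b"
    rm -f "$a" "$b"
}

# Usage (hypothetical file names):
#   dmesg_diff dmesg-2.6.38.txt dmesg-2.6.40.txt
```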
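Again, the symptom is that networking works fine for a few seconds and then stops completely. Doing an rmmod tg3 followed by a modprobe tg3 restores connectivity for a few seconds. Nothing unusual shows up in any of the logs.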
I tried a few kernel-option workarounds. I don't remember exactly which ones, but I know I tried acpi=noirq, acpi=ht, irqpoll, and noapic.
modinfo, minus the many alias lines, shows:
filename: /lib/modules/2.6.40.6-0.fc15.i686.PAE/kernel/drivers/net/tg3.ko
firmware: tigon/tg3_tso5.bin
firmware: tigon/tg3_tso.bin
firmware: tigon/tg3.bin
version: 3.119
license: GPL
description: Broadcom Tigon3 ethernet driver
author: David S. Miller ([email protected]) and Jeff Garzik ([email protected])
srcversion: 389C3BA89E4ECF8460A74C0
depends:
vermagic: 2.6.40.6-0.fc15.i686.PAE SMP mod_unload 686
parm: tg3_debug:Tigon3 bitmapped debugging message enable value (int)
Fedora reports the kernel version as 2.6.40 "for compatibility with older userspace."
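Update: Doing an ifconfig eth0 down followed by a fresh ifconfig to bring it back up, and then adding the appropriate routes, makes everything work indefinitely. Doing an rmmod tg3 followed by a modprobe tg3 makes it work for a few seconds and then break again. So at least I now have a workaround: adding /etc/rc.d/init.d/network restart to rc.local. I would still like to know what is going wrong and whether there is a proper fix.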
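The manual sequence from the update (interface down, interface up, routes re-added) could be sketched as a small script. Only the interface name eth0 comes from the post; the address, netmask, and gateway are placeholder values from the documentation range, and the RUN dry-run variable is an addition for illustration. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the down/up/route workaround. All addresses are placeholders.
IFACE=eth0
ADDR=192.0.2.10
NETMASK=255.255.255.0
GATEWAY=192.0.2.1

# RUN defaults to "echo" (dry run: print commands only).
# Export RUN as an empty string to really execute them (needs root).
RUN=${RUN-echo}

bounce_iface() {
    $RUN ifconfig "$IFACE" down
    $RUN ifconfig "$IFACE" "$ADDR" netmask "$NETMASK" up
    $RUN route add default gw "$GATEWAY" "$IFACE"
}

bounce_iface
```

Called from rc.local with RUN set to an empty string, this would take the place of the full network restart, assuming a static configuration like the placeholder one shown.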