New dual-port Ethernet card hangs the server when running ifconfig eth2 down


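What I am about to describe happens on two different servers with the same operating system, the same hardware, and the same hardware upgrade. IMHO it may be a driver bug, but I don't know how to fix it.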

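I am seeing some strange behaviour on a server based on a SuperMicro motherboard.
The server runs Red Hat Linux.
When I run "ifconfig eth2 down" the server hangs, and the same happens with eth3.
eth2 and eth3 belong to a new PCI card that was added last week. eth0 and
eth1 are integrated on the motherboard and use the igb driver. eth2
and eth3 are the new ports on the PCI card and use the e1000e driver.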
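
For anyone wanting to confirm which driver each port is bound to, here is a minimal sketch that follows the sysfs "driver" symlink for each interface. The helper name nic_driver and the SYSFS_NET variable are made up for this example, and it assumes the kernel exposes /sys/class/net/<interface>/device/driver.

```shell
#!/bin/sh
# Sketch: print the kernel driver bound to each interface by following the
# sysfs "driver" symlink. nic_driver and SYSFS_NET are hypothetical names
# used only for this example.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

nic_driver() {
    link="$SYSFS_NET/$1/device/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver name, e.g. .../drivers/e1000e
        basename "$(readlink "$link")"
    else
        # No such interface, or no driver symlink exposed for it
        echo "unknown"
    fi
}

for nic in eth0 eth1 eth2 eth3; do
    echo "$nic: $(nic_driver "$nic")"
done
```

On the setup described here, eth0 and eth1 should report igb and eth2 and eth3 should report e1000e.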

eth0 is configured as follows and works fine.

DEVICE=eth0
ONBOOT=yes
BOOTPROTO=none
IPADDR=10.0.16.49
NETMASK=255.255.255.0
NETWORK=10.0.16.0
HWADDR=00:xx:xx:xx:xx:5c

eth1 is configured as follows and works fine.

DEVICE=eth1
ONBOOT=yes
BOOTPROTO=none
IPADDR=192.168.16.46
NETMASK=255.255.255.0

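eth2 and eth3 have been configured in several ways, but to isolate the problem I connected them (one at a time) to a network with DHCP and ran dhclient on eth2 or eth3. The machine still hangs on ifconfig down, so it seems to me that the configuration doesn't matter.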

The modprobe.conf file looks like this:

alias eth0 igb
alias eth1 igb
alias scsi_hostadapter ahci
install vtune_drv /opt/intel/vtune/mknod_vtune.sh
remove vtune_drv /opt/intel/vtune/rmnod_vtune.sh
alias char-major-10-111 mdm

The igb and e1000e modules are loaded; I can see them with lsmod.
lsmod-->http://pastebin.com/jJ7kk8mn

lspci shows the following for the Ethernet controllers (the first two are eth0 and eth1):

01:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
01:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
03:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
03:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)

lspci-->http://pastebin.com/j94fWUPw
lspci -v-->http://pastebin.com/HRdMttzm

In case anyone cares, the BIOS information from dmidecode is:

Handle 0x0000, DMI type 0, 24 bytes.
BIOS Information
    Vendor: American Megatrends Inc.
    Version: R4222X52   
    Release Date: 09/23/2009
    Address: 0xF0000
    Runtime Size: 64 kB
    ROM Size: 4096 kB
    Characteristics:
        ISA is supported
        PCI is supported
        PNP is supported
        BIOS is upgradeable
        BIOS shadowing is allowed
        ESCD support is available
        Boot from CD is supported
        Selectable boot is supported
        BIOS ROM is socketed
        EDD is supported
        5.25"/1.2 MB floppy services are supported (int 13h)
        3.5"/720 KB floppy services are supported (int 13h)
        3.5"/2.88 MB floppy services are supported (int 13h)
        Print screen service is supported (int 5h)
        8042 keyboard services are supported (int 9h)
        Serial services are supported (int 14h)
        Printer services are supported (int 17h)
        CGA/mono video services are supported (int 10h)
        ACPI is supported
        USB legacy is supported
        LS-120 boot is supported
        ATAPI Zip drive boot is supported
        BIOS boot specification is supported
        Targeted content distribution is supported
    BIOS Revision: 8.15

From my point of view boot.log shows nothing interesting, but here it is anyway:

Aug  9 23:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
Aug 10 00:00:01 s_sys@myserver45 IOCMDSTAT: CHECK
Aug 10 00:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
Aug 10 01:00:01 s_sys@myserver45 IOCMDSTAT: CHECK
Aug 10 01:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
Aug 10 02:00:02 s_sys@myserver45 IOCMDSTAT: CHECK
Aug 10 02:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
Aug 10 03:00:02 s_sys@myserver45 IOCMDSTAT: CHECK
Aug 10 03:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
Aug 10 04:00:02 s_sys@myserver45 IOCMDSTAT: CHECK
Aug 10 04:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
Aug 10 05:00:01 s_sys@myserver45 IOCMDSTAT: CHECK
Aug 10 05:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
Aug 10 06:00:01 s_sys@myserver45 IOCMDSTAT: CHECK
Aug 10 06:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
Aug 10 07:00:01 s_sys@myserver45 IOCMDSTAT: CHECK
Aug 10 07:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
Aug 10 08:00:01 s_sys@myserver45 IOCMDSTAT: CHECK
Aug 10 08:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
Aug 10 09:00:01 s_sys@myserver45 IOCMDSTAT: CHECK
Aug 10 09:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
Aug 10 10:00:01 s_sys@myserver45 IOCMDSTAT: CHECK
Aug 10 10:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
Aug 10 11:00:01 s_sys@myserver45 IOCMDSTAT: CHECK
Aug 10 11:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
Aug 10 11:08:37 s_sys@myserver45 NET[22300]: /sbin/dhclient-script : updated /etc/resolv.conf
Aug 10 11:15:29 s_sys@myserver45 IOSIGNAL: BOOT nb_io_adapters=1|nb_local_disks=2
Aug 10 11:15:29 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK

/var/log/messages -->http://pastebin.com/wBQL1ESE

/var/log/kernel/info -->http://pastebin.com/3KzF9Hhu

I don't know what else might be useful, so please tell me.

Answer 1

Are you currently running a 2.6.18 kernel?

Perhaps it suffers from the same problem:

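MSI-X Issues with Kernels between 2.6.19 - 2.6.21 (inclusive)

Kernel panics and instability may be observed on any MSI-X hardware if you use irqbalance with kernels between 2.6.19 and 2.6.21. If you encounter such problems, you can disable the irqbalance daemon or upgrade to a newer kernel.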
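
To see where a given machine stands relative to that range, here is a rough sketch that compares the output of uname -r against 2.6.19 - 2.6.21. The function name msix_affected_kernel is made up for this example. Treat the result as a hint only: distribution kernels such as Red Hat's 2.6.18 carry backported patches, so the version number alone does not settle whether the problem applies.

```shell
#!/bin/sh
# Sketch: return 0 if a kernel release string falls in the 2.6.19 - 2.6.21
# range (inclusive), 1 otherwise. msix_affected_kernel is a hypothetical
# helper name, not part of any existing tool.
msix_affected_kernel() {
    release=$1
    # Drop the distribution suffix, e.g. "2.6.18-194.el5" -> "2.6.18"
    base=${release%%-*}
    major=$(echo "$base" | cut -d. -f1)
    minor=$(echo "$base" | cut -d. -f2)
    patch=$(echo "$base" | cut -d. -f3)
    # Anything empty or non-numeric counts as "not in range" instead of guessing
    for part in "$major" "$minor" "$patch"; do
        case $part in
            ''|*[!0-9]*) return 1 ;;
        esac
    done
    [ "$major" -eq 2 ] && [ "$minor" -eq 6 ] &&
        [ "$patch" -ge 19 ] && [ "$patch" -le 21 ]
}

if msix_affected_kernel "$(uname -r)"; then
    echo "kernel $(uname -r) is inside the 2.6.19 - 2.6.21 range"
else
    echo "kernel $(uname -r) is outside the 2.6.19 - 2.6.21 range"
fi
```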

The text quoted above is from Intel's latest e1000e README, so try disabling irqbalance.
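
A minimal sketch of what disabling irqbalance could look like, assuming a Red Hat style init where the daemon is managed with service and chkconfig. The run and disable_irqbalance helpers and the DRY_RUN variable are made up for this example. Dry run is the default, so nothing changes until you set DRY_RUN=0 and run it as root.

```shell
#!/bin/sh
# Sketch: stop irqbalance now and keep it from starting at the next boot.
# Assumes a Red Hat style init (service/chkconfig). run, disable_irqbalance
# and DRY_RUN are hypothetical names used only for this example.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        # Only show what would be executed
        echo "would run: $*"
    else
        "$@"
    fi
}

disable_irqbalance() {
    run service irqbalance stop     # stop the running daemon
    run chkconfig irqbalance off    # do not start it at boot
}

disable_irqbalance
```

After disabling it for real, retrying ifconfig eth2 down would show whether the hang is related to irqbalance.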
