Why does the IGB driver hang on Ubuntu 18?

Feb  6 06:10:17 server3 kernel: [2846318.756928] ata9: SATA link down (SStatus 0 SControl 300)
Feb  6 06:10:17 server3 kernel: [2846319.496917] igb 0000:05:00.0 enp5s0: PCIe link lost
Feb  6 06:10:17 server3 kernel: [2846319.498118] ------------[ cut here ]------------
Feb  6 06:10:17 server3 kernel: [2846319.498121] igb: Failed to read reg 0xc030!
Feb  6 06:10:17 server3 kernel: [2846319.498221] WARNING: CPU: 1 PID: 1897 at drivers/net/ethernet/intel/igb/igb_main.c:747 igb_rd32.cold+0x3a/0x46 [igb]
Feb  6 06:10:17 server3 kernel: [2846319.498223] Modules linked in: cpuid ipmi_ssif intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp core$
Feb  6 06:10:17 server3 kernel: [2846319.498322] CPU: 1 PID: 1897 Comm: kworker/1:1 Tainted: G        W         5.10.1-051001-generic #202012142031
Feb  6 06:10:17 server3 kernel: [2846319.498325] Hardware name: ASUSTeK COMPUTER INC. TS500-E8-PS4 V2/Z10PA-D8 Series, BIOS 3208 12/09/2016
Feb  6 06:10:17 server3 kernel: [2846319.498336] Workqueue: events igb_watchdog_task [igb]
Feb  6 06:10:17 server3 kernel: [2846319.498350] RIP: 0010:igb_rd32.cold+0x3a/0x46 [igb]
Feb  6 06:10:17 server3 kernel: [2846319.498356] Code: c7 c6 1c 04 3b c0 e8 22 5f e5 ca 48 8b bb 30 ff ff ff e8 5f d7 88 ca 84 c0 74 16 44 89 ee 48 c7 c7 78 10 3b c0 e$
Feb  6 06:10:17 server3 kernel: [2846319.498358] RSP: 0018:ffff9f9b49743dd0 EFLAGS: 00010286
Feb  6 06:10:17 server3 kernel: [2846319.498362] RAX: 0000000000000000 RBX: ffff8f5089b34ed0 RCX: ffff8f51f7c58a48
Feb  6 06:10:17 server3 kernel: [2846319.498364] RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff8f51f7c58a40
Feb  6 06:10:17 server3 kernel: [2846319.498366] RBP: ffff9f9b49743de8 R08: 0000000000000000 R09: ffff9f9b49743bb0
Feb  6 06:10:17 server3 kernel: [2846319.498368] R10: ffff9f9b49743ba8 R11: ffffffff8c32a6e8 R12: 00000000ffffffff
Feb  6 06:10:17 server3 kernel: [2846319.498370] R13: 000000000000c030 R14: 0000000000000000 R15: ffff8f5089ac1b40
Feb  6 06:10:17 server3 kernel: [2846319.498373] FS:  0000000000000000(0000) GS:ffff8f51f7c40000(0000) knlGS:0000000000000000
Feb  6 06:10:17 server3 kernel: [2846319.498375] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb  6 06:10:17 server3 kernel: [2846319.498377] CR2: 0000557ec1149d60 CR3: 000000035f810005 CR4: 00000000003706e0
Feb  6 06:10:17 server3 kernel: [2846319.498379] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000

My server logged the error above and its network connection dropped. Please help.

Answer 1

Two possible causes:

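  1. A power-management hardware fault, possibly caused by an old or buggy BIOS. Booting with the kernel parameter `pcie_port_pm=off` should work around it (but will not fix it). If that helps, add the parameter to the `GRUB_CMDLINE_LINUX_DEFAULT` line (`sudo gedit /etc/default/grub`) and run `sudo update-grub`. It may have side effects, so test it first from the GRUB command line at boot.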

  2. A driver bug. If that is the cause, reloading the driver should bring the interface back:

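    # Unload and reload the igb driver, then restart networking (run as root)
    modprobe -r igb
    sleep 1
    modprobe igb
    sleep 1
    # The unit name depends on the setup; on Ubuntu 18.04 it is usually
    # systemd-networkd or NetworkManager rather than "network"
    systemctl restart network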
    

    In that case, you may need to add these lines somewhere in the boot process.
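For the first workaround, the edit to `/etc/default/grub` can also be scripted instead of made by hand. The following is only a minimal sketch: `add_kernel_param` is a hypothetical helper name (not part of GRUB), and it assumes GNU `sed` and a `GRUB_CMDLINE_LINUX_DEFAULT="..."` line that uses double quotes, as in the stock Ubuntu file.

```shell
#!/bin/sh
# Hypothetical helper: append a kernel parameter to the
# GRUB_CMDLINE_LINUX_DEFAULT line of a grub defaults file.
add_kernel_param() {
    file=$1
    param=$2
    # Do nothing if the parameter is already on that line
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*$param" "$file"; then
        return 0
    fi
    # Insert the parameter just before the closing double quote
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 $param\"/" "$file"
}

# Typical use on the real file (as root), then regenerate grub.cfg:
#   cp /etc/default/grub /etc/default/grub.bak
#   add_kernel_param /etc/default/grub pcie_port_pm=off
#   update-grub
# After rebooting, check that the parameter is active:
#   grep -o 'pcie_port_pm=[a-z]*' /proc/cmdline
```

Keeping a backup of the file and testing the parameter once from the GRUB menu before making it permanent limits the risk if it turns out to have side effects on this hardware.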
