I have an IBM x3850 (Type 8864) machine that boots fine with a 2.6.32 kernel. With a 3.10 or later kernel, however, the kernel fails to initialize all PCI slots (I can fix this manually, see below):
pci 0000:19:00.0: BAR 14: can't assign mem (size 0x1a00000)
pci 0000:19:00.0: BAR 13: can't assign io (size 0x3000)
pci 0000:19:00.0: BAR 14: can't assign mem (size 0x1600000)
pci 0000:19:00.0: BAR 13: can't assign io (size 0x3000)
pci 0000:1a:00.0: BAR 14: can't assign mem (size 0x1600000)
pci 0000:1a:00.0: BAR 13: assigned [io 0x7000-0x8fff]
pci 0000:1b:02.0: BAR 14: can't assign mem (size 0xa00000)
pci 0000:1b:04.0: BAR 14: can't assign mem (size 0xa00000)
pci 0000:1b:02.0: BAR 13: assigned [io 0x7000-0x7fff]
pci 0000:1b:04.0: BAR 13: assigned [io 0x8000-0x8fff]
...
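As a result, the drivers for my network cards do not load, because the PCI bus is evidently not set up correctly.
lspci produces the following output: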
00:00.0 Host bridge: IBM Calgary PCI-X Host Bridge (rev 04)
00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV100 [Radeon 7000 / Radeon VE]
00:03.0 USB controller: NEC Corporation OHCI USB Controller (rev 43)
00:03.1 USB controller: NEC Corporation OHCI USB Controller (rev 43)
00:03.2 USB controller: NEC Corporation uPD72010x USB 2.0 Controller (rev 04)
00:0f.0 Host bridge: Broadcom CSB6 South Bridge (rev a0)
00:0f.1 IDE interface: Broadcom CSB6 RAID/IDE Controller (rev a0)
00:0f.3 ISA bridge: Broadcom GCLE-2 Host Bridge
01:00.0 Host bridge: IBM Calgary PCI-X Host Bridge (rev 04)
01:01.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 10)
01:01.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 10)
01:02.0 RAID bus controller: Adaptec AAC-RAID (rev 02)
02:00.0 Host bridge: IBM Calgary PCI-X Host Bridge (rev 04)
06:00.0 Host bridge: IBM Calgary PCI-X Host Bridge (rev 04)
0a:00.0 PCI bridge: IBM CalIOC2 PCI-E Root Port (rev 01)
0f:00.0 PCI bridge: IBM CalIOC2 PCI-E Root Port (rev 01)
14:00.0 PCI bridge: IBM CalIOC2 PCI-E Root Port (rev 01)
19:00.0 PCI bridge: IBM CalIOC2 PCI-E Root Port (rev 01)
1a:00.0 PCI bridge: Integrated Device Technology, Inc. [IDT] PES12N3A PCI Express Switch (rev 0c)
1b:02.0 PCI bridge: Integrated Device Technology, Inc. [IDT] PES12N3A PCI Express Switch (rev 0c)
1b:04.0 PCI bridge: Integrated Device Technology, Inc. [IDT] PES12N3A PCI Express Switch (rev 0c)
1c:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
1c:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
1d:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
1d:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
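To find out which top-level bridge a given device sits behind, one can look at the resolved sysfs path of that device. The following is only a minimal sketch: the helper name pci_root_port is made up for illustration, and it assumes the usual sysfs layout in which /sys/bus/pci/devices/<address> is a symlink into /sys/devices/pci<domain>:<bus>/ followed by one directory per bridge.

```shell
# pci_root_port PATH
# Print the first PCI address (domain:bus:device.function) that appears in a
# resolved sysfs device path, i.e. the topmost bridge above the device.
# Hypothetical helper, not part of pciutils.
pci_root_port() {
    printf '%s\n' "$1" | tr '/' '\n' |
        grep -E '^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$' | head -n 1
}

# Usage on a live system, here for one of the Intel 82576 ports:
#   pci_root_port "$(readlink -f /sys/bus/pci/devices/0000:1c:00.0)"
```

If the layout assumption holds on this machine, the result for the 82576 ports should be the CalIOC2 root port they hang off.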
Fix
19:00.0 PCI bridge: IBM CalIOC2 PCI-E Root Port (rev 01)
In fact, I can fix it by removing the PCI root port:
echo 1 > /sys/bus/pci/devices/0000\:19\:00.0/remove
and then rescanning the bus:
echo 1 > /sys/bus/pci/rescan
This produces the following output:
pci_bus 0000:1c: busn_res: [bus 1c] is released
pci_bus 0000:1d: busn_res: [bus 1d] is released
pci_bus 0000:1b: busn_res: [bus 1b-1d] is released
pci_bus 0000:1a: busn_res: [bus 1a-1d] is released
pci 0000:19:00.0: [1014:0308] type 01 class 0x060401
pci 0000:19:00.0: supports D1 D2
pci 0000:19:00.0: PME# supported from D0 D1 D2 D3hot D3cold
pci 0000:1a:00.0: [111d:8018] type 01 class 0x060400
pci 0000:1a:00.0: PME# supported from D0 D3hot D3cold
pci 0000:19:00.0: pci bridge to [bus 1a-1d] (subtractive decode)
pci 0000:19:00.0: bridge window [mem 0xea800000-0xea9fffff 64bit pref]
pci 0000:19:00.0: bridge window [mem 0xea800000-0xebcfffff] (subtractive decode)
pci 0000:19:00.0: bridge window [io 0x7000-0x8fff] (subtractive decode)
pci 0000:1b:02.0: [111d:8018] type 01 class 0x060400
pci 0000:1b:02.0: PME# supported from D0 D3hot D3cold
pci 0000:1b:04.0: [111d:8018] type 01 class 0x060400
pci 0000:1b:04.0: PME# supported from D0 D3hot D3cold
pci 0000:1a:00.0: pci bridge to [bus 1b-1d]
pci 0000:1a:00.0: bridge window [io 0x7000-0x8fff]
pci 0000:1a:00.0: bridge window [mem 0xea800000-0xea9fffff 64bit pref]
....
pci 0000:1b:04.0: bridge window [mem 0xea900000-0xea9fffff 64bit pref]
pci 0000:1b:02.0: bridge window [mem 0x00100000-0x001fffff 64bit pref] to [bus 1c] add_size 100000
pci 0000:1b:04.0: bridge window [mem 0x00100000-0x001fffff 64bit pref] to [bus 1d] add_size 100000
pci 0000:1b:02.0: res[15]=[mem 0x00100000-0x001fffff 64bit pref] get_res_add_size add_size 100000
pci 0000:1b:04.0: res[15]=[mem 0x00100000-0x001fffff 64bit pref] get_res_add_size add_size 100000
pci 0000:1a:00.0: bridge window [mem 0x00100000-0x002fffff 64bit pref] to [bus 1b-1d] add_size 200000
pci 0000:19:00.0: bridge window [io 0x1000-0x2fff] to [bus 1a-1d] add_size 1000
pci 0000:1a:00.0: res[15]=[mem 0x00100000-0x002fffff 64bit pref] get_res_add_size add_size 200000
pci 0000:19:00.0: bridge window [mem 0x00100000-0x002fffff 64bit pref] to [bus 1a-1d] add_size 200000
pci 0000:19:00.0: bridge window [mem 0x00200000-0x015fffff] to [bus 1a-1d] add_size 200000
pci 0000:19:00.0: res[14]=[mem 0x00200000-0x015fffff] get_res_add_size add_size 200000
pci 0000:19:00.0: res[15]=[mem 0x00100000-0x002fffff 64bit pref] get_res_add_size add_size 200000
pci 0000:19:00.0: res[13]=[io 0x1000-0x2fff] get_res_add_size add_size 1000
pci 0000:19:00.0: BAR 14: can't assign mem (size 0x1600000)
pci 0000:19:00.0: BAR 15: assigned [mem 0xea800000-0xeabfffff 64bit pref]
pci 0000:19:00.0: BAR 13: can't assign io (size 0x3000)
pci 0000:19:00.0: BAR 14: assigned [mem 0xea800000-0xebbfffff]
pci 0000:19:00.0: BAR 15: can't assign mem pref (size 0x200000)
pci 0000:19:00.0: BAR 13: assigned [io 0x7000-0x8fff]
pci 0000:19:00.0: BAR 14: can't assign mem (size 0x1400000)
pci 0000:19:00.0: failed to add 200000 res[14]=[mem 0xea800000-0xebbfffff]
pci 0000:19:00.0: BAR 13: can't assign io (size 0x2000)
pci 0000:19:00.0: failed to add 1000 res[13]=[io 0x7000-0x8fff]
pci 0000:1a:00.0: res[15]=[mem 0x00100000-0x002fffff 64bit pref] get_res_add_size add_size 200000
pci 0000:1a:00.0: BAR 14: assigned [mem 0xea800000-0xebbfffff]
pci 0000:1a:00.0: BAR 15: can't assign mem pref (size 0x400000)
pci 0000:1a:00.0: BAR 13: assigned [io 0x7000-0x8fff]
pci 0000:1a:00.0: BAR 14: assigned [mem 0xea800000-0xebbfffff]
pci 0000:1a:00.0: BAR 15: can't assign mem pref (size 0x200000)
pci 0000:1b:02.0: res[15]=[mem 0x00100000-0x001fffff 64bit pref] get_res_add_size add_size 100000
pci 0000:1b:04.0: res[15]=[mem 0x00100000-0x001fffff 64bit pref] get_res_add_size add_size 100000
pci 0000:1b:02.0: BAR 14: assigned [mem 0xea800000-0xeb1fffff]
pci 0000:1b:04.0: BAR 14: assigned [mem 0xeb200000-0xebbfffff]
pci 0000:1b:02.0: BAR 15: can't assign mem pref (size 0x200000)
pci 0000:1b:04.0: BAR 15: can't assign mem pref (size 0x200000)
pci 0000:1b:02.0: BAR 13: assigned [io 0x7000-0x7fff]
pci 0000:1b:04.0: BAR 13: assigned [io 0x8000-0x8fff]
pci 0000:1b:02.0: BAR 14: assigned [mem 0xea800000-0xeb1fffff]
pci 0000:1b:04.0: BAR 14: assigned [mem 0xeb200000-0xebbfffff]
pci 0000:1b:02.0: BAR 15: can't assign mem pref (size 0x100000)
pci 0000:1b:04.0: BAR 15: can't assign mem pref (size 0x100000)
pci 0000:1c:00.0: reg 184: [mem 0xea800000-0xea803fff 64bit pref]
pci 0000:1c:00.0: reg 190: [mem 0xea820000-0xea823fff 64bit pref]
pci 0000:1c:00.0: reg 184: [mem 0xea800000-0xea803fff 64bit pref]
pci 0000:1c:00.0: reg 184: [mem 0xea800000-0xea803fff 64bit pref]
pci 0000:1c:00.0: reg 190: [mem 0xea820000-0xea823fff 64bit pref]
pci 0000:1c:00.0: reg 184: [mem 0xea800000-0xea803fff 64bit pref]
pci 0000:1c:00.0: reg 190: [mem 0xea820000-0xea823fff 64bit pref]
pci 0000:1c:00.1: reg 184: [mem 0xea840000-0xea843fff 64bit pref]
pci 0000:1c:00.0: reg 184: [mem 0xea800000-0xea803fff 64bit pref]
pci 0000:1c:00.0: reg 190: [mem 0xea820000-0xea823fff 64bit pref]
pci 0000:1c:00.1: reg 190: [mem 0xea860000-0xea863fff 64bit pref]
pci 0000:1c:00.0: reg 184: [mem 0xea800000-0xea803fff 64bit pref]
pci 0000:1c:00.0: reg 190: [mem 0xea820000-0xea823fff 64bit pref]
pci 0000:1c:00.1: reg 184: [mem 0xea840000-0xea843fff 64bit pref]
pci 0000:1c:00.0: res[7]=[mem 0xea800000-0xea7fffff 64bit pref] get_res_add_size add_size 20000
pci 0000:1c:00.0: res[10]=[mem 0xea820000-0xea81ffff 64bit pref] get_res_add_size add_size 20000
pci 0000:1c:00.1: res[7]=[mem 0xea840000-0xea83ffff 64bit pref] get_res_add_size add_size 20000
pci 0000:1c:00.1: res[10]=[mem 0xea860000-0xea85ffff 64bit pref] get_res_add_size add_size 20000
pci 0000:1c:00.0: BAR 1: assigned [mem 0xea800000-0xeabfffff]
pci 0000:1c:00.1: BAR 1: assigned [mem 0xeac00000-0xeaffffff]
pci 0000:1c:00.0: BAR 0: assigned [mem 0xeb000000-0xeb01ffff]
pci 0000:1c:00.1: BAR 0: assigned [mem 0xeb020000-0xeb03ffff]
pci 0000:1c:00.0: BAR 3: assigned [mem 0xeb040000-0xeb043fff]
pci 0000:1c:00.0: reg 184: [mem 0xea800000-0xea803fff 64bit pref]
pci 0000:1c:00.0: BAR 7: assigned [mem 0xeb044000-0xeb063fff 64bit pref]
pci 0000:1c:00.0: reg 190: [mem 0xea820000-0xea823fff 64bit pref]
pci 0000:1c:00.0: BAR 10: assigned [mem 0xeb064000-0xeb083fff 64bit pref]
pci 0000:1c:00.1: BAR 3: assigned [mem 0xeb084000-0xeb087fff]
pci 0000:1c:00.1: reg 184: [mem 0xea840000-0xea843fff 64bit pref]
pci 0000:1c:00.1: BAR 7: assigned [mem 0xeb088000-0xeb0a7fff 64bit pref]
pci 0000:1c:00.1: reg 190: [mem 0xea860000-0xea863fff 64bit pref]
pci 0000:1c:00.1: BAR 10: assigned [mem 0xeb0a8000-0xeb0c7fff 64bit pref]
pci 0000:1c:00.0: BAR 2: assigned [io 0x7000-0x701f]
pci 0000:1c:00.1: BAR 2: assigned [io 0x7020-0x703f]
pci 0000:1b:02.0: pci bridge to [bus 1c]
pci 0000:1b:02.0: bridge window [io 0x7000-0x7fff]
pci 0000:1b:02.0: bridge window [mem 0xea800000-0xeb1fffff]
pci 0000:1d:00.0: reg 184: [mem 0xea900000-0xea903fff 64bit pref]
pci 0000:1d:00.0: reg 190: [mem 0xea920000-0xea923fff 64bit pref]
pci 0000:1d:00.0: reg 184: [mem 0xea900000-0xea903fff 64bit pref]
pci 0000:1d:00.0: reg 184: [mem 0xea900000-0xea903fff 64bit pref]
pci 0000:1d:00.0: reg 190: [mem 0xea920000-0xea923fff 64bit pref]
pci 0000:1d:00.0: reg 184: [mem 0xea900000-0xea903fff 64bit pref]
pci 0000:1d:00.0: reg 190: [mem 0xea920000-0xea923fff 64bit pref]
pci 0000:1d:00.1: reg 184: [mem 0xea940000-0xea943fff 64bit pref]
pci 0000:1d:00.0: reg 184: [mem 0xea900000-0xea903fff 64bit pref]
pci 0000:1d:00.0: reg 190: [mem 0xea920000-0xea923fff 64bit pref]
pci 0000:1d:00.1: reg 190: [mem 0xea960000-0xea963fff 64bit pref]
pci 0000:1d:00.0: reg 184: [mem 0xea900000-0xea903fff 64bit pref]
pci 0000:1d:00.0: reg 190: [mem 0xea920000-0xea923fff 64bit pref]
pci 0000:1d:00.1: reg 184: [mem 0xea940000-0xea943fff 64bit pref]
pci 0000:1d:00.0: res[7]=[mem 0xea900000-0xea8fffff 64bit pref] get_res_add_size add_size 20000
pci 0000:1d:00.0: res[10]=[mem 0xea920000-0xea91ffff 64bit pref] get_res_add_size add_size 20000
pci 0000:1d:00.1: res[7]=[mem 0xea940000-0xea93ffff 64bit pref] get_res_add_size add_size 20000
pci 0000:1d:00.1: res[10]=[mem 0xea960000-0xea95ffff 64bit pref] get_res_add_size add_size 20000
pci 0000:1d:00.0: BAR 1: assigned [mem 0xeb400000-0xeb7fffff]
pci 0000:1d:00.1: BAR 1: assigned [mem 0xeb800000-0xebbfffff]
pci 0000:1d:00.0: BAR 0: assigned [mem 0xeb200000-0xeb21ffff]
pci 0000:1d:00.1: BAR 0: assigned [mem 0xeb220000-0xeb23ffff]
pci 0000:1d:00.0: BAR 3: assigned [mem 0xeb240000-0xeb243fff]
pci 0000:1d:00.0: reg 184: [mem 0xea900000-0xea903fff 64bit pref]
pci 0000:1d:00.0: BAR 7: assigned [mem 0xeb244000-0xeb263fff 64bit pref]
pci 0000:1d:00.0: reg 190: [mem 0xea920000-0xea923fff 64bit pref]
pci 0000:1d:00.0: BAR 10: assigned [mem 0xeb264000-0xeb283fff 64bit pref]
pci 0000:1d:00.1: BAR 3: assigned [mem 0xeb284000-0xeb287fff]
pci 0000:1d:00.1: reg 184: [mem 0xea940000-0xea943fff 64bit pref]
pci 0000:1d:00.1: BAR 7: assigned [mem 0xeb288000-0xeb2a7fff 64bit pref]
pci 0000:1d:00.1: reg 190: [mem 0xea960000-0xea963fff 64bit pref]
pci 0000:1d:00.1: BAR 10: assigned [mem 0xeb2a8000-0xeb2c7fff 64bit pref]
pci 0000:1d:00.0: BAR 2: assigned [io 0x8000-0x801f]
pci 0000:1d:00.1: BAR 2: assigned [io 0x8020-0x803f]
pci 0000:1b:04.0: pci bridge to [bus 1d]
pci 0000:1b:04.0: bridge window [io 0x8000-0x8fff]
pci 0000:1b:04.0: bridge window [mem 0xeb200000-0xebbfffff]
pci 0000:1a:00.0: pci bridge to [bus 1b-1d]
pci 0000:1a:00.0: bridge window [io 0x7000-0x8fff]
pci 0000:1a:00.0: bridge window [mem 0xea800000-0xebbfffff]
pci 0000:19:00.0: pci bridge to [bus 1a-1d]
pci 0000:19:00.0: bridge window [io 0x7000-0x8fff]
pci 0000:19:00.0: bridge window [mem 0xea800000-0xebbfffff]
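The manual remove-and-rescan sequence above could be wrapped in a small shell function, for example to call it from a boot-time script as a stopgap. This is a sketch, not a tested fix: the function name pci_remove_rescan and the optional second argument (an alternative sysfs root, there only so the function can be tried against a scratch directory) are my own additions.

```shell
# pci_remove_rescan ADDRESS [SYSFS_ROOT]
# Remove the given PCI device (and everything below it), then rescan all buses.
# SYSFS_ROOT defaults to /sys. On a real system this must run as root.
pci_remove_rescan() {
    bdf=$1
    sysfs=${2:-/sys}
    dev="$sysfs/bus/pci/devices/$bdf"

    # Refuse to continue if the device has no "remove" attribute.
    if [ ! -e "$dev/remove" ]; then
        echo "no such PCI device: $bdf" >&2
        return 1
    fi

    echo 1 > "$dev/remove" || return 1
    echo 1 > "$sysfs/bus/pci/rescan"
}

# Example: pci_remove_rescan 0000:19:00.0
```

Note that this only automates the workaround; it does not address why the initial resource assignment fails.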
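Question
Is it possible to somehow tell the kernel (e.g. via a parameter) to do this automatically? And what causes the problem in the first place?
Thanks in advance!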
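Update
Since the fix described above fails on 4.x systems (actually since 3.12, I think), I looked into the kernel. I found that when PCI ASPM is disabled (here it is disabled by ACPI; it can also be forced with pcie_aspm=off on the kernel command line), the following small patch (against 4.4.0) fixes a kernel NULL pointer dereference: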
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -552,11 +552,12 @@ static struct pcie_link_state *alloc_pcie_link_state(struct pci_dev *pdev)
 void pcie_aspm_init_link_state(struct pci_dev *pdev)
 {
 	struct pcie_link_state *link;
-	int blacklist = !!pcie_aspm_sanity_check(pdev);
-
+	int blacklist;
 	if (!aspm_support_enabled)
 		return;
+	blacklist = !!pcie_aspm_sanity_check(pdev);
+
 	if (pdev->link_state)
 		return;
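It is a bit odd that the sanity check runs even when the feature itself is disabled. The actual NULL pointer dereference happens in pcie_aspm_sanity_check, at this line:
list_for_each_entry(child, &pdev->subordinate->devices, bus_list) {
Is this a kernel bug?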
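Answer 1
> Is it possible to somehow tell the kernel (e.g. via a parameter) to do this automatically?
Try booting with pci=realloc
We had a similar problem when hot-plugging NVMe drives on an ancient motherboard, until we passed in that parameter.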
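How the parameter is made persistent depends on the boot loader. On GRUB 2 systems it is typically appended to GRUB_CMDLINE_LINUX in /etc/default/grub, after which the configuration is regenerated; the exact file and command vary by distribution, so treat that as an assumption. After rebooting, a quick check along these lines should confirm that the kernel actually saw the parameter (the helper name cmdline_has is invented for this sketch):

```shell
# cmdline_has PARAM [FILE]
# Succeed if PARAM appears as a whole word on the kernel command line.
# FILE defaults to /proc/cmdline; the argument exists only for testing.
cmdline_has() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# After rebooting with the parameter added in the boot loader:
#   cmdline_has pci=realloc && echo "pci=realloc is active"
```

The same check works for pcie_aspm=off, which the question mentions.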