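My system runs for a while and then eventually ends up with a "read-only file system". The only way to fix it is to reboot the machine and hope it lets me run fsck without getting stuck in some kind of kernel panic loop, in which case all I can do is pray that the next reboot clears it. Right before it happens, I get messages like these: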
[ 66.065416] xhci_hcd 0000:3e:00.0: PCI post-resume error -19!
[ 66.065425] xhci_hcd 0000:3e:00.0: HC died; cleaning up
[ 66.065571] xhci_hcd 0000:3e:00.0: remove, state 4
[ 66.065579] usb usb4: USB disconnect, device number 1
[ 66.066065] xhci_hcd 0000:3e:00.0: USB bus 4 deregistered
[ 66.066075] xhci_hcd 0000:3e:00.0: remove, state 4
[ 66.066080] usb usb3: USB disconnect, device number 1
[ 66.066928] xhci_hcd 0000:3e:00.0: Host halt failed, -19
[ 66.066933] xhci_hcd 0000:3e:00.0: Host not accessible, reset failed.
[ 66.067109] xhci_hcd 0000:3e:00.0: USB bus 3 deregistered
[ 66.613768] pci_bus 0000:07: Allocating resources
[ 66.613793] pcieport 0000:07:01.0: bridge window [io 0x1000-0x0fff] to [bus 09-3d] add_size 1000
[ 66.613797] pcieport 0000:07:02.0: bridge window [io 0x1000-0x0fff] to [bus 3e] add_size 1000
[ 66.613802] pcieport 0000:07:02.0: bridge window [mem 0x00100000-0x000fffff 64bit pref] to [bus 3e] add_size 200000 add_align 100000
[ 66.613807] pcieport 0000:06:00.0: bridge window [io 0x1000-0x0fff] to [bus 07-3e] add_size 3000
[ 66.613815] pcieport 0000:06:00.0: BAR 13: no space for [io size 0x3000]
[ 66.613817] pcieport 0000:06:00.0: BAR 13: failed to assign [io size 0x3000]
[ 66.613821] pcieport 0000:06:00.0: BAR 13: no space for [io size 0x3000]
[ 66.613824] pcieport 0000:06:00.0: BAR 13: failed to assign [io size 0x3000]
[ 66.613834] pcieport 0000:07:02.0: BAR 15: no space for [mem size 0x00200000 64bit pref]
[ 66.613837] pcieport 0000:07:02.0: BAR 15: failed to assign [mem size 0x00200000 64bit pref]
[ 66.613840] pcieport 0000:07:01.0: BAR 13: no space for [io size 0x1000]
[ 66.613842] pcieport 0000:07:01.0: BAR 13: failed to assign [io size 0x1000]
[ 66.613845] pcieport 0000:07:02.0: BAR 13: no space for [io size 0x1000]
[ 66.613847] pcieport 0000:07:02.0: BAR 13: failed to assign [io size 0x1000]
[ 66.613853] pcieport 0000:07:02.0: BAR 15: no space for [mem size 0x00200000 64bit pref]
[ 66.613856] pcieport 0000:07:02.0: BAR 15: failed to assign [mem size 0x00200000 64bit pref]
[ 66.613859] pcieport 0000:07:02.0: BAR 13: no space for [io size 0x1000]
[ 66.613861] pcieport 0000:07:02.0: BAR 13: failed to assign [io size 0x1000]
[ 66.613864] pcieport 0000:07:01.0: BAR 13: no space for [io size 0x1000]
[ 66.613866] pcieport 0000:07:01.0: BAR 13: failed to assign [io size 0x1000]
[ 70.910819] pcieport 0000:00:1d.6: AER: Corrected error received: 0000:00:1d.6
[ 70.910833] pcieport 0000:00:1d.6: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 70.910842] pcieport 0000:00:1d.6: AER: device [8086:a11e] error status/mask=00002001/00002000
[ 70.910847] pcieport 0000:00:1d.6: AER: [ 0] RxErr
[ 70.984368] pcieport 0000:07:00.0: Refused to change power state, currently in D3
[ 70.986977] pci_bus 0000:08: busn_res: [bus 08] is released
[ 70.987115] pci_bus 0000:09: busn_res: [bus 09-3d] is released
[ 70.987219] pci_bus 0000:3e: busn_res: [bus 3e] is released
[ 70.987318] pci_bus 0000:07: busn_res: [bus 07-3e] is released
Sometimes I also see messages about traps: udevd
Machine:
Dell XPS 15 - 9560 (Purchased in 2017)
Ubuntu 18.04
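What I have tried so far:
- A fresh install
- Fresh installs of other Ubuntu versions/derivatives (19.04, 19.10, 20.04, and the Pop OS derivative)
- Replacing the SSD
Note: sometimes the installers themselves fail with a kernel panic, and I don't know why (I have used the same USB key on another machine)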
Update
Output of lspci -tv
-[0000:00]-+-00.0 Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers
+-01.0-[01]----00.0 NVIDIA Corporation GP107M [GeForce GTX 1050 Mobile]
+-02.0 Intel Corporation HD Graphics 630
+-04.0 Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Thermal Subsystem
+-14.0 Intel Corporation 100 Series/C230 Series Chipset Family USB 3.0 xHCI Controller
+-14.2 Intel Corporation 100 Series/C230 Series Chipset Family Thermal Subsystem
+-15.0 Intel Corporation 100 Series/C230 Series Chipset Family Serial IO I2C Controller #0
+-15.1 Intel Corporation 100 Series/C230 Series Chipset Family Serial IO I2C Controller #1
+-16.0 Intel Corporation 100 Series/C230 Series Chipset Family MEI Controller #1
+-17.0 Intel Corporation HM170/QM170 Chipset SATA Controller [AHCI Mode]
+-1c.0-[02]----00.0 Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter
+-1c.1-[03]----00.0 Realtek Semiconductor Co., Ltd. RTS525A PCI Express Card Reader
+-1d.0-[04]----00.0 Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983
+-1d.4-[05]--
+-1d.6-[06-3e]--
+-1f.0 Intel Corporation HM175 Chipset LPC/eSPI Controller
+-1f.2 Intel Corporation 100 Series/C230 Series Chipset Family Power Management Controller
+-1f.3 Intel Corporation CM238 HD Audio Controller
\-1f.4 Intel Corporation 100 Series/C230 Series Chipset Family SMBus
Output of sudo dmidecode -s bios-version
1.15.0
Output of lspci -nn
00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers [8086:5910] (rev 05)
00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 05)
00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics 630 [8086:591b] (rev 04)
00:04.0 Signal processing controller [1180]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Thermal Subsystem [8086:1903] (rev 05)
00:14.0 USB controller [0c03]: Intel Corporation 100 Series/C230 Series Chipset Family USB 3.0 xHCI Controller [8086:a12f] (rev 31)
00:14.2 Signal processing controller [1180]: Intel Corporation 100 Series/C230 Series Chipset Family Thermal Subsystem [8086:a131] (rev 31)
00:15.0 Signal processing controller [1180]: Intel Corporation 100 Series/C230 Series Chipset Family Serial IO I2C Controller #0 [8086:a160] (rev 31)
00:15.1 Signal processing controller [1180]: Intel Corporation 100 Series/C230 Series Chipset Family Serial IO I2C Controller #1 [8086:a161] (rev 31)
00:16.0 Communication controller [0780]: Intel Corporation 100 Series/C230 Series Chipset Family MEI Controller #1 [8086:a13a] (rev 31)
00:17.0 SATA controller [0106]: Intel Corporation HM170/QM170 Chipset SATA Controller [AHCI Mode] [8086:a103] (rev 31)
00:1c.0 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #1 [8086:a110] (rev f1)
00:1c.1 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #2 [8086:a111] (rev f1)
00:1d.0 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #9 [8086:a118] (rev f1)
00:1d.4 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #13 [8086:a11c] (rev f1)
00:1d.6 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #15 [8086:a11e] (rev f1)
00:1f.0 ISA bridge [0601]: Intel Corporation HM175 Chipset LPC/eSPI Controller [8086:a152] (rev 31)
00:1f.2 Memory controller [0580]: Intel Corporation 100 Series/C230 Series Chipset Family Power Management Controller [8086:a121] (rev 31)
00:1f.3 Audio device [0403]: Intel Corporation CM238 HD Audio Controller [8086:a171] (rev 31)
00:1f.4 SMBus [0c05]: Intel Corporation 100 Series/C230 Series Chipset Family SMBus [8086:a123] (rev 31)
01:00.0 3D controller [0302]: NVIDIA Corporation GP107M [GeForce GTX 1050 Mobile] [10de:1c8d] (rev a1)
02:00.0 Network controller [0280]: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter [168c:003e] (rev 32)
03:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS525A PCI Express Card Reader [10ec:525a] (rev 01)
04:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 [144d:a808]
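Answer 1
AER errors
You're getting AER (Advanced Error Reporting) corrected errors on device 8086:a11e, which is a PCIe root port. The lspci -nn and lspci -tv output shows that device 1d.6 is the suspect, but strangely, nothing appears to be connected to it.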
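To see how often each port reports AER errors, the kernel log can be summarized per device. The following is a minimal sketch: `count_aer` is a hypothetical helper name, and it assumes the log lines have the same format as the ones pasted in the question.

```shell
#!/bin/sh
# count_aer (hypothetical helper): read a kernel log on stdin and print
# "<PCI address> <severity> <count>" for every port that reported an AER bus error.
count_aer() {
    grep 'AER: PCIe Bus Error' |
        sed -E 's/.*pcieport ([0-9a-f:.]+): AER: PCIe Bus Error: severity=([A-Za-z]+).*/\1 \2/' |
        sort | uniq -c | awk '{print $2, $3, $1}'
}

# Example using one line copied from the log in the question.
printf '%s\n' '[ 70.910833] pcieport 0000:00:1d.6: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)' | count_aer
# prints: 0000:00:1d.6 Corrected 1
```

On the affected machine the same function could be fed from the live log with `sudo dmesg | count_aer`, and `sudo lspci -vv -s 00:1d.6` shows the detailed state of the suspect port.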
What has been tried so far:
- A fresh install
- Fresh installs of other Ubuntu versions/derivatives (19.04, 19.10, 20.04, and the Pop OS derivative)
- Replacing the SSD
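Normally I would point you to adding a kernel parameter that silences the AER noise (and I may still do that later), but in this case I think we need to check a few other things first.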
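The answer does not name the kernel parameter. One option that exists for this purpose is `pci=noaer`, which disables AER reporting; treating it as the parameter meant here is an assumption, and it only hides the messages rather than fixing their cause. A minimal sketch for checking whether a parameter is already on the kernel command line (`has_param` is a hypothetical helper name):

```shell
#!/bin/sh
# has_param (hypothetical helper): succeed if the kernel command line read
# from stdin contains the given parameter as a whole word.
has_param() {
    tr ' ' '\n' | grep -qxF -- "$1"
}

# Example with a made-up command line; on a real system use: has_param pci=noaer < /proc/cmdline
if printf '%s\n' 'BOOT_IMAGE=/vmlinuz ro quiet splash' | has_param pci=noaer; then
    echo "pci=noaer is set"
else
    echo "pci=noaer is not set"
fi
```

On Ubuntu, a parameter like this is normally added to `GRUB_CMDLINE_LINUX_DEFAULT` in `/etc/default/grub`, followed by `sudo update-grub` and a reboot.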
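Memory
Go to https://www.memtest86.com/ and download and run their free memory test to check your RAM. Let all 4/4 test passes complete at least once to confirm that the memory is good. This may take several hours.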
BIOS
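Note: back up your important files before upgrading the BIOS
You currently have BIOS 1.15.0. The current BIOS is 1.18.0, which can be downloaded here. The BIOS update procedure for Ubuntu is described here.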
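Whether the installed BIOS is behind the latest release can also be checked in a script. This is a minimal sketch: `version_lt` is a hypothetical helper, and it relies on `sort -V` (version sort) from GNU coreutils. The two version strings are the ones quoted above.

```shell
#!/bin/sh
# version_lt (hypothetical helper): succeed when version $1 is older than version $2.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

installed="1.15.0"   # reported by dmidecode in the question
latest="1.18.0"      # version quoted in this answer

if version_lt "$installed" "$latest"; then
    echo "BIOS $installed is older than $latest"
fi
# prints: BIOS 1.15.0 is older than 1.18.0
```

On the machine itself, `installed` could instead be filled in with `installed="$(sudo dmidecode -s bios-version)"`.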