System lags/freezes, dmesg shows GPU errors and spurious USB disconnects - what should I do?

I am running GNU/Linux Mint 18.1 64-bit on an Intel i5 3570K machine with an nVIDIA-based discrete graphics card.

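Recently there was a power outage in my neighborhood. Afterwards I could not boot my machine and had to replace my PSU. The machine now boots and works. However, I am getting strange error messages in the log concerning my GPU, as well as supposed disconnections and reconnections of some USB devices (which were never actually unplugged):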

[  167.367247] NVRM: Xid (PCI:0000:02:00): 13, Graphics SM Warp Exception on (GPC 0, TPC 0): Out Of Range Address
[  167.367254] NVRM: Xid (PCI:0000:02:00): 13, Graphics SM Global Exception on (GPC 0, TPC 0): Physical Multiple Warp Errors
[  167.367260] NVRM: Xid (PCI:0000:02:00): 13, Graphics Exception: ESR 0x504648=0x15000e 0x504650=0x24 0x504644=0x13eff2 0x50464c=0x7f
[  167.367293] NVRM: Xid (PCI:0000:02:00): 13, Graphics SM Warp Exception on (GPC 1, TPC 0): Out Of Range Address
[  167.367296] NVRM: Xid (PCI:0000:02:00): 13, Graphics SM Global Exception on (GPC 1, TPC 0): Physical Multiple Warp Errors
[  167.367298] NVRM: Xid (PCI:0000:02:00): 13, Graphics Exception: ESR 0x50c648=0x3a000e 0x50c650=0x24 0x50c644=0x13eff2 0x50c64c=0x7f
[  167.367329] NVRM: Xid (PCI:0000:02:00): 13, Graphics SM Warp Exception on (GPC 2, TPC 0): Out Of Range Address
[  167.367332] NVRM: Xid (PCI:0000:02:00): 13, Graphics SM Global Exception on (GPC 2, TPC 0): Physical Multiple Warp Errors
[  167.367335] NVRM: Xid (PCI:0000:02:00): 13, Graphics Exception: ESR 0x514648=0x1e000e 0x514650=0x24 0x514644=0x13eff2 0x51464c=0x7f
[  167.367362] NVRM: Xid (PCI:0000:02:00): 13, Graphics Exception: ChID 0010, Class 0000a0c0, Offset 00001b0c, Data 00000000
[  167.709832] NVRM: GPU at PCI:0000:02:00: GPU-a503e5ff-3740-8318-878a-a21e528c646c
[  167.709836] NVRM: Xid (PCI:0000:02:00): 31, Ch 00000010, engmask 00000111, intr 10000000
[  168.045998] NVRM: GPU at PCI:0000:02:00: GPU-a503e5ff-3740-8318-878a-a21e528c646c
[  168.046003] NVRM: Xid (PCI:0000:02:00): 31, Ch 00000010, engmask 00000111, intr 10000000
[  168.407864] NVRM: GPU at PCI:0000:02:00: GPU-a503e5ff-3740-8318-878a-a21e528c646c
[  168.407869] NVRM: Xid (PCI:0000:02:00): 31, Ch 00000010, engmask 00000111, intr 10000000
[  168.752045] NVRM: GPU at PCI:0000:02:00: GPU-a503e5ff-3740-8318-878a-a21e528c646c
[  168.752049] NVRM: Xid (PCI:0000:02:00): 31, Ch 00000010, engmask 00000111, intr 10000000
[  169.110574] NVRM: GPU at PCI:0000:02:00: GPU-a503e5ff-3740-8318-878a-a21e528c646c
[  169.110578] NVRM: Xid (PCI:0000:02:00): 31, Ch 00000010, engmask 00000111, intr 10000000
[  169.479404] NVRM: GPU at PCI:0000:02:00: GPU-a503e5ff-3740-8318-878a-a21e528c646c
[  169.479408] NVRM: Xid (PCI:0000:02:00): 31, Ch 00000010, engmask 00000111, intr 10000000
[  169.819896] NVRM: GPU at PCI:0000:02:00: GPU-a503e5ff-3740-8318-878a-a21e528c646c
[  169.819900] NVRM: Xid (PCI:0000:02:00): 31, Ch 00000010, engmask 00000111, intr 10000000
[  529.780140] usb 2-1.6: USB disconnect, device number 4
[  530.008396] usb 2-1.6: new low-speed USB device number 7 using ehci-pci
[  530.105253] usb 2-1.6: New USB device found, idVendor=045e, idProduct=0084
[  530.105258] usb 2-1.6: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[  530.105260] usb 2-1.6: Product: Microsoft Basic Optical Mouse 
[  530.105263] usb 2-1.6: Manufacturer: Microsoft 
[  530.109030] input: Microsoft  Microsoft Basic Optical Mouse  as /devices/pci0000:00/0000:00:1d.0/usb2/2-1/2-1.6/2-1.6:1.0/0003:045E:0084.0003/input/input21
[  530.109415] hid-generic 0003:045E:0084.0003: input,hidraw1: USB HID v1.11 Mouse [Microsoft  Microsoft Basic Optical Mouse ] on usb-0000:00:1d.0-1.6/input0
[  790.118073] NVRM: GPU at PCI:0000:02:00: GPU-a503e5ff-3740-8318-878a-a21e528c646c
[  790.118077] NVRM: Xid (PCI:0000:02:00): 31, Ch 00000018, engmask 00000101, intr 10000000
[  790.213339] NVRM: Xid (PCI:0000:02:00): 31, Ch 00000018, engmask 00000101, intr 10000000
[  790.308416] NVRM: Xid (PCI:0000:02:00): 31, Ch 00000018, engmask 00000101, intr 10000000
[  790.421164] NVRM: Xid (PCI:0000:02:00): 31, Ch 00000018, engmask 00000101, intr 10000000
[  790.521354] NVRM: Xid (PCI:0000:02:00): 31, Ch 00000018, engmask 00000101, intr 10000000
[  790.620486] NVRM: Xid (PCI:0000:02:00): 31, Ch 00000018, engmask 00000101, intr 10000000
[  790.712321] NVRM: Xid (PCI:0000:02:00): 31, Ch 00000018, engmask 00000101, intr 10000000
[  790.808216] NVRM: Xid (PCI:0000:02:00): 31, Ch 00000018, engmask 00000101, intr 10000000

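Whenever I try to put some significant load on the system (e.g. building a project from source), everything freezes or lags severely, usually including the UI. This did not use to happen.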

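On the other hand, I also recently ran apt-get upgrade, which brought in a new version of the Linux kernel (whatever the distribution provides, not a custom one).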

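My questions are:

  • Is what I'm seeing necessarily a hardware problem?
  • If so, what could the problem be?
  • If not, which parts of the OS or user applications might be responsible?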

Answer 1

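This may be an old question, but in case anyone runs into a similar problem, I suggest looking up the Xid error messages in the Xid documentation on the Nvidia website.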

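For example, you can see there that error numbers 13 and 31 are related to user applications. The software you are using may be handling memory usage or access incorrectly.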
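
Before looking the codes up, it can help to see which Xid numbers appear in the log and how often. Below is a minimal sketch in Python, assuming the log lines have the same format as the ones pasted in the question. The names `count_xids`, `XID_RE` and `SAMPLE` are made up for this illustration and are not part of any NVIDIA tool.

```python
import re
from collections import Counter

# Matches driver messages such as:
# [  167.709836] NVRM: Xid (PCI:0000:02:00): 31, Ch 00000010, ...
# Group 1 is the PCI address, group 2 is the Xid code.
XID_RE = re.compile(r"NVRM: Xid \((PCI:[0-9a-fA-F:.]+)\): (\d+),")


def count_xids(log_text):
    """Count occurrences of each (pci_address, xid_code) pair in the log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = XID_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


# A few lines copied from the question, used here as sample input.
SAMPLE = """\
[  167.367247] NVRM: Xid (PCI:0000:02:00): 13, Graphics SM Warp Exception on (GPC 0, TPC 0): Out Of Range Address
[  167.709836] NVRM: Xid (PCI:0000:02:00): 31, Ch 00000010, engmask 00000111, intr 10000000
[  168.046003] NVRM: Xid (PCI:0000:02:00): 31, Ch 00000010, engmask 00000111, intr 10000000
[  529.780140] usb 2-1.6: USB disconnect, device number 4
"""

for (pci, xid), n in sorted(count_xids(SAMPLE).items()):
    print(f"{pci}  Xid {xid}: {n} occurrence(s)")
```

To use it on a real system, replace `SAMPLE` with the saved output of dmesg. The counts only show which codes to look up; they do not say whether the cause is hardware or software.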
