GPU crashes randomly

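This week my GPU started crashing randomly. I was originally on PopOS, where the crashes began; I then switched to Manjaro, which also crashed, and now I am on Ubuntu, where it still crashes. One thing I noticed about the crashes: my Vega 56 has power-state indicators, one per state. While I am playing League of Legends it always stays in the first state, and at the moment of the crash the indicators show that the GPU has jumped to the 8th power state. When the GPU crashes I also get something like this: https://i.stack.imgur.com/89mVH.jpg. While I was on PopOS (with KDE Plasma), I found the following in the logs: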

Jan 11 11:53:23 pop-os kernel: [11220.052278] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:6 pasid:32777, for process plasmashell pid 2291 thread plasmashel:cs0 pid 2641)
Jan 11 11:53:23 pop-os kernel: [11220.052282] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x00000001059cb000 from client 27
Jan 11 11:53:23 pop-os kernel: [11220.052283] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00401031
Jan 11 11:53:23 pop-os kernel: [11220.052284] amdgpu 0000:03:00.0: amdgpu:       Faulty UTCL2 client ID: 0x8
Jan 11 11:53:23 pop-os kernel: [11220.052285] amdgpu 0000:03:00.0: amdgpu:       MORE_FAULTS: 0x1
Jan 11 11:53:23 pop-os kernel: [11220.052286] amdgpu 0000:03:00.0: amdgpu:       WALKER_ERROR: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052287] amdgpu 0000:03:00.0: amdgpu:       PERMISSION_FAULTS: 0x3
Jan 11 11:53:23 pop-os kernel: [11220.052288] amdgpu 0000:03:00.0: amdgpu:       MAPPING_ERROR: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052289] amdgpu 0000:03:00.0: amdgpu:       RW: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052295] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:6 pasid:32777, for process plasmashell pid 2291 thread plasmashel:cs0 pid 2641)
Jan 11 11:53:23 pop-os kernel: [11220.052296] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x00000001059d1000 from client 27
Jan 11 11:53:23 pop-os kernel: [11220.052297] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jan 11 11:53:23 pop-os kernel: [11220.052298] amdgpu 0000:03:00.0: amdgpu:       Faulty UTCL2 client ID: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052299] amdgpu 0000:03:00.0: amdgpu:       MORE_FAULTS: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052300] amdgpu 0000:03:00.0: amdgpu:       WALKER_ERROR: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052301] amdgpu 0000:03:00.0: amdgpu:       PERMISSION_FAULTS: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052302] amdgpu 0000:03:00.0: amdgpu:       MAPPING_ERROR: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052303] amdgpu 0000:03:00.0: amdgpu:       RW: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052308] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:6 pasid:32777, for process plasmashell pid 2291 thread plasmashel:cs0 pid 2641)
Jan 11 11:53:23 pop-os kernel: [11220.052309] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x00000001059c2000 from client 27
Jan 11 11:53:23 pop-os kernel: [11220.052310] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jan 11 11:53:23 pop-os kernel: [11220.052311] amdgpu 0000:03:00.0: amdgpu:       Faulty UTCL2 client ID: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052312] amdgpu 0000:03:00.0: amdgpu:       MORE_FAULTS: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052312] amdgpu 0000:03:00.0: amdgpu:       WALKER_ERROR: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052313] amdgpu 0000:03:00.0: amdgpu:       PERMISSION_FAULTS: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052314] amdgpu 0000:03:00.0: amdgpu:       MAPPING_ERROR: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052315] amdgpu 0000:03:00.0: amdgpu:       RW: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052320] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:6 pasid:32777, for process plasmashell pid 2291 thread plasmashel:cs0 pid 2641)
Jan 11 11:53:23 pop-os kernel: [11220.052321] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x00000001059c0000 from client 27
Jan 11 11:53:23 pop-os kernel: [11220.052322] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jan 11 11:53:23 pop-os kernel: [11220.052323] amdgpu 0000:03:00.0: amdgpu:       Faulty UTCL2 client ID: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052328] amdgpu 0000:03:00.0: amdgpu:       MORE_FAULTS: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052329] amdgpu 0000:03:00.0: amdgpu:       WALKER_ERROR: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052330] amdgpu 0000:03:00.0: amdgpu:       PERMISSION_FAULTS: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052331] amdgpu 0000:03:00.0: amdgpu:       MAPPING_ERROR: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052332] amdgpu 0000:03:00.0: amdgpu:       RW: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052338] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:6 pasid:32777, for process plasmashell pid 2291 thread plasmashel:cs0 pid 2641)
Jan 11 11:53:23 pop-os kernel: [11220.052339] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x00000001059c1000 from client 27
Jan 11 11:53:23 pop-os kernel: [11220.052340] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00601031
Jan 11 11:53:23 pop-os kernel: [11220.052340] amdgpu 0000:03:00.0: amdgpu:       Faulty UTCL2 client ID: 0x8
Jan 11 11:53:23 pop-os kernel: [11220.052341] amdgpu 0000:03:00.0: amdgpu:       MORE_FAULTS: 0x1
Jan 11 11:53:23 pop-os kernel: [11220.052342] amdgpu 0000:03:00.0: amdgpu:       WALKER_ERROR: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052343] amdgpu 0000:03:00.0: amdgpu:       PERMISSION_FAULTS: 0x3
Jan 11 11:53:23 pop-os kernel: [11220.052344] amdgpu 0000:03:00.0: amdgpu:       MAPPING_ERROR: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052345] amdgpu 0000:03:00.0: amdgpu:       RW: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052351] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:6 pasid:32777, for process plasmashell pid 2291 thread plasmashel:cs0 pid 2641)
Jan 11 11:53:23 pop-os kernel: [11220.052352] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x00000001059cc000 from client 27
Jan 11 11:53:23 pop-os kernel: [11220.052353] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jan 11 11:53:23 pop-os kernel: [11220.052354] amdgpu 0000:03:00.0: amdgpu:       Faulty UTCL2 client ID: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052355] amdgpu 0000:03:00.0: amdgpu:       MORE_FAULTS: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052356] amdgpu 0000:03:00.0: amdgpu:       WALKER_ERROR: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052357] amdgpu 0000:03:00.0: amdgpu:       PERMISSION_FAULTS: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052357] amdgpu 0000:03:00.0: amdgpu:       MAPPING_ERROR: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052358] amdgpu 0000:03:00.0: amdgpu:       RW: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052376] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:6 pasid:32777, for process plasmashell pid 2291 thread plasmashel:cs0 pid 2641)
Jan 11 11:53:23 pop-os kernel: [11220.052377] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x00000001059c5000 from client 27
Jan 11 11:53:23 pop-os kernel: [11220.052379] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00601031
Jan 11 11:53:23 pop-os kernel: [11220.052380] amdgpu 0000:03:00.0: amdgpu:       Faulty UTCL2 client ID: 0x8
Jan 11 11:53:23 pop-os kernel: [11220.052381] amdgpu 0000:03:00.0: amdgpu:       MORE_FAULTS: 0x1
Jan 11 11:53:23 pop-os kernel: [11220.052382] amdgpu 0000:03:00.0: amdgpu:       WALKER_ERROR: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052384] amdgpu 0000:03:00.0: amdgpu:       PERMISSION_FAULTS: 0x3
Jan 11 11:53:23 pop-os kernel: [11220.052385] amdgpu 0000:03:00.0: amdgpu:       MAPPING_ERROR: 0x0
Jan 11 11:53:23 pop-os kernel: [11220.052386] amdgpu 0000:03:00.0: amdgpu:       RW: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442256] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32788, for process chromium pid 9297 thread chromium:cs0 pid 9303)
Jan 11 11:53:30 pop-os kernel: [11226.442259] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000000116f7c000 from client 27
Jan 11 11:53:30 pop-os kernel: [11226.442260] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00601031
Jan 11 11:53:30 pop-os kernel: [11226.442261] amdgpu 0000:03:00.0: amdgpu:       Faulty UTCL2 client ID: 0x8
Jan 11 11:53:30 pop-os kernel: [11226.442262] amdgpu 0000:03:00.0: amdgpu:       MORE_FAULTS: 0x1
Jan 11 11:53:30 pop-os kernel: [11226.442263] amdgpu 0000:03:00.0: amdgpu:       WALKER_ERROR: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442264] amdgpu 0000:03:00.0: amdgpu:       PERMISSION_FAULTS: 0x3
Jan 11 11:53:30 pop-os kernel: [11226.442265] amdgpu 0000:03:00.0: amdgpu:       MAPPING_ERROR: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442266] amdgpu 0000:03:00.0: amdgpu:       RW: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442273] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32788, for process chromium pid 9297 thread chromium:cs0 pid 9303)
Jan 11 11:53:30 pop-os kernel: [11226.442274] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000000116f8b000 from client 27
Jan 11 11:53:30 pop-os kernel: [11226.442275] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jan 11 11:53:30 pop-os kernel: [11226.442276] amdgpu 0000:03:00.0: amdgpu:       Faulty UTCL2 client ID: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442277] amdgpu 0000:03:00.0: amdgpu:       MORE_FAULTS: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442283] amdgpu 0000:03:00.0: amdgpu:       WALKER_ERROR: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442283] amdgpu 0000:03:00.0: amdgpu:       PERMISSION_FAULTS: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442284] amdgpu 0000:03:00.0: amdgpu:       MAPPING_ERROR: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442285] amdgpu 0000:03:00.0: amdgpu:       RW: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442291] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32788, for process chromium pid 9297 thread chromium:cs0 pid 9303)
Jan 11 11:53:30 pop-os kernel: [11226.442292] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000000116f89000 from client 27
Jan 11 11:53:30 pop-os kernel: [11226.442293] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Jan 11 11:53:30 pop-os kernel: [11226.442294] amdgpu 0000:03:00.0: amdgpu:       Faulty UTCL2 client ID: 0x8
Jan 11 11:53:30 pop-os kernel: [11226.442295] amdgpu 0000:03:00.0: amdgpu:       MORE_FAULTS: 0x1
Jan 11 11:53:30 pop-os kernel: [11226.442296] amdgpu 0000:03:00.0: amdgpu:       WALKER_ERROR: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442297] amdgpu 0000:03:00.0: amdgpu:       PERMISSION_FAULTS: 0x3
Jan 11 11:53:30 pop-os kernel: [11226.442298] amdgpu 0000:03:00.0: amdgpu:       MAPPING_ERROR: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442299] amdgpu 0000:03:00.0: amdgpu:       RW: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442304] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32788, for process chromium pid 9297 thread chromium:cs0 pid 9303)
Jan 11 11:53:30 pop-os kernel: [11226.442308] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000000116f68000 from client 27
Jan 11 11:53:30 pop-os kernel: [11226.442309] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jan 11 11:53:30 pop-os kernel: [11226.442310] amdgpu 0000:03:00.0: amdgpu:       Faulty UTCL2 client ID: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442311] amdgpu 0000:03:00.0: amdgpu:       MORE_FAULTS: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442312] amdgpu 0000:03:00.0: amdgpu:       WALKER_ERROR: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442313] amdgpu 0000:03:00.0: amdgpu:       PERMISSION_FAULTS: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442314] amdgpu 0000:03:00.0: amdgpu:       MAPPING_ERROR: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442314] amdgpu 0000:03:00.0: amdgpu:       RW: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442320] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32788, for process chromium pid 9297 thread chromium:cs0 pid 9303)
Jan 11 11:53:30 pop-os kernel: [11226.442321] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000000116f3e000 from client 27
Jan 11 11:53:30 pop-os kernel: [11226.442322] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Jan 11 11:53:30 pop-os kernel: [11226.442323] amdgpu 0000:03:00.0: amdgpu:       Faulty UTCL2 client ID: 0x8
Jan 11 11:53:30 pop-os kernel: [11226.442324] amdgpu 0000:03:00.0: amdgpu:       MORE_FAULTS: 0x1
Jan 11 11:53:30 pop-os kernel: [11226.442325] amdgpu 0000:03:00.0: amdgpu:       WALKER_ERROR: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442326] amdgpu 0000:03:00.0: amdgpu:       PERMISSION_FAULTS: 0x3
Jan 11 11:53:30 pop-os kernel: [11226.442326] amdgpu 0000:03:00.0: amdgpu:       MAPPING_ERROR: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442327] amdgpu 0000:03:00.0: amdgpu:       RW: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442332] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32788, for process chromium pid 9297 thread chromium:cs0 pid 9303)
Jan 11 11:53:30 pop-os kernel: [11226.442334] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000000116faa000 from client 27
Jan 11 11:53:30 pop-os kernel: [11226.442334] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jan 11 11:53:30 pop-os kernel: [11226.442335] amdgpu 0000:03:00.0: amdgpu:       Faulty UTCL2 client ID: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442336] amdgpu 0000:03:00.0: amdgpu:       MORE_FAULTS: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442337] amdgpu 0000:03:00.0: amdgpu:       WALKER_ERROR: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442338] amdgpu 0000:03:00.0: amdgpu:       PERMISSION_FAULTS: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442339] amdgpu 0000:03:00.0: amdgpu:       MAPPING_ERROR: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442340] amdgpu 0000:03:00.0: amdgpu:       RW: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442345] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32788, for process chromium pid 9297 thread chromium:cs0 pid 9303)
Jan 11 11:53:30 pop-os kernel: [11226.442346] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000000116f81000 from client 27
Jan 11 11:53:30 pop-os kernel: [11226.442347] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jan 11 11:53:30 pop-os kernel: [11226.442348] amdgpu 0000:03:00.0: amdgpu:       Faulty UTCL2 client ID: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442349] amdgpu 0000:03:00.0: amdgpu:       MORE_FAULTS: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442353] amdgpu 0000:03:00.0: amdgpu:       WALKER_ERROR: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442354] amdgpu 0000:03:00.0: amdgpu:       PERMISSION_FAULTS: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442355] amdgpu 0000:03:00.0: amdgpu:       MAPPING_ERROR: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442355] amdgpu 0000:03:00.0: amdgpu:       RW: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442361] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32788, for process chromium pid 9297 thread chromium:cs0 pid 9303)
Jan 11 11:53:30 pop-os kernel: [11226.442362] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000000116f74000 from client 27
Jan 11 11:53:30 pop-os kernel: [11226.442363] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Jan 11 11:53:30 pop-os kernel: [11226.442364] amdgpu 0000:03:00.0: amdgpu:       Faulty UTCL2 client ID: 0x8
Jan 11 11:53:30 pop-os kernel: [11226.442365] amdgpu 0000:03:00.0: amdgpu:       MORE_FAULTS: 0x1
Jan 11 11:53:30 pop-os kernel: [11226.442366] amdgpu 0000:03:00.0: amdgpu:       WALKER_ERROR: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442366] amdgpu 0000:03:00.0: amdgpu:       PERMISSION_FAULTS: 0x3
Jan 11 11:53:30 pop-os kernel: [11226.442367] amdgpu 0000:03:00.0: amdgpu:       MAPPING_ERROR: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442368] amdgpu 0000:03:00.0: amdgpu:       RW: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442373] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32788, for process chromium pid 9297 thread chromium:cs0 pid 9303)
Jan 11 11:53:30 pop-os kernel: [11226.442374] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000000116f79000 from client 27
Jan 11 11:53:30 pop-os kernel: [11226.442375] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jan 11 11:53:30 pop-os kernel: [11226.442376] amdgpu 0000:03:00.0: amdgpu:       Faulty UTCL2 client ID: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442377] amdgpu 0000:03:00.0: amdgpu:       MORE_FAULTS: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442378] amdgpu 0000:03:00.0: amdgpu:       WALKER_ERROR: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442379] amdgpu 0000:03:00.0: amdgpu:       PERMISSION_FAULTS: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442380] amdgpu 0000:03:00.0: amdgpu:       MAPPING_ERROR: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442381] amdgpu 0000:03:00.0: amdgpu:       RW: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442388] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32788, for process chromium pid 9297 thread chromium:cs0 pid 9303)
Jan 11 11:53:30 pop-os kernel: [11226.442389] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000000116f2a000 from client 27
Jan 11 11:53:30 pop-os kernel: [11226.442390] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Jan 11 11:53:30 pop-os kernel: [11226.442391] amdgpu 0000:03:00.0: amdgpu:       Faulty UTCL2 client ID: 0x8
Jan 11 11:53:30 pop-os kernel: [11226.442392] amdgpu 0000:03:00.0: amdgpu:       MORE_FAULTS: 0x1
Jan 11 11:53:30 pop-os kernel: [11226.442393] amdgpu 0000:03:00.0: amdgpu:       WALKER_ERROR: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442394] amdgpu 0000:03:00.0: amdgpu:       PERMISSION_FAULTS: 0x3
Jan 11 11:53:30 pop-os kernel: [11226.442395] amdgpu 0000:03:00.0: amdgpu:       MAPPING_ERROR: 0x0
Jan 11 11:53:30 pop-os kernel: [11226.442396] amdgpu 0000:03:00.0: amdgpu:       RW: 0x0
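
To make the raw VM_L2_PROTECTION_FAULT_STATUS values in the log above easier to compare, here is a minimal Python sketch that unpacks them into the same fields the kernel prints underneath each one. The bit layout in `FIELDS` is an assumption (a GFX9 / Vega 10 style layout, not taken from this post); it is only checked against the decoded lines in the log, where 0x00401031 is shown with client ID 0x8, MORE_FAULTS 0x1 and PERMISSION_FAULTS 0x3. The function name `decode_fault_status` is made up for illustration.

```python
# Sketch: decode VM_L2_PROTECTION_FAULT_STATUS into the fields shown in the log.
# ASSUMPTION: (shift, width) values follow a GFX9 / Vega 10 style layout.
# They reproduce the decoded lines in the log above, but are not authoritative.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CLIENT_ID": (9, 9),  # printed as "Faulty UTCL2 client ID"
    "RW": (18, 1),
}


def decode_fault_status(status):
    """Return a dict mapping each field name to its value in `status`."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


if __name__ == "__main__":
    # The distinct status values that appear in the log above.
    for raw in (0x00401031, 0x00601031, 0x00101031, 0x00000000):
        print(f"{raw:#010x}", decode_fault_status(raw))
```

Under this layout the three non-zero values decode to the same six fields and differ only in bits above bit 18, which this sketch does not interpret.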

I then installed Ubuntu 20.04 LTS (with Cinnamon), and today I had my first crash on it. So I decided to check the logs; these are the last entries before the crash:

Jan 12 21:00:51 mk kernel: [24796.635550] kauditd_printk_skb: 187 callbacks suppressed
Jan 12 21:00:51 mk kernel: [24796.635552] audit: type=1400 audit(1610478051.000:397772): apparmor="DENIED" operation="ptrace" profile="snap.discord.discord" pid=21518 comm="Discord" requested_mask="read" denied_mask="read" peer="unconfined"
Jan 12 21:00:51 mk kernel: [24796.635584] audit: type=1400 audit(1610478051.000:397773): apparmor="DENIED" operation="ptrace" profile="snap.discord.discord" pid=21518 comm="Discord" requested_mask="read" denied_mask="read" peer="unconfined"
Jan 12 21:00:51 mk kernel: [24796.635615] audit: type=1400 audit(1610478051.000:397774): apparmor="DENIED" operation="ptrace" profile="snap.discord.discord" pid=21518 comm="Discord" requested_mask="read" denied_mask="read" peer="unconfined"
Jan 12 21:00:51 mk kernel: [24796.635644] audit: type=1400 audit(1610478051.000:397775): apparmor="DENIED" operation="ptrace" profile="snap.discord.discord" pid=21518 comm="Discord" requested_mask="read" denied_mask="read" peer="unconfined"
Jan 12 21:00:51 mk kernel: [24796.635673] audit: type=1400 audit(1610478051.000:397776): apparmor="DENIED" operation="ptrace" profile="snap.discord.discord" pid=21518 comm="Discord" requested_mask="read" denied_mask="read" peer="unconfined"
Jan 12 21:00:51 mk kernel: [24796.635702] audit: type=1400 audit(1610478051.000:397777): apparmor="DENIED" operation="ptrace" profile="snap.discord.discord" pid=21518 comm="Discord" requested_mask="read" denied_mask="read" peer="unconfined"
Jan 12 21:00:51 mk kernel: [24796.635731] audit: type=1400 audit(1610478051.000:397778): apparmor="DENIED" operation="ptrace" profile="snap.discord.discord" pid=21518 comm="Discord" requested_mask="read" denied_mask="read" peer="unconfined"
Jan 12 21:00:51 mk kernel: [24796.635760] audit: type=1400 audit(1610478051.000:397779): apparmor="DENIED" operation="ptrace" profile="snap.discord.discord" pid=21518 comm="Discord" requested_mask="read" denied_mask="read" peer="unconfined"
Jan 12 21:00:51 mk kernel: [24796.635789] audit: type=1400 audit(1610478051.000:397780): apparmor="DENIED" operation="ptrace" profile="snap.discord.discord" pid=21518 comm="Discord" requested_mask="read" denied_mask="read" peer="unconfined"
Jan 12 21:00:51 mk kernel: [24796.635818] audit: type=1400 audit(1610478051.000:397781): apparmor="DENIED" operation="ptrace" profile="snap.discord.discord" pid=21518 comm="Discord" requested_mask="read" denied_mask="read" peer="unconfined"
Jan 12 21:01:01 mk kernel: [24806.635618] kauditd_printk_skb: 156 callbacks suppressed
Jan 12 21:01:01 mk kernel: [24806.635619] audit: type=1400 audit(1610478061.000:397938): apparmor="DENIED" operation="ptrace" profile="snap.discord.discord" pid=21518 comm="Discord" requested_mask="read" denied_mask="read" peer="unconfined"
Jan 12 21:01:01 mk kernel: [24806.636813] audit: audit_backlog=65 > audit_backlog_limit=64
Jan 12 21:01:01 mk kernel: [24806.636814] audit: audit_lost=10834 audit_rate_limit=0 audit_backlog_limit=64
Jan 12 21:01:01 mk kernel: [24806.636815] audit: backlog limit exceeded
Jan 12 21:01:01 mk kernel: [24806.636833] audit: audit_backlog=65 > audit_backlog_limit=64
Jan 12 21:01:01 mk kernel: [24806.636834] audit: audit_lost=10835 audit_rate_limit=0 audit_backlog_limit=64
Jan 12 21:01:01 mk kernel: [24806.636834] audit: backlog limit exceeded
Jan 12 21:01:01 mk kernel: [24806.636854] audit: audit_backlog=65 > audit_backlog_limit=64
Jan 12 21:01:01 mk kernel: [24806.636855] audit: audit_lost=10836 audit_rate_limit=0 audit_backlog_limit=64
Jan 12 21:01:01 mk kernel: [24806.636855] audit: backlog limit exceeded
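
The entries above are all AppArmor audit messages for the Discord snap, so any GPU-related lines would be easy to miss among them. Below is a small Python sketch for filtering a saved kernel log down to lines that mention amdgpu or drm while dropping audit lines. The helper name `gpu_lines` and the idea of passing the log file path on the command line are assumptions for illustration; where the kernel log lives depends on the distribution.

```python
# Sketch: keep only GPU-related kernel log lines and drop audit noise.
import sys
from typing import List


def gpu_lines(log_text: str) -> List[str]:
    """Return lines mentioning amdgpu or drm, skipping audit messages."""
    kept = []
    for line in log_text.splitlines():
        lowered = line.lower()
        if "audit" in lowered:
            continue  # AppArmor / kauditd messages, not from the GPU driver
        if "amdgpu" in lowered or "drm" in lowered:
            kept.append(line)
    return kept


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (path is an assumption): python3 gpu_lines.py /path/to/kernel.log
    with open(sys.argv[1], errors="replace") as handle:
        for line in gpu_lines(handle.read()):
            print(line)
```

If this prints nothing for the time window around a crash, the log simply has no driver messages for it, which is what the Ubuntu excerpt above suggests.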

Here are my Linux and PC specs:

vega@mk:~$ neofetch 
vega@mk 

OS: Ubuntu 20.04.1 LTS x86_64 
Kernel: 5.8.0-36-generic 
Uptime: 1 hour, 1 min 
Packages: 2612 (dpkg), 8 (snap) 
Shell: bash 5.0.17 
Resolution: 2560x1440 
DE: Cinnamon 
WM: Mutter (Muffin) 
WM Theme: Adapta-Nokto (Adapta-Nokto) 
Theme: Adapta-Nokto [GTK2/3] 
Icons: LoginIcons [GTK2/3] 
Terminal: konsole 
CPU: Intel i5-3450 (4) @ 3.500GHz 
GPU: AMD ATI Radeon RX Vega 56/64 
Memory: 6198MiB / 15961MiB                                               

I also ran a quick stress test, and the results look completely normal.

vega@mk:~$ glmark2 
=======================================================
    glmark2 2014.03+git20150611.fa71af2d
=======================================================
    OpenGL Information
    GL_VENDOR:     X.Org
    GL_RENDERER:   Radeon RX Vega (VEGA10, DRM 3.38.0, 5.8.0-36-generic, LLVM 11.0.0)
    GL_VERSION:    4.6 (Compatibility Profile) Mesa 20.2.6
=======================================================
[build] use-vbo=false: FPS: 4136 FrameTime: 0.242 ms
[build] use-vbo=true: FPS: 5782 FrameTime: 0.173 ms
[texture] texture-filter=nearest: FPS: 5805 FrameTime: 0.172 ms
[texture] texture-filter=linear: FPS: 5880 FrameTime: 0.170 ms
[texture] texture-filter=mipmap: FPS: 5955 FrameTime: 0.168 ms
[shading] shading=gouraud: FPS: 5993 FrameTime: 0.167 ms
[shading] shading=blinn-phong-inf: FPS: 6120 FrameTime: 0.163 ms
[shading] shading=phong: FPS: 5884 FrameTime: 0.170 ms
[shading] shading=cel: FPS: 5929 FrameTime: 0.169 ms
[bump] bump-render=high-poly: FPS: 6105 FrameTime: 0.164 ms
[bump] bump-render=normals: FPS: 5822 FrameTime: 0.172 ms
[bump] bump-render=height: FPS: 5890 FrameTime: 0.170 ms
[effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 5967 FrameTime: 0.168 ms
[effect2d] kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;: FPS: 6346 FrameTime: 0.158 ms
[pulsar] light=false:quads=5:texture=false: FPS: 5397 FrameTime: 0.185 ms
[desktop] blur-radius=5:effect=blur:passes=1:separable=true:windows=4: FPS: 3999 FrameTime: 0.250 ms
[desktop] effect=shadow:windows=4: FPS: 3995 FrameTime: 0.250 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 715 FrameTime: 1.399 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=subdata: FPS: 904 FrameTime: 1.106 ms
[buffer] columns=200:interleave=true:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 772 FrameTime: 1.295 ms
[ideas] speed=duration: FPS: 2640 FrameTime: 0.379 ms
[jellyfish] <default>: FPS: 5266 FrameTime: 0.190 ms
[terrain] <default>: FPS: 1935 FrameTime: 0.517 ms
[shadow] <default>: FPS: 4722 FrameTime: 0.212 ms
[refract] <default>: FPS: 4051 FrameTime: 0.247 ms
[conditionals] fragment-steps=0:vertex-steps=0: FPS: 6069 FrameTime: 0.165 ms
[conditionals] fragment-steps=5:vertex-steps=0: FPS: 5996 FrameTime: 0.167 ms
[conditionals] fragment-steps=0:vertex-steps=5: FPS: 6211 FrameTime: 0.161 ms
[function] fragment-complexity=low:fragment-steps=5: FPS: 6077 FrameTime: 0.165 ms
[function] fragment-complexity=medium:fragment-steps=5: FPS: 6268 FrameTime: 0.160 ms
[loop] fragment-loop=false:fragment-steps=5:vertex-steps=5: FPS: 6119 FrameTime: 0.163 ms
[loop] fragment-steps=5:fragment-uniform=false:vertex-steps=5: FPS: 6091 FrameTime: 0.164 ms
[loop] fragment-steps=5:fragment-uniform=true:vertex-steps=5: FPS: 6074 FrameTime: 0.165 ms
=======================================================
                                  glmark2 Score: 4997 
=======================================================
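
Since the crash seems to coincide with the power-state indicator jumping to the 8th state, it might help to record which power state the driver reports while a test like the one above is running. The sketch below parses the amdgpu `pp_dpm_sclk` sysfs file, where the active level is marked with a trailing `*`. The `card0` path is an assumption (the card index can differ), the function name `active_state` is made up, and the clock values in the comment are hypothetical sample data, not readings from this machine.

```python
# Sketch: report the active shader-clock DPM level exposed by amdgpu.
from pathlib import Path

# ASSUMPTION: the Vega 56 is card0; adjust the index if it is not.
SCLK_PATH = Path("/sys/class/drm/card0/device/pp_dpm_sclk")


def active_state(pp_dpm_text):
    """Return (level_index, clock_string) for the line ending in '*', or None.

    Expected input looks like (hypothetical values):
        0: 852Mhz
        1: 991Mhz *
    """
    for line in pp_dpm_text.splitlines():
        line = line.strip()
        if line.endswith("*"):
            index, _, clock = line.rstrip("*").partition(":")
            return int(index), clock.strip()
    return None


if __name__ == "__main__":
    if SCLK_PATH.exists():
        print(active_state(SCLK_PATH.read_text()))
    else:
        print(f"{SCLK_PATH} not found; is this the right card index?")
```

Running it in a loop during a game session and noting the last level printed before a crash would show whether the driver also reports the jump that the indicators show.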

Does anyone know why this is happening and how I can fix it? Thanks a lot!
