Ubuntu 16 kernel BUG ("Oops: 0000 [#1] SMP") related to amdgpu

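I recently installed Ubuntu 16.04 on my personal laptop (Dell Inspiron 5548), and every time I try to log out of any user account I get a kernel BUG (an "Oops" message) related to amdgpu.

I have tried both upgrading from 14 LTS and doing a full install from a 16.04 image on a USB drive. I also ran Ubuntu 15 on this computer for a long time without any similar problem.

I have read that this may be related to the new AMD graphics driver that replaces fglrx and only supports the newest cards. However, I installed Ubuntu 16 on a Dell Studio 1458 (I don't remember the graphics card now, but it is also a Radeon) and it runs fine.

The BUG report is below. Does anyone know how to fix this?

Edit: my graphics card is an AMD Radeon™ HD R7 M265.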

May  3 10:57:58 ubuntu-5548 kernel: [  329.916153] [drm] PCIE GART of 2048M enabled (table at 0x0000000000040000).
May  3 10:57:58 ubuntu-5548 kernel: [  330.155529] [drm] ring test on 0 succeeded in 15 usecs
May  3 10:57:58 ubuntu-5548 kernel: [  330.155722] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: cp failed to lock ring 1 (-2).
May  3 10:57:58 ubuntu-5548 kernel: [  330.155736] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: cp failed to lock ring 2 (-2).
May  3 10:57:58 ubuntu-5548 kernel: [  330.155747] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: cp failed to lock ring 3 (-2).
May  3 10:57:58 ubuntu-5548 kernel: [  330.155757] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: cp failed to lock ring 4 (-2).
May  3 10:57:58 ubuntu-5548 kernel: [  330.155766] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: cp failed to lock ring 5 (-2).
May  3 10:57:58 ubuntu-5548 kernel: [  330.155775] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: cp failed to lock ring 6 (-2).
May  3 10:57:58 ubuntu-5548 kernel: [  330.155784] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: cp failed to lock ring 7 (-2).
May  3 10:57:58 ubuntu-5548 kernel: [  330.155793] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: cp failed to lock ring 8 (-2).
May  3 10:57:58 ubuntu-5548 kernel: [  330.155821] [drm] ring test on 9 succeeded in 7 usecs
May  3 10:57:58 ubuntu-5548 kernel: [  330.155837] [drm:sdma_v2_4_ring_test_ring [amdgpu]] *ERROR* amdgpu: dma failed to lock ring 10 (-2).
May  3 10:57:58 ubuntu-5548 kernel: [  330.155844] [drm:amdgpu_resume [amdgpu]] *ERROR* resume 5 failed -2
May  3 10:57:58 ubuntu-5548 kernel: [  330.155852] [drm:amdgpu_resume_kms [amdgpu]] *ERROR* amdgpu_resume failed (-2).
May  3 10:57:58 ubuntu-5548 acpid: client 984[0:0] has disconnected
May  3 10:57:58 ubuntu-5548 acpid: client connected from 3312[0:0]
May  3 10:57:58 ubuntu-5548 acpid: 1 client rule loaded
May  3 10:57:59 ubuntu-5548 kernel: [  330.329604] BUG: unable to handle kernel NULL pointer dereference at 0000000000000248
May  3 10:57:59 ubuntu-5548 kernel: [  330.329631] IP: [<ffffffffc0348ea2>] amdgpu_vm_grab_id+0x122/0x310 [amdgpu]
May  3 10:57:59 ubuntu-5548 kernel: [  330.329667] PGD 0 
May  3 10:57:59 ubuntu-5548 kernel: [  330.329674] Oops: 0000 [#1] SMP 
May  3 10:57:59 ubuntu-5548 kernel: [  330.329686] Modules linked in: drbg ansi_cprng ctr ccm rfcomm bnep rtsx_usb_sdmmc rtsx_usb_ms memstick rtsx_usb nls_iso8859_1 intel_rapl x86_pkg_temp_thermal intel_powerclamp arc4 coretemp kvm_intel kvm iwlmvm mac80211 irqbypass crct10dif_pclmul dell_wmi crc32_pclmul dell_laptop sparse_keymap dcdbas snd_hda_codec_hdmi dell_smm_hwmon aesni_intel aes_x86_64 lrw gf128mul glue_helper iwlwifi ablk_helper cryptd dell_led btusb btrtl btbcm input_leds btintel serio_raw snd_hda_codec_realtek bluetooth hid_multitouch snd_hda_codec_generic joydev cfg80211 snd_soc_rt5640 snd_hda_intel snd_soc_ssm4567 snd_soc_rl6231 snd_hda_codec lpc_ich elan_i2c snd_soc_core snd_hda_core snd_hwdep snd_compress ac97_bus snd_pcm_dmaengine snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer snd dw_dmac dw_dmac_core soundcore dell_rbtn snd_soc_sst_acpi shpchp mei_me i2c_designware_platform 8250_dw mei spi_pxa2xx_platform i2c_designware_core acpi_pad mac_hid uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core v4l2_common videodev media parport_pc ppdev lp parport autofs4 hid_generic usbhid amdkfd amd_iommu_v2 amdgpu i915 ttm psmouse i2c_algo_bit drm_kms_helper syscopyarea ahci sysfillrect libahci sysimgblt fb_sys_fops r8169 drm mii wmi video i2c_hid hid sdhci_acpi sdhci fjes
May  3 10:57:59 ubuntu-5548 kernel: [  330.330101] CPU: 0 PID: 163 Comm: gfx Not tainted 4.4.0-21-generic #37-Ubuntu
May  3 10:57:59 ubuntu-5548 kernel: [  330.330120] Hardware name: Dell Inc. Inspiron 5548/0YDTG3, BIOS A04 05/15/2015
May  3 10:57:59 ubuntu-5548 kernel: [  330.330140] task: ffff8804460e44c0 ti: ffff8804448dc000 task.ti: ffff8804448dc000
May  3 10:57:59 ubuntu-5548 kernel: [  330.330160] RIP: 0010:[<ffffffffc0348ea2>]  [<ffffffffc0348ea2>] amdgpu_vm_grab_id+0x122/0x310 [amdgpu]
May  3 10:57:59 ubuntu-5548 kernel: [  330.330197] RSP: 0018:ffff8804448dfce0  EFLAGS: 00010246
May  3 10:57:59 ubuntu-5548 kernel: [  330.330211] RAX: 0000000000000000 RBX: ffff880445530000 RCX: ffff88008eefd400
May  3 10:57:59 ubuntu-5548 kernel: [  330.330230] RDX: ffffffff81ef3cc0 RSI: ffff880445532d78 RDI: ffff880449cc3000
May  3 10:57:59 ubuntu-5548 kernel: [  330.330249] RBP: ffff8804448dfdb0 R08: ffff88008eefd400 R09: 000000018080004d
May  3 10:57:59 ubuntu-5548 kernel: [  330.330268] R10: ffff8803f30ee020 R11: 0000000000000004 R12: ffff880445532d78
May  3 10:57:59 ubuntu-5548 kernel: [  330.330286] R13: ffff880449cc3000 R14: ffff880445530838 R15: 0000000000000001
May  3 10:57:59 ubuntu-5548 kernel: [  330.330305] FS:  0000000000000000(0000) GS:ffff88045ec00000(0000) knlGS:0000000000000000
May  3 10:57:59 ubuntu-5548 kernel: [  330.330327] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May  3 10:57:59 ubuntu-5548 kernel: [  330.330342] CR2: 0000000000000248 CR3: 0000000002e0a000 CR4: 00000000003406f0
May  3 10:57:59 ubuntu-5548 kernel: [  330.330361] Stack:
May  3 10:57:59 ubuntu-5548 kernel: [  330.330367]  ffff88042a99dc48 0000000000000000 ffff88008eefd400 0000000000000000
May  3 10:57:59 ubuntu-5548 kernel: [  330.330390]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
May  3 10:57:59 ubuntu-5548 kernel: [  330.330412]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
May  3 10:57:59 ubuntu-5548 kernel: [  330.330435] Call Trace:
May  3 10:57:59 ubuntu-5548 kernel: [  330.330454]  [<ffffffffc034ace0>] amdgpu_ib_schedule+0x90/0x390 [amdgpu]
May  3 10:57:59 ubuntu-5548 kernel: [  330.330487]  [<ffffffffc03873b6>] amdgpu_sched_run_job+0x36/0x140 [amdgpu]
May  3 10:57:59 ubuntu-5548 kernel: [  330.330519]  [<ffffffffc0386bcf>] amd_sched_main+0x23f/0x400 [amdgpu]
May  3 10:57:59 ubuntu-5548 kernel: [  330.330538]  [<ffffffff810c3a10>] ? wake_atomic_t_function+0x60/0x60
May  3 10:57:59 ubuntu-5548 kernel: [  330.330567]  [<ffffffffc0386990>] ? amd_sched_entity_wakeup+0x70/0x70 [amdgpu]
May  3 10:57:59 ubuntu-5548 kernel: [  330.330587]  [<ffffffff810a0528>] kthread+0xd8/0xf0
May  3 10:57:59 ubuntu-5548 kernel: [  330.330601]  [<ffffffff810a0450>] ? kthread_create_on_node+0x1e0/0x1e0
May  3 10:57:59 ubuntu-5548 kernel: [  330.330621]  [<ffffffff8182488f>] ret_from_fork+0x3f/0x70
May  3 10:57:59 ubuntu-5548 kernel: [  330.330636]  [<ffffffff810a0450>] ? kthread_create_on_node+0x1e0/0x1e0
May  3 10:57:59 ubuntu-5548 kernel: [  330.330653] Code: c0 44 89 bc 85 48 ff ff ff 41 83 c7 01 44 39 bb 1c 09 00 00 76 4f 49 83 c6 10 4d 8b 6e f0 4d 85 ed 74 66 4c 89 ef e8 fe 2e ff ff <8b> b8 48 02 00 00 48 8b b4 fd 50 ff ff ff 48 85 f6 74 b2 41 8b 
May  3 10:57:59 ubuntu-5548 kernel: [  330.330757] RIP  [<ffffffffc0348ea2>] amdgpu_vm_grab_id+0x122/0x310 [amdgpu]
May  3 10:57:59 ubuntu-5548 kernel: [  330.330788]  RSP <ffff8804448dfce0>
May  3 10:57:59 ubuntu-5548 kernel: [  330.330797] CR2: 0000000000000248
May  3 10:57:59 ubuntu-5548 kernel: [  330.337194] ---[ end trace f4393c5763eacaf5 ]---
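
When filing or matching a bug report, the useful parts of a trace like the one above are the faulting function (the "IP:" line) and the call chain. The following is only a minimal sketch for pulling those out of a saved log; the function name `summarize_oops` and the regular expression are illustrative assumptions, not part of any kernel tooling, and the pattern only targets the 4.4-era format shown here.

```python
import re

# Matches one symbolized frame, e.g.
#   [<ffffffffc0348ea2>] amdgpu_vm_grab_id+0x122/0x310 [amdgpu]
# A leading "? " marks an address the kernel found on the stack but could
# not confirm as part of the actual call chain.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<guess>\? )?(?P<func>[\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<module>\w+)\])?"
)


def summarize_oops(log_text):
    """Return the faulting function and call-trace frames of the first oops.

    Hypothetical helper: 'fault' is (function, module) or None, 'frames' is
    a list of (function, module, reliable) tuples in trace order.
    """
    fault = None
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if fault is None and " IP: " in line and match:
            fault = (match.group("func"), match.group("module"))
        elif "Call Trace:" in line:
            in_trace = True
        elif in_trace:
            if match is None:
                break  # the first non-frame line (e.g. "Code:") ends the trace
            frames.append(
                (match.group("func"), match.group("module"),
                 match.group("guess") is None)
            )
    return {"fault": fault, "frames": frames}
```

Given the log above saved to a file, `summarize_oops(open(path).read())` should report `amdgpu_vm_grab_id` in the `amdgpu` module as the faulting function, which is the detail worth searching for in existing bug reports.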

Answer 1

Your symptoms match the bug reported here:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1579374

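If you want to draw more attention from the maintainers, you can log in to Launchpad and click the "it affects me too" link on the bug above.

Until the bug is fixed in the Ubuntu kernel packages, try the 4.6 upstream kernel packages suggested by Renê Barbosa. You can download them here: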

http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.6-rc7-wily/

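Download and install the following files:

  • linux-headers (the architecture-independent `_all.deb` package)
  • linux-headers (the `-generic` `_amd64.deb` package)
  • linux-image-4.6.0-040600rc7-generic_4.6.0-040600rc7.201605081830_amd64.deb

(Sorry, I can't post direct links to the files because of the Ask Ubuntu reputation limit.)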

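Answer 2

Same problem, with the same graphics card.

Edit: I fixed this by installing the latest 4.6 kernel packages from the mainline repository. It looks like something needs to be backported to Ubuntu 16.04's default 4.4 kernel.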
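
To see which side of that line a given machine is on, the running kernel's release string can be compared against 4.6. This is a rough sketch only: `FIXED_IN`, `kernel_series`, and `may_be_affected` are hypothetical names, and the 4.6 threshold is an assumption taken from the reports in this thread. It cannot tell whether a later Ubuntu 4.4 build carries a backported fix.

```python
import platform
import re

# Assumption: 4.6 is the first kernel series reported as working in this thread.
FIXED_IN = (4, 6)


def kernel_series(release):
    """Extract (major, minor) from a release string such as '4.4.0-21-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def may_be_affected(release=None):
    """Return True if the kernel series is older than FIXED_IN.

    With no argument, the release of the running kernel is used.
    """
    if release is None:
        release = platform.release()
    return kernel_series(release) < FIXED_IN
```

Calling `may_be_affected()` with no argument checks the running kernel; passing `"4.4.0-21-generic"`, the release from the trace in the question, returns `True`.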

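Answer 3

I see that you upgraded the kernel and that solved your problem. I have the same specs and upgraded to kernel 4.6, and now DRI_PRIME=1 glxgears works fine.

... but running games with DRI_PRIME=1 gives me the same (poor) results as running them on the integrated card. Is this normal, or am I missing something?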
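
One way to narrow this down is to check whether DRI_PRIME=1 actually changes the OpenGL renderer a program sees. The following is a minimal sketch, assuming `glxinfo` (from the mesa-utils package) is installed and a graphical session is running; `parse_renderer` and `renderer_for` are hypothetical helper names.

```python
import os
import subprocess


def parse_renderer(glxinfo_output):
    """Return the value of the 'OpenGL renderer string' line, or None if absent."""
    for line in glxinfo_output.splitlines():
        if line.startswith("OpenGL renderer string:"):
            return line.split(":", 1)[1].strip()
    return None


def renderer_for(dri_prime):
    """Run glxinfo with DRI_PRIME set to the given value and return the renderer."""
    env = dict(os.environ, DRI_PRIME=str(dri_prime))
    result = subprocess.run(
        ["glxinfo"],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode the output as text
        check=True,               # raise if glxinfo exits with an error
    )
    return parse_renderer(result.stdout)
```

If `renderer_for(0)` and `renderer_for(1)` return the same string, the discrete GPU is probably not being selected at all. If they differ, the offloading itself works, and the poor game performance has to come from somewhere else.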
