Kernel and command line

R9 290/290X Hawaii series cards do not work with Linux kernels 4.19.x and 4.20.x on Ubuntu 18. The last fully working kernel version is 4.18.20, combined with the latest stable mesa drivers and the in-kernel amdgpu DRM driver.

With 4.19.x and 4.20.x the system either boots into a broken state or does not boot at all (black screen after grub, no tty).
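
To check quickly whether a machine is running one of the kernel series named above, a minimal sketch follows. The function name is_affected_kernel is made up for illustration, and the match is purely on the version string: it says nothing about whether a particular build actually fails.

```shell
#!/bin/sh
# Succeeds (exit status 0) if the given kernel release string belongs to the
# 4.19.x or 4.20.x series reported as broken in this post.
# is_affected_kernel is a hypothetical helper name, not an existing tool.
is_affected_kernel() {
    case "$1" in
        4.19|4.19.*|4.19-*|4.20|4.20.*|4.20-*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the kernel that is currently running.
if is_affected_kernel "$(uname -r)"; then
    echo "running kernel $(uname -r) is in the 4.19.x/4.20.x range"
else
    echo "running kernel $(uname -r) is outside the 4.19.x/4.20.x range"
fi
```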

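Depending on the grub linux command-line parameters, I was able to boot into an unstable desktop to investigate further and collect evidence of the state. Here it is...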

Kernel and command line

Kernel:

Linux version 4.20.0-042000-generic (kernel@tangerine) (gcc version 8.2.0 (Ubuntu 8.2.0-12ubuntu1)) #201812232030 SMP Mon Dec 24 01:32:58 UTC 2018

Kernel command line:

BOOT_IMAGE=/vmlinuz-4.20.0-042000-generic root=/dev/mapper/ubuntu--vg-root ro quiet splash radeon.si_support=0 radeon.cik_support=0 amdgpu.si_support=1 amdgpu.cik_support=1 amdgpu.dc=1
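
When comparing boots it helps to pull just the radeon and amdgpu parameters out of a command line like the one above. A small sketch, assuming a helper named gpu_cmdline_params (a made-up name); with no argument it reads /proc/cmdline of the running kernel.

```shell
#!/bin/sh
# Print every radeon.* or amdgpu.* parameter in a kernel command line, one per
# line. The command line is taken from the first argument, or from
# /proc/cmdline if no argument is given.
# gpu_cmdline_params is a hypothetical helper name.
gpu_cmdline_params() {
    cmdline=${1:-$(cat /proc/cmdline)}
    printf '%s\n' "$cmdline" | tr ' ' '\n' | grep -E '^(radeon|amdgpu)\.'
}

# Example with the command line quoted in this post.
gpu_cmdline_params "BOOT_IMAGE=/vmlinuz-4.20.0-042000-generic root=/dev/mapper/ubuntu--vg-root ro quiet splash radeon.si_support=0 radeon.cik_support=0 amdgpu.si_support=1 amdgpu.cik_support=1 amdgpu.dc=1"
```

grep exits with a non-zero status when no such parameter is present, which can be used to detect a command line without any GPU driver options.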

lspci -v on Linux kernel 4.20.0

01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT / Grenada XT [Radeon R9 290X/390X] (prog-if 00 [VGA controller])
    Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT / Grenada XT [Radeon R9 290X/390X]
    Flags: fast devsel, IRQ 16
    Memory at d0000000 (64-bit, prefetchable) [size=256M]
    Memory at ef800000 (64-bit, prefetchable) [size=8M]
    I/O ports at ae00 [size=256]
    Memory at fb980000 (32-bit, non-prefetchable) [size=256K]
    [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
    Capabilities: [48] Vendor Specific Information: Len=08 <?>
    Capabilities: [50] Power Management version 3
    Capabilities: [58] Express Legacy Endpoint, MSI 00
    Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
    Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
    Capabilities: [150] Advanced Error Reporting
    Capabilities: [270] #19
    Capabilities: [2b0] Address Translation Service (ATS)
    Capabilities: [2c0] Page Request Interface (PRI)
    Capabilities: [2d0] Process Address Space ID (PASID)
    Kernel modules: radeon, amdgpu

01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii HDMI Audio [Radeon R9 290/290X / 390/390X]
    Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii HDMI Audio [Radeon R9 290/290X / 390/390X]
    Flags: bus master, fast devsel, latency 0, IRQ 32
    Memory at fb9fc000 (64-bit, non-prefetchable) [size=16K]
    Capabilities: [48] Vendor Specific Information: Len=08 <?>
    Capabilities: [50] Power Management version 3
    Capabilities: [58] Express Legacy Endpoint, MSI 00
    Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
    Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
    Capabilities: [150] Advanced Error Reporting
    Kernel driver in use: snd_hda_intel
    Kernel modules: snd_hda_intel
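
Note that in the listing above the VGA function 01:00.0 shows only "Kernel modules" and no "Kernel driver in use" line, while the audio function 01:00.1 is bound to snd_hda_intel. A rough sketch for summarizing that from lspci output; bound_drivers is a made-up helper name, and it assumes the usual lspci layout where device header lines start in the first column.

```shell
#!/bin/sh
# Read `lspci -v` or `lspci -k` output on stdin and print one line per device:
# its bus address followed by the bound driver, or "none" when lspci printed
# no "Kernel driver in use" line for that device.
# bound_drivers is a hypothetical helper name.
bound_drivers() {
    awk '
        /^[0-9a-f]/ {                      # device header line (not indented)
            if (dev != "") print dev, drv
            dev = $1; drv = "none"
        }
        /Kernel driver in use:/ { drv = $NF }
        END { if (dev != "") print dev, drv }
    '
}
```

Usage would be along the lines of lspci -v | bound_drivers, run on the affected machine.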

Last boot with kernel 4.20: only one monitor works.

The other display is forced into mirror mode. The other GPU ports do not work. Output of journalctl -b | grep drm:

[drm] amdgpu kernel modesetting enabled.
[drm] initializing kernel modesetting (HAWAII 0x1002:0x67B0 0x1002:0x0B00 0x00).
[drm] register mmio base: 0xFB980000
[drm] register mmio size: 262144
[drm] add ip block number 0 <cik_common>
[drm] add ip block number 1 <gmc_v7_0>
[drm] add ip block number 2 <cik_ih>
[drm] add ip block number 3 <gfx_v7_0>
[drm] add ip block number 4 <cik_sdma>
[drm] add ip block number 5 <powerplay>
[drm] add ip block number 6 <dm>
[drm] add ip block number 7 <uvd_v4_2>
[drm] add ip block number 8 <vce_v2_0>
[drm] vm size is 128 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
[drm:gmc_v7_0_sw_init [amdgpu]] *ERROR* Failed to load mc firmware!
[drm:amdgpu_device_init.cold.31 [amdgpu]] *ERROR* sw_init of IP block <gmc_v7_0> failed -2
[drm] amdgpu: finishing device.
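
The two *ERROR* lines are the interesting part of this log. The -2 in the second one looks like -ENOENT (no such file or directory), which would fit a firmware file that could not be found. A small sketch for isolating such lines from a longer journal; drm_errors is a made-up helper name.

```shell
#!/bin/sh
# Read log text on stdin and keep only the lines that mention drm and are
# flagged *ERROR*. grep -F treats the asterisks as literal characters.
# drm_errors is a hypothetical helper name.
drm_errors() {
    grep -F 'drm' | grep -F '*ERROR*'
}
```

Usage would be along the lines of journalctl -b | drm_errors.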

Last successful boot with Linux kernel 4.18.20.

All monitors work. Everything is fine. Here is the journalctl | grep drm output, for reference:

[drm] amdgpu kernel modesetting enabled.
fb: switching to amdgpudrmfb from VESA VGA
[drm] initializing kernel modesetting (HAWAII 0x1002:0x67B0 0x1002:0x0B00 0x00).
[drm] register mmio base: 0xFB980000
[drm] register mmio size: 262144
[drm] probing gen 2 caps for device 8086:151 = 261ac83/e
[drm] probing mlw for device 8086:151 = 261ac83
[drm] add ip block number 0 <cik_common>
[drm] add ip block number 1 <gmc_v7_0>
[drm] add ip block number 2 <cik_ih>
[drm] add ip block number 3 <ci_dpm>
[drm] add ip block number 4 <dm>
[drm] add ip block number 5 <gfx_v7_0>
[drm] add ip block number 6 <cik_sdma>
[drm] add ip block number 7 <uvd_v4_2>
[drm] add ip block number 8 <vce_v2_0>
[drm] vm size is 64 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
[drm] Detected VRAM RAM=4096M, BAR=256M
[drm] RAM width 512bits GDDR5
[drm] amdgpu: 4096M of VRAM memory ready
[drm] amdgpu: 4096M of GTT memory ready.
[drm] GART: num cpu pages 262144, num gpu pages 262144
[drm] PCIE GART of 1024M enabled (table at 0x000000F4007E9000).
[drm] Internal thermal controller with fan control
[drm] Invalid PCC GPIO: 13!
[drm] amdgpu: dpm initialized
[drm] Found UVD firmware Version: 1.64 Family ID: 9
[drm] Found VCE firmware Version: 50.10 Binary ID: 2
[drm] PCIE gen 3 link speeds already enabled
[drm] dce110_link_encoder_construct: Failed to get encoder_cap_info from VBIOS with error code 4!
[drm] dce110_link_encoder_construct: Failed to get encoder_cap_info from VBIOS with error code 4!
[drm] dce110_link_encoder_construct: Failed to get encoder_cap_info from VBIOS with error code 4!
[drm] Display Core initialized with v3.1.44!
[drm] SADs count is: -524, don't need to read it
[drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[drm] Driver supports precise vblank timestamp query.
[drm] UVD initialized successfully.
[drm] VCE initialized successfully.
[drm] fb mappable at 0xD0BD0000
[drm] vram apper at 0xD0000000
[drm] size 8294400
[drm] fb depth is 24
[drm]    pitch is 7680
fbcon: amdgpudrmfb (fb0) is primary device
[drm] dce_get_required_clocks_state: clocks unsupported disp_clk 681000 pix_clk 148500
amdgpu 0000:01:00.0: fb0: amdgpudrmfb frame buffer device
[drm] Initialized amdgpu 3.26.0 20150101 for 0000:01:00.0 on minor 0

Answer 1

Thanks to Alex Deucher (AMD driver developer for Linux), who helped me troubleshoot and get started on solving my own problem.

The issue and workaround were first documented in this bug tracker: https://bugs.freedesktop.org/show_bug.cgi?id=108781

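The issue that the workaround below addresses is unlikely to be fixed in Linux kernels 4.19.x and 4.20.x. I hope it will be resolved in a future kernel. If you want something simple, stick with 4.18.20 or lower. If you want to take advantage of any fixes in the 4.19.x/4.20.x kernels, you can try the approach below, which worked for me.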

Workaround:

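  1. Remove amdgpu.dpm=x from the Linux command line entirely and update grub. Neither '0' nor '1' works: the system fails to boot and does not even reach a tty.
  2. Copy /lib/firmware/radeon/* to /lib/firmware/amdgpu/
  3. Back up everything in /lib/firmware/radeon/*
  4. Delete /lib/firmware/radeon/
  5. Make sure the initrd for 4.20.0 is present in /boot
  6. Run: sudo update-initramfs -u
  7. Confirm the contents of the initrd for the functional/working kernel with lsinitramfs /boot/initrd.img-<YOUR-KERNEL>-generic | grep hawaii. It should still point to /lib/firmware/radeon, even though we have deleted that directory.
  8. Confirm the contents of the initrd for the new kernel that was not working. For me that was lsinitramfs /boot/initrd.img-4.20.0-042000-generic | grep hawaii. It should contain only /lib/firmware/amdgpu/*
  9. Restore /lib/firmware/radeon/* from the backup. This lets you fall back to the previous kernel version if necessary.
  10. Reboot
  11. [Optional but important] If everything works (it did for me), then to avoid conflicts with future kernels, delete /lib/firmware/radeon and then remove all kernels older than the new one you are now running. If you do not, and you later install a new kernel and run update-initramfs, you will end up with duplicate paths in the initrd for future kernels. I am not sure what happens in that case; I did not test it for lack of time.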
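
The file operations in steps 2 to 4 and 9, plus the check in steps 7 and 8, can be sketched as shell functions. This is only a sketch under stated assumptions: the function names and the backup location are made up, the firmware root is passed as an argument so the functions can be tried on a scratch directory first, and on a real system they would need to run as root against /lib/firmware. Note that the copy overwrites same-named files already in the amdgpu directory, just as the plain copy in step 2 does.

```shell
#!/bin/sh
# Hypothetical helpers; none of these names exist in any package.

# Steps 2-4: copy the radeon firmware into amdgpu/, back it up, then remove
# the radeon directory.  $1 = firmware root (e.g. /lib/firmware),
# $2 = backup directory.
stage_firmware() {
    fw_root=$1
    backup=$2
    mkdir -p "$fw_root/amdgpu" "$backup" &&
    cp -a "$fw_root/radeon/." "$fw_root/amdgpu/" &&
    cp -a "$fw_root/radeon/." "$backup/" &&
    rm -rf "$fw_root/radeon"
}

# Step 9: restore the radeon firmware from the backup.
restore_firmware() {
    fw_root=$1
    backup=$2
    mkdir -p "$fw_root/radeon" &&
    cp -a "$backup/." "$fw_root/radeon/"
}

# Steps 7-8: read lsinitramfs output on stdin and print the firmware
# subdirectories (radeon, amdgpu, ...) that contain hawaii files.
hawaii_firmware_dirs() {
    grep hawaii | sed -n 's#.*lib/firmware/\([^/]*\)/.*#\1#p' | sort -u
}
```

For example, lsinitramfs /boot/initrd.img-4.20.0-042000-generic | hawaii_firmware_dirs should print only amdgpu after step 6 if the workaround took effect. The update-initramfs call from step 6 belongs between stage_firmware and restore_firmware.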
