System fails to boot after "apt upgrade"

First, my system:

- AMD Threadripper 1950X
- 2x Vega FE + Radeon VII
- Ubuntu 18.04, kernel 4.18.0-16-generic

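After the upgrade, the system would no longer boot. I went into GRUB and removed "quiet splash" from the kernel parameters to see the boot log; unfortunately, it freezes at a different point each time, and every line up to the freeze shows a green "OK". Then, following some online guides, I added "nomodeset" to the GRUB kernel parameters, and with that the system boots normally. However, the kernel then does not load the GPUs: they no longer show up in clinfo and I cannot use them.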

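The new kernel, 4.18.0-16, was installed on 03-07, and the machine was rebooted on 03-11 without any problem, so I don't think the kernel is the cause. I tried removing rocm with autoremove, but the problem persists (after the removal, the system still boots only with nomodeset).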

Below is the list of upgrades installed on 2019-03-13, in case anything in it looks suspicious. Unfortunately, it is long.

Commandline: apt upgrade
Requested-By: sandbo (1000)
Upgrade: hsa-rocr-dev:amd64 (1.1.9-49-g39f1af5, 1.1.9-55-gbac2a9b), 
libxcb-present-dev:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
hsakmt-roct-dev:amd64 (1.0.9-111-gc65f2de, 1.0.9-121-g876627e), 
libseccomp2:amd64 (2.3.1-2.1ubuntu4, 2.3.1-2.1ubuntu4.1), 
hsakmt-roct:amd64 (1.0.9-111-gc65f2de, 1.0.9-121-g876627e), 
virtinst:amd64 (1:1.5.1-0ubuntu1.1, 1:1.5.1-0ubuntu1.2), 
libxcb-xfixes0:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
rock-dkms:amd64 (2.1-96, 2.2-31), 
rocm-opencl:amd64 (1.2.0-2019020110, 1.2.0-2019030702), 
libsystemd0:amd64 (237-3ubuntu10.13, 237-3ubuntu10.15), 
libsystemd0:i386 (237-3ubuntu10.13, 237-3ubuntu10.15), 
hip_base:amd64 (1.5.19025, 1.5.19055), 
libxcb-present0:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
libxcb-present0:i386 (1.13-1, 1.13-2~ubuntu18.04), 
hsa-ext-rocr-dev:amd64 (1.1.9-49-g39f1af5, 1.1.9-55-gbac2a9b), 
libxcb-xfixes0-dev:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
rocrand:amd64 (1.8.2, 1.8.2), rocfft:amd64 (0.8.9.0, 0.9.0.0), 
google-chrome-stable:amd64 (72.0.3626.121-1, 73.0.3683.75-1), 
hcc:amd64 (1.3.19045, 1.3.19092), 
udev:amd64 (237-3ubuntu10.13, 237-3ubuntu10.15), 
libxcb-shm0:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
libxcb-shm0:i386 (1.13-1, 1.13-2~ubuntu18.04), 
libxcb-randr0:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
libxcb-render0:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
libxcb-render0:i386 (1.13-1, 1.13-2~ubuntu18.04), 
libxcb1-dev:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
libudev1:amd64 (237-3ubuntu10.13, 237-3ubuntu10.15), 
libudev1:i386 (237-3ubuntu10.13, 237-3ubuntu10.15), 
comgr:amd64 (1.1.0, 1.1.0), 
libtiff5:amd64 (4.0.9-5ubuntu0.1, 4.0.9-5ubuntu0.2), 
libtiff5:i386 (4.0.9-5ubuntu0.1, 4.0.9-5ubuntu0.2), 
libxcb-randr0-dev:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
libxcb-dri3-dev:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
libxcb1:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
libxcb1:i386 (1.13-1, 1.13-2~ubuntu18.04), 
libxcb-shape0-dev:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
libnss-myhostname:amd64 (237-3ubuntu10.13, 237-3ubuntu10.15), 
libxcb-res0:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
rocm-libs:amd64 (2.1.96, 2.2.31), 
systemd-sysv:amd64 (237-3ubuntu10.13, 237-3ubuntu10.15), 
rocm-dev:amd64 (2.1.96, 2.2.31), 
libxcb-xv0:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
rocm-utils:amd64 (2.1.96, 2.2.31), 
libpam-systemd:amd64 (237-3ubuntu10.13, 237-3ubuntu10.15), 
libxcb-render0-dev:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
libxcb-shape0:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
virt-manager:amd64 (1:1.5.1-0ubuntu1.1, 1:1.5.1-0ubuntu1.2), 
systemd:amd64 (237-3ubuntu10.13, 237-3ubuntu10.15), 
hip_doc:amd64 (1.5.19025, 1.5.19055), 
libnss-systemd:amd64 (237-3ubuntu10.13, 237-3ubuntu10.15), 
miopen-hip:amd64 (1.7.1, 1.7.1), 
rocm-device-libs:amd64 (0.0.1, 0.0.1), 
hip_hcc:amd64 (1.5.19025, 1.5.19055), 
rocm-opencl-dev:amd64 (1.2.0-2019020110, 1.2.0-2019030702), 
hip_samples:amd64 (1.5.19025, 1.5.19055), 
libxcb-sync-dev:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
cxlactivitylogger:amd64 (5.6.7254, 5.6.7259), 
libxcb-dri2-0-dev:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
libxcb-glx0:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
libxcb-glx0:i386 (1.13-1, 1.13-2~ubuntu18.04), 
libxcb-glx0-dev:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
rocprofiler-dev:amd64 (1.0.0, 1.0.0), 
libxcb-dri2-0:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
libxcb-dri2-0:i386 (1.13-1, 1.13-2~ubuntu18.04), 
rocm-smi:amd64 (1.0.0-100-g3cacdb9, 1.0.0-102-gdb444a9), 
libxcb-dri3-0:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
libxcb-dri3-0:i386 (1.13-1, 1.13-2~ubuntu18.04), 
rocm-dkms:amd64 (2.1.96, 2.2.31), 
libxcb-xkb1:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
libxcb-sync1:amd64 (1.13-1, 1.13-2~ubuntu18.04), 
libxcb-sync1:i386 (1.13-1, 1.13-2~ubuntu18.04)
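
Since the log is long, a small script can help narrow it down. The following is only a sketch in Python, not part of the original report: it parses the "name:arch (old, new)" entries of an "Upgrade:" line and keeps those whose version actually changed. The function names and the keyword filter ("dkms", "udev") are illustrative assumptions, chosen only because such packages ship or influence kernel modules.

```python
import re

# One "name:arch (old, new)" entry, as written on an apt history "Upgrade:" line.
ENTRY = re.compile(r"(\S+):(\w+) \(([^,()]+), ([^()]+)\)")


def parse_upgrades(text):
    """Return a list of (package, arch, old_version, new_version) tuples."""
    return [m.groups() for m in ENTRY.finditer(text)]


def changed_packages(text, keywords=()):
    """Keep entries whose version actually changed; optionally filter by name."""
    result = []
    for name, arch, old, new in parse_upgrades(text):
        if old == new:
            continue  # listed by apt, but the version string is identical
        if keywords and not any(k in name for k in keywords):
            continue
        result.append((name, arch, old, new))
    return result


# Three entries copied from the log above.
SAMPLE = (
    "Upgrade: rock-dkms:amd64 (2.1-96, 2.2-31), "
    "rocrand:amd64 (1.8.2, 1.8.2), "
    "udev:amd64 (237-3ubuntu10.13, 237-3ubuntu10.15)"
)

# Hypothetical keyword choice: packages that ship or influence kernel modules.
for name, arch, old, new in changed_packages(SAMPLE, keywords=("dkms", "udev")):
    print(f"{name}:{arch}  {old} -> {new}")
```

Run against the full log, a filter like this would leave out entries such as rocrand or comgr, whose old and new versions are identical, and leave a shorter list to go through (for example rock-dkms, which went from 2.1-96 to 2.2-31).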

Things I have tried:

- Changing the default display manager to lightdm: did not work
- Disabling Wayland: did not work

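Since this started on the first reboot after the upgrade, I believe the hardware is fine. (I was using the GPUs normally, under heavy load, just a few days earlier.)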

Answer 1

It has been fixed. This was a hardware-specific issue with ROCm and AMD GPUs. https://github.com/RadeonOpenCompute/ROCm/issues/735#issuecomment-473100963
