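I have a desktop with a GeForce RTX 2080 Ti.
After installing Ubuntu 20.04, it boots straight to a blank screen with a blinking cursor. I used Ctrl + Alt + F1 to get a terminal on the system.
I first tried all of the suggestions listed below: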
- https://askubuntu.com/a/1286728 - ran sudo dpkg-reconfigure gdm3 followed by sudo service gdm3 restart
- https://askubuntu.com/a/1251028 - added nomodeset to GRUB_CMDLINE_LINUX_DEFAULT. I rebooted, but the screen was still blank.
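For reference, the nomodeset change can be sketched as a small shell helper. This is a minimal sketch, not the exact procedure from the linked answer: add_grub_param is a hypothetical helper name, and it assumes the GRUB_CMDLINE_LINUX_DEFAULT line uses double quotes, as in the stock Ubuntu /etc/default/grub.

```shell
# Hypothetical helper: append one kernel parameter to the
# GRUB_CMDLINE_LINUX_DEFAULT="..." line of a GRUB defaults file.
# Assumes GNU sed (for -i) and a double-quoted value.
add_grub_param() {
    file=$1
    param=$2
    # Current value between the double quotes (empty if the line is missing).
    current=$(sed -n 's/^GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"$/\1/p' "$file")
    case " $current " in
        *" $param "*) return 0 ;;   # already present, nothing to do
    esac
    # Insert the parameter just before the closing quote.
    sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 $param\"|" "$file"
}

# Example use (as root), followed by regenerating the GRUB config:
#   add_grub_param /etc/default/grub nomodeset
#   update-grub
```

The change only takes effect after update-grub has been run and the machine has been rebooted.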
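I read on some forums that the graphics card could be the root cause of the problem. So I checked the NVIDIA website for compatible driver versions and ran ubuntu-drivers devices: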
== /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 ==
modalias : pci:v000010DEd00001E04sv000010DEsd000012AEbc03sc00i00
vendor : NVIDIA Corporation
model : TU102 [GeForce RTX 2080 Ti]
driver : nvidia-driver-465 - third-party non-free
driver : nvidia-driver-418-server - distro non-free
driver : nvidia-driver-450-server - distro non-free
driver : nvidia-driver-460-server - distro non-free
driver : nvidia-driver-460 - third-party non-free recommended
driver : xserver-xorg-video-nouveau - distro free builtin
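I chose nvidia-driver-460. I rebooted afterwards, but it still booted to the same blank screen. I then tried running nvidia-smi, and the display started misbehaving: several purple glitches appeared on the screen, and then the command returned No devices were found.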
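The recommended entry can be pulled out of the ubuntu-drivers devices listing with a short filter. This is only a sketch that assumes the output keeps the "driver : NAME - ... recommended" layout shown above; recommended_driver is a hypothetical helper name.

```shell
# Hypothetical helper: read `ubuntu-drivers devices` output on stdin and
# print the package name of the first driver line marked "recommended".
recommended_driver() {
    awk '/^driver/ && /recommended/ { print $3; exit }'
}

# Example use:
#   ubuntu-drivers devices | recommended_driver
#   sudo apt install "$(ubuntu-drivers devices | recommended_driver)"
```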
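I tried purging and reinstalling the drivers following this answer - https://askubuntu.com/a/1129890. The output of some commands that are commonly requested in similar threads is included below.
lspci -vvv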
01:00.0 VGA compatible controller: NVIDIA Corporation TU102 [GeForce RTX 2080 Ti] (rev a1) (prog-if 00 [VGA controller])
Subsystem: NVIDIA Corporation TU102 [GeForce RTX 2080 Ti]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 16
Region 0: Memory at a3000000 (32-bit, non-prefetchable) [size=16M]
Region 1: Memory at 90000000 (64-bit, prefetchable) [size=256M]
Region 3: Memory at a0000000 (64-bit, prefetchable) [size=32M]
Region 5: I/O ports at 3000 [size=128]
Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
Capabilities: <access denied>
Kernel driver in use: nvidia
Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
dmesg | grep NVRM
[ 2.758029] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 460.84 Wed May 26 20:14:59 UTC 2021
[ 4.073016] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0xffff:1290)
[ 4.073070] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[ 4.507083] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0xffff:1290)
[ 4.507166] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[ 5.189855] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0xffff:1290)
[ 5.189875] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[ 5.623500] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0xffff:1290)
[ 5.623520] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[ 1165.397831] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0xffff:1290)
[ 1165.397902] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[ 1165.830240] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0xffff:1290)
[ 1165.830257] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[ 1179.773513] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0xffff:1290)
[ 1179.773531] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[ 1180.202302] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0xffff:1290)
[ 1180.202325] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[ 1416.634313] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0xffff:1290)
[ 1416.634352] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[ 1417.063089] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0xffff:1290)
[ 1417.063107] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
dmesg | grep nvidia
[ 2.621823] nvidia: module license 'NVIDIA' taints kernel.
[ 2.758029] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 460.84 Wed May 26 20:14:59 UTC 2021
[ 2.774304] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 460.84 Wed May 26 20:01:59 UTC 2021
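To summarize a log like the one above instead of reading every line, the failures can be counted and their error codes de-duplicated. A minimal sketch, assuming the NVRM message format shown in the output above; both function names are hypothetical.

```shell
# Hypothetical helpers that read kernel log text on stdin.

# Print how many "RmInitAdapter failed" lines the log contains.
count_rm_init_failures() {
    # grep -c exits non-zero when the count is 0; keep the status clean.
    grep -c 'RmInitAdapter failed' || true
}

# Print each distinct error code reported with RmInitAdapter, one per line.
list_rm_init_codes() {
    sed -n 's/.*RmInitAdapter failed! (\([^)]*\)).*/\1/p' | sort -u
}

# Example use (dmesg may need sudo on some systems):
#   dmesg | count_rm_init_failures
#   dmesg | list_rm_init_codes
```

A single repeated code, as in the log above, means the module hits the same initialization failure on every attempt.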
Answer 1
Try adding pcie_aspm=off and rcutree.rcu_idle_gp_delay=1 to the kernel command line.
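After rebooting, it is worth confirming that the parameters actually reached the running kernel, since a missed update-grub leaves the old command line in place. A minimal sketch; has_kernel_param is a hypothetical helper name, and /proc/cmdline is the standard place where Linux exposes the boot command line.

```shell
# Hypothetical helper: succeed if PARAM appears as a whole word in CMDLINE.
has_kernel_param() {
    cmdline=$1
    param=$2
    case " $cmdline " in
        *" $param "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use on the running system:
#   for p in pcie_aspm=off rcutree.rcu_idle_gp_delay=1; do
#       has_kernel_param "$(cat /proc/cmdline)" "$p" || echo "missing: $p"
#   done
```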
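You could also try installing the latest driver from the NVIDIA website. I know that's not ideal, but it may be a useful test: https://www.nvidia.com/en-us/drivers/unix/
That said, this looks like a hardware failure. Take a look at https://github.com/wilicc/gpu-burn and https://github.com/ComputationalRadiationPhysics/cuda_memtest to test your GPU memory.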