Unable to load the NVIDIA graphics driver

After upgrading Debian 11 to 12 and moving to kernel 6.5.0-1, I found that my NVIDIA driver was broken for some reason. Running nvidia-settings (with or without sudo) gives the following output.

caeleste@spectre:~$ nvidia-settings 

ERROR: NVIDIA driver is not loaded


(nvidia-settings:35073): GLib-GObject-CRITICAL **: 10:59:12.413: g_object_unref: assertion 'G_IS_OBJECT (object)' failed

** (nvidia-settings:35073): CRITICAL **: 10:59:12.414: ctk_powermode_new: assertion '(ctrl_target != NULL) && (ctrl_target->h != NULL)' failed

Since I did not understand the exact cause, I tried a clean reinstall by purging nvidia* and installing linux-headers-amd64 linux-image-amd64 nvidia-detect nvidia-driver nvidia-cuda-dev. I still get exactly the same error.

Some system information:

$> neofetch

OS: Debian GNU/Linux 12 (bookworm) x86_64 
Kernel: 6.5.0-1-amd64 
Packages: 2995 (dpkg)
Shell: bash 5.2.15
Resolution: 3840x2160
DE: Plasma 5.27.5
WM: KWin
Terminal: caelestis-custom
CPU: Intel i7-10750H (12) @ 5.000GHz && Intel UHD Graphics
GPU: NVIDIA GeForce GTX 1650 Ti Mobile
Memory: 3541MiB / 15601MiB

$> nvidia-detect

Detected NVIDIA GPUs:
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU117M [GeForce GTX 1650 Ti Mobile] [10de:1f95] (rev a1)

#> dkms status

nvidia-current/525.125.06: added

#> apt-cache policy nvidia-driver

nvidia-driver:
  Installed: 525.125.06-1~deb12u1
  Candidate: 525.125.06-1~deb12u1
  Version table:
     525.125.06-2 100
        100 http://deb.debian.org/debian unstable/non-free amd64 Packages
 *** 525.125.06-1~deb12u1 500
        500 http://ftp.uni-stuttgart.de/debian bookworm/non-free amd64 Packages
        100 /var/lib/dpkg/status

#> systemctl status nvidia-persistenced.service

× nvidia-persistenced.service - NVIDIA Persistence Daemon
     Loaded: loaded (/lib/systemd/system/nvidia-persistenced.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Mon 2023-09-18 10:56:26 CEST; 26min ago
    Process: 21064 ExecStart=/usr/bin/nvidia-persistenced --user nvpd (code=exited, status=1/FAILURE)
    Process: 21066 ExecStopPost=/bin/rm -rf /var/run/nvidia-persistenced (code=exited, status=0/SUCCESS)
        CPU: 2ms

Sep 18 10:56:26 spectre systemd[1]: Starting nvidia-persistenced.service - NVIDIA Persistence Daemon...
Sep 18 10:56:26 spectre nvidia-persistenced[21065]: Started (21065)
Sep 18 10:56:26 spectre nvidia-persistenced[21065]: Failed to open libnvidia-cfg.so.1: libnvidia-cfg.so.1: cannot open shared object file: No such file or directory
Sep 18 10:56:26 spectre nvidia-persistenced[21064]: nvidia-persistenced failed to initialize. Check syslog for more details.
Sep 18 10:56:26 spectre nvidia-persistenced[21065]: Shutdown (21065)
Sep 18 10:56:26 spectre systemd[1]: nvidia-persistenced.service: Control process exited, code=exited, status=1/FAILURE
Sep 18 10:56:26 spectre systemd[1]: nvidia-persistenced.service: Failed with result 'exit-code'.
Sep 18 10:56:26 spectre systemd[1]: Failed to start nvidia-persistenced.service - NVIDIA Persistence Daemon.

Secure Boot is disabled.

Answer 1

It is always the same story: the NVIDIA driver you are trying to load was not built for the kernel you just updated to.
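
One way to check this against the output in the question is to look at the state DKMS reports. The question shows "nvidia-current/525.125.06: added", and in DKMS "added" means the module source is registered but nothing has been built or installed for any kernel. The sketch below classifies one line of dkms status output; the function name dkms_line_state is made up for illustration and is not part of DKMS.

```shell
# Classify one line of `dkms status` output.
# "added"     -> source registered, no module built yet
# "built"     -> module built but not installed into a kernel tree
# "installed" -> module installed for the kernel named on that line
# dkms_line_state is a hypothetical helper, not a DKMS command.
dkms_line_state() {
    case "$1" in
        *": installed"*) echo "installed" ;;
        *": built"*)     echo "built-not-installed" ;;
        *": added"*)     echo "not-built" ;;
        *)               echo "unknown" ;;
    esac
}

# Line copied from the question.
dkms_line_state "nvidia-current/525.125.06: added"
```

On a live system the same function could be fed each line of the real command's output, for example with dkms status | while read -r line; do dkms_line_state "$line"; done.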

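My guess is that you should check your system's package source repositories under /etc/apt/sources, or run sudo grep -R nvidia /etc/apt/ to investigate. On Arch-type systems, check or update your pacman cache.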
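
The grep suggestion above can be wrapped in a small helper that only lists the files that mention NVIDIA, which makes it easier to spot a third-party repository. This is a minimal sketch: the function name list_nvidia_sources and its optional directory argument are assumptions added for illustration, and reading everything under /etc/apt may require sudo on some systems.

```shell
# List files under an APT configuration directory that mention "nvidia".
# list_nvidia_sources is a hypothetical helper; the directory defaults
# to /etc/apt and can be overridden for testing.
list_nvidia_sources() {
    dir="${1:-/etc/apt}"
    # -R recurse, -i ignore case, -l print only matching file names.
    # grep exits non-zero when nothing matches, so mask that.
    grep -R -i -l nvidia "$dir" 2>/dev/null || true
}

list_nvidia_sources
```

An empty result suggests no extra NVIDIA repository is configured, in which case the installed driver comes from the distribution's own archive.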

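For example, I installed CUDA on an Ubuntu system, and for that I had added NVIDIA's ubuntu2204 repository to the system. It failed just as you report. After removing the driver and reinstalling the correct one from the standard repositories, everything worked.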

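Of course, after changing the system repositories you must always also update the system, i.e. apt-get update && apt-get upgrade, and in the NVIDIA case remove the incompatible driver and reinstall the correct one.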
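
Before reinstalling, it may be worth confirming that headers for the running kernel are present, since a DKMS-packaged driver such as the nvidia-current module in the question can only be built against installed headers. The sketch below derives the Debian headers package name from the kernel release; headers_pkg_for is a hypothetical helper, and the commented commands are suggestions rather than steps taken from the answer.

```shell
# Build the Debian headers package name for a given kernel release,
# e.g. 6.5.0-1-amd64 -> linux-headers-6.5.0-1-amd64.
# headers_pkg_for is a hypothetical helper used for illustration.
headers_pkg_for() {
    echo "linux-headers-$1"
}

pkg=$(headers_pkg_for "$(uname -r)")
echo "Headers package for the running kernel: $pkg"

# On a Debian system one could then, as root:
#   dpkg -s "$pkg"          # check whether the headers are installed
#   apt-get install "$pkg"  # install them if they are missing
#   dkms autoinstall        # rebuild registered modules for this kernel
```

If the headers package for the running kernel is not available from the configured repositories, booting the distribution's stock kernel is another way to get a kernel and driver that match.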