I am using a virtual machine running Ubuntu 22.04 and am trying to set up GPU passthrough.
The Nvidia card is detected inside my virtual machine:
Kernel driver in use: nvidia
Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
I installed the recommended nvidia driver reported by
ubuntu-drivers devices
which is nvidia-driver-390.
However, when I run nvidia-smi, it gives me
Failed to initialize NVML: Driver/library version mismatch
Digging further into the root cause, I actually found this:
NVRM: API mismatch: the client has version 530.30.02, but
NVRM: this kernel module has version 390.157. Please
NVRM: make sure that this kernel module and all NVIDIA driver
NVRM: components have the same version.
But my card is a GT 740M, so the newest driver it supports is nvidia-driver-470. Since client API version 530 does not appear in the list of installed drivers, how do I remove it?
dpkg --get-selections | grep nvidia
libnvidia-cfg1-390:amd64 install
libnvidia-common-390 install
libnvidia-compute-390:amd64 install
libnvidia-compute-418-server:amd64 deinstall
libnvidia-decode-390:amd64 install
libnvidia-encode-390:amd64 install
libnvidia-fbc1-390:amd64 install
libnvidia-gl-390:amd64 install
libnvidia-ifr1-390:amd64 install
nvidia-compute-utils-390 install
nvidia-dkms-390 install
nvidia-driver-390 install
nvidia-kernel-common-390 install
nvidia-kernel-source-390 install
nvidia-prime install
nvidia-settings install
nvidia-utils-390 install
xserver-xorg-video-nvidia-390 install
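The two NVRM lines quoted above already name both sides of the mismatch. As a minimal sketch, assuming the kernel log uses exactly that wording, the versions can be pulled out of the log with a small helper (parse_nvrm_mismatch is a hypothetical name, not part of any NVIDIA tool):

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print
# "client=<ver> kernel=<ver>" if an NVRM API mismatch report is present.
# Assumes the message wording shown in the question.
parse_nvrm_mismatch() {
    awk '
        /NVRM: API mismatch: the client has version/ {
            for (i = 1; i < NF; i++)
                if ($i == "version") { client = $(i + 1); sub(/,$/, "", client) }
        }
        /NVRM: this kernel module has version/ {
            for (i = 1; i < NF; i++)
                if ($i == "version") { kernel = $(i + 1); sub(/\.$/, "", kernel) }
        }
        END {
            if (client != "" && kernel != "")
                print "client=" client " kernel=" kernel
        }
    '
}
```

Piping the log into it, for example with sudo dmesg | parse_nvrm_mismatch, would print client=530.30.02 kernel=390.157 for the messages above: userspace libraries from 530 are talking to a 390 kernel module.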
Update:
I did find some folders with the following command:
sudo find /usr/lib -iname "*nvidia*530*"
I deleted them, then followed @Saxtheowl's answer, and now there are Nvidia-520 files. Must be witchcraft :/
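Before deleting files like these by hand, it can help to check whether any installed package claims them. The following is a rough sketch only: list_unowned is a hypothetical helper, and it uses dpkg -S as the ownership check unless another command is passed in. On a merged-/usr system a file may be registered under an aliased path such as /lib, so a path printed here is a candidate to inspect, not proof that it is orphaned.

```shell
#!/bin/sh
# Hypothetical helper: read file paths on stdin and print those for which
# the ownership check fails. The check defaults to `dpkg -S`; any other
# command can be supplied as arguments (it receives the path as last argument).
list_unowned() {
    if [ "$#" -eq 0 ]; then
        set -- dpkg -S
    fi
    while IFS= read -r path; do
        # stdin of the check is redirected so it cannot consume the path list
        "$@" "$path" >/dev/null 2>&1 </dev/null || printf '%s\n' "$path"
    done
}
```

With the search from the update this would be used as: find /usr/lib -iname "*nvidia*530*" | list_unowned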
Answer 1
You can try to completely remove the existing Nvidia drivers and reinstall the appropriate version.
First, purge the Nvidia drivers:
sudo apt-get purge '^nvidia.*'
sudo rm -rf /etc/X11/xorg.conf
sudo rm -rf /etc/modprobe.d/nvidia*
sudo rm -rf /usr/local/nvidia*
sudo rm -rf /usr/lib/xorg/modules/drivers/nvidia*
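As far as I can tell, the pattern '^nvidia.*' only matches package names that begin with nvidia, so the libnvidia-* entries in the question's package list (including the one left in the deinstall state) are not named directly by the purge. A small sketch for collecting every nvidia-related package name from dpkg --get-selections output; nvidia_packages is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: read `dpkg --get-selections` output on stdin and
# print the name of every package whose name contains "nvidia",
# whatever its selection state (install, deinstall, ...).
nvidia_packages() {
    awk '$1 ~ /nvidia/ { print $1 }'
}
```

Running dpkg --get-selections | nvidia_packages prints the list for review. It also includes packages such as nvidia-prime and nvidia-settings, so read it before handing it to apt-get purge.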
Then add the appropriate repository:
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
Then install the right driver:
sudo apt-get install nvidia-driver-470
Then disable the nouveau driver:
sudo bash -c "echo 'blacklist nouveau' > /etc/modprobe.d/blacklist-nouveau.conf"
sudo bash -c "echo 'options nouveau modeset=0' >> /etc/modprobe.d/blacklist-nouveau.conf"
Then update the initial ramdisk:
sudo update-initramfs -u
Then reboot.
If you still get the error, run
sudo apt-get install nvidia-settings
to make sure all components have been upgraded correctly.
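After the reboot, one way to confirm that the mismatch is gone is to compare the version of the loaded kernel module with the version of the module installed on disk. This is a sketch under the assumption that /proc/driver/nvidia/version contains an "NVRM version:" line with the version as a dotted number; both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the first dotted version number found on the
# "NVRM version:" line of /proc/driver/nvidia/version (read from stdin).
loaded_module_version() {
    awk '/^NVRM version:/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /^[0-9]+\.[0-9]+(\.[0-9]+)?$/) { print $i; exit }
    }'
}

# Succeed only if both arguments are non-empty and identical.
versions_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Usage on a machine with the driver loaded:
#   loaded=$(loaded_module_version < /proc/driver/nvidia/version)
#   ondisk=$(modinfo -F version nvidia)
#   if versions_match "$loaded" "$ondisk"; then
#       echo "match: $loaded"
#   else
#       echo "mismatch: loaded=$loaded on-disk=$ondisk"
#   fi
```

If the two values differ, the old module is probably still loaded and another reboot (or unloading the module) is needed before nvidia-smi can work.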
Answer 2
I had the same problem on my laptop running Ubuntu 22.04.4 LTS!
Follow these steps:
dpkg -l | grep nvidia
(Not required, this just checks which versions are installed.)
sudo apt --purge remove "*nvidia*"
sudo ubuntu-drivers devices
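This shows every NVidia hardware device that needs a driver and which packages are available for it (sudo ubuntu-drivers list, or any other option, works as well).
Then install the default driver, which is also the newest one available for your machine (this is the safest option):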
sudo ubuntu-drivers install
sudo reboot
Rebooting afterwards also helps (but is optional).
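To see in advance which package sudo ubuntu-drivers install is likely to pick, the recommended entry can be read from the ubuntu-drivers devices output. This sketch assumes the usual line layout "driver : <package> - ... recommended", which is an assumption about the tool's output format and may differ between releases; recommended_driver is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `ubuntu-drivers devices` output on stdin and
# print the package name from the first "driver" line marked "recommended".
# Assumes lines of the form: driver   : <package> - <source> ... recommended
recommended_driver() {
    awk '$1 == "driver" && $NF == "recommended" { print $3; exit }'
}
```

For example, ubuntu-drivers devices | recommended_driver prints one package name, which can then be compared against the newest branch the card actually supports before installing anything.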
Now it should work:
$ nvidia-smi
Fri Apr 5 14:21:44 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.06              Driver Version: 545.29.06    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX A1000 6GB Lap...    Off | 00000000:01:00.0 Off |                  N/A |
| N/A   57C    P0             312W /  35W |      8MiB /  6144MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      3071      G   /usr/lib/xorg/Xorg                            4MiB |
+---------------------------------------------------------------------------------------+
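In a healthy install the two versions in the banner line agree, as they do above (545.29.06 twice). A minimal sketch for checking this from a script, assuming the banner keeps the "NVIDIA-SMI <ver> ... Driver Version: <ver>" wording shown in this output; smi_banner_versions is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: read nvidia-smi output on stdin and print the tool
# version and the driver version from the banner line, separated by a space.
smi_banner_versions() {
    sed -n 's/.*NVIDIA-SMI \([0-9.]*\).*Driver Version: \([0-9.]*\).*/\1 \2/p'
}

# Usage:
#   set -- $(nvidia-smi | smi_banner_versions)
#   [ "$1" = "$2" ] && echo "consistent: $1" || echo "differs: $1 vs $2"
```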