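Goal
I am trying to use CUDA on my Nvidia card for research. I don't particularly want the card to drive my display, since I plan to use the machine only through a bash shell once setup is complete.
Problem
My graphics card is unclaimed by Ubuntu. After logging in, I get stuck in a login loop.
Background
I am a Linux enthusiast, a power user, and a computer science PhD student, but I have been unable to get my Nvidia GTX 1070 Ti working. I have spent every Sunday on this for more than two months.
I have followed these tutorials: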
https://help.ubuntu.com/community/BinaryDriverHowto/Nvidia
https://help.ubuntu.com/community/BinaryDriverHowto
https://kislayabhi.github.io/Installing_CUDA_with_Ubuntu/
https://askubuntu.com/a/760935/13693
https://askubuntu.com/a/937204/13693
http://docs.nvidia.com/cuda/cuda-installation-guide-linux
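Installing nvidia-current, nvidia-387 (the default selected when installing Ubuntu), or the latest nvidia-390 results in a login loop: after logging in, I am bounced back to the login screen.
So I ran prime-select intel and removed the modeset=0 blacklist to get to a working desktop. Here is a summary of my current state: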
$ uname -a
Linux datalake2 4.13.0-36-generic #40~16.04.1-Ubuntu SMP Fri Feb 16 23:25:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ lspci | grep VGA
03:00.0 VGA compatible controller: NVIDIA Corporation Device 1b82 (rev a1)
08:00.0 VGA compatible controller: Matrox Electronics Systems Ltd. G200eR2 (rev 01)
$ sudo lshw -C video
*-display UNCLAIMED
description: VGA compatible controller
product: NVIDIA Corporation
vendor: NVIDIA Corporation
physical id: 0
bus info: pci@0000:03:00.0
version: a1
width: 64 bits
clock: 33MHz
capabilities: pm msi pciexpress vga_controller cap_list
configuration: latency=0
resources: iomemory:33f0-33ef iomemory:33f0-33ef memory:91000000-91ffffff memory:33fe0000000-33fefffffff memory:33ff0000000-33ff1ffffff ioport:2000(size=128) memory:92080000-920fffff
$ apt list --installed | grep "nvidia"
nvidia-387/unknown,now 387.26-0ubuntu1 amd64 [installed]
nvidia-387-dev/unknown,now 387.26-0ubuntu1 amd64 [installed,automatic]
nvidia-cuda-dev/xenial,now 7.5.18-0ubuntu1 amd64 [installed,automatic]
nvidia-cuda-doc/xenial,xenial,now 7.5.18-0ubuntu1 all [installed,automatic]
nvidia-cuda-gdb/xenial,now 7.5.18-0ubuntu1 amd64 [installed,automatic]
nvidia-cuda-toolkit/xenial,now 7.5.18-0ubuntu1 amd64 [installed]
nvidia-modprobe/unknown,now 387.26-0ubuntu1 amd64 [installed,automatic]
nvidia-opencl-dev/xenial,now 7.5.18-0ubuntu1 amd64 [installed,automatic]
nvidia-opencl-icd-387/unknown,now 387.26-0ubuntu1 amd64 [installed,automatic]
nvidia-prime/xenial,now 0.8.2 amd64 [installed]
nvidia-profiler/xenial,now 7.5.18-0ubuntu1 amd64 [installed,automatic]
nvidia-settings/unknown,now 387.26-0ubuntu1 amd64 [installed,automatic]
nvidia-visual-profiler/xenial,now 7.5.18-0ubuntu1 amd64 [installed,automatic]
$ cat /proc/driver/nvidia/version
cat: /proc/driver/nvidia/version: No such file or directory
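Oddities
My second problem seems to be that, even though I have enabled restricted proprietary drivers, Ubuntu does not recognize that my card needs a driver. Running sudo software-properties-gtk does not show me anything either.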
Answer 1
Here is the workaround:
1. Edit /etc/default/grub
Change GRUB_CMDLINE_LINUX_DEFAULT to
GRUB_CMDLINE_LINUX_DEFAULT='pcie_port_pm=off acpi_backlight=none acpi_osi=Linux acpi_osi=! acpi_osi="Windows 2009"'
This step prevents a black screen after login.
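A change to /etc/default/grub normally takes effect only after the GRUB configuration is regenerated (on Ubuntu, sudo update-grub) and the machine is rebooted; the answer does not spell that out, so treat it as an assumption. The sketch below checks whether the running kernel actually received the parameters. The helper name has_param is made up for illustration, and the acpi_osi="Windows 2009" entry is left out of the check because of its embedded space.

```shell
#!/bin/sh
# has_param CMDLINE PARAM: succeed if PARAM appears as a whole word in CMDLINE.
# (Hypothetical helper, not part of any package.)
has_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# After editing /etc/default/grub, regenerate the config and reboot:
#   sudo update-grub
# Then inspect the command line of the running kernel.
cmdline=$(cat /proc/cmdline 2>/dev/null)
for p in pcie_port_pm=off acpi_backlight=none acpi_osi=Linux 'acpi_osi=!'; do
    if has_param "$cmdline" "$p"; then
        echo "present: $p"
    else
        echo "missing: $p"
    fi
done
```

Any line reported as missing suggests the GRUB change has not been applied yet.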
2. Move the nvidia library directories into /etc/ld.so.conf.d/nvidia.conf
The contents of nvidia.conf are:
/usr/lib/nvidia-390
/usr/lib32/nvidia-390
These directories depend on the driver version installed on your machine.
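Since the two paths differ only in the driver version, the file contents can be generated rather than typed by hand. This is a minimal sketch; nvidia_conf_body is a hypothetical helper name, and it assumes the /usr/lib/nvidia-VERSION and /usr/lib32/nvidia-VERSION layout shown above.

```shell
#!/bin/sh
# nvidia_conf_body VERSION: print the ld.so.conf.d entries for a driver
# version such as 390. (Hypothetical helper for illustration.)
nvidia_conf_body() {
    printf '/usr/lib/nvidia-%s\n/usr/lib32/nvidia-%s\n' "$1" "$1"
}

# Preview the file contents. To install them, run as root:
#   nvidia_conf_body 390 > /etc/ld.so.conf.d/nvidia.conf && ldconfig
nvidia_conf_body 390
```

Check that both directories exist on your system before writing the file.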
3. Create /etc/init.d/nvidia
This script disables and enables the nvidia runtime libraries.
#!/bin/sh
### BEGIN INIT INFO
# Provides: nvidia
# Required-Start: $all
# Required-Stop: $all
# Default-Start: 5
# Default-Stop: 0 6
# Short-Description: load/unload nvidia library
# Description: load/unload nvidia library
### END INIT INFO
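# Nothing to do when the NVIDIA GPU is already selected as the primary GPU.
PRIME=$(prime-select query)
if [ "$PRIME" = "nvidia" ]; then
    exit 0
fi
case "$1" in
    start)
        # After a short delay, restore the NVIDIA library paths and refresh the linker cache.
        sleep 10
        cd /etc/ld.so.conf.d
        mv nvidia.conf.bak nvidia.conf
        ldconfig
        nvidia-smi
        ;;
    stop)
        # Hide the NVIDIA library paths from the linker cache before shutdown or reboot.
        cd /etc/ld.so.conf.d
        mv nvidia.conf nvidia.conf.bak
        ldconfig
        ;;
esac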
4. Run update-rc.d nvidia defaults
You should find SXXnvidia in /etc/rc5.d/ and KXXnvidia in /etc/rc0.d/ and /etc/rc6.d/.
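The links can be checked without reading each directory by eye. The sketch below assumes the XX in SXXnvidia and KXXnvidia is a two-digit sequence number; is_rc_link is a hypothetical helper name.

```shell
#!/bin/sh
# is_rc_link NAME: succeed if NAME looks like S<NN>nvidia or K<NN>nvidia.
# (Hypothetical helper; assumes two-digit sequence numbers.)
is_rc_link() {
    case "$1" in
        [SK][0-9][0-9]nvidia) return 0 ;;
        *) return 1 ;;
    esac
}

# Report matching links in the runlevel directories named above.
for d in /etc/rc5.d /etc/rc0.d /etc/rc6.d; do
    for f in "$d"/*; do
        name=${f##*/}
        if is_rc_link "$name"; then
            echo "$d: $name"
        fi
    done
done
```

If a directory produces no output, update-rc.d did not create a link there.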
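Try running /etc/init.d/nvidia stop followed by nvidia-smi; you should see an error saying the library was not found.
Then run /etc/init.d/nvidia start, and nvidia-smi should work again.
If everything works, you can now reboot. You should be able to log in to the desktop.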
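The stop/start test can also be observed in the linker cache directly, which helps when nvidia-smi fails for some other reason. This sketch assumes nvidia-smi depends on libnvidia-ml.so (the NVML library); cache_has_lib is a hypothetical helper name.

```shell
#!/bin/sh
# cache_has_lib NAME: read `ldconfig -p` output on stdin and succeed if a
# library whose name contains NAME is listed. (Hypothetical helper.)
cache_has_lib() {
    grep -qF -- "$1"
}

# Expected: "visible" after `/etc/init.d/nvidia start`,
# "hidden" after `/etc/init.d/nvidia stop`.
if ldconfig -p 2>/dev/null | cache_has_lib 'libnvidia-ml.so'; then
    echo "libnvidia-ml.so: visible"
else
    echo "libnvidia-ml.so: hidden"
fi
```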
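5. If anything goes wrong
The most likely problem is that the nvidia script was not executed. If that happens, press Ctrl+Alt+F1 to switch to a tty and run /etc/init.d/nvidia stop; reboot. You can then get back into the Unity desktop to debug.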
6. Known side effects
When Intel is used as the primary GPU, unity-control-center (System Settings) fails to start, with this error:
GLib-CRITICAL **: g_strsplit: assertion `string != NULL' failed.
Note: my system specs
# uname -r
4.13.0-32-generic
# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.3 LTS
Release: 16.04
Codename: xenial
# dpkg -l | grep cuda
ii cuda-9-0 9.0.176-1 amd64 CUDA 9.0 meta-package
ii cuda-command-line-tools-9-0 9.0.176-1 amd64 CUDA command-line tools
ii cuda-core-9-0 9.0.176-1 amd64 CUDA core tools
ii cuda-cublas-9-0 9.0.176.1-1 amd64 CUBLAS native runtime libraries
ii cuda-cublas-dev-9-0 9.0.176.1-1 amd64 CUBLAS native dev links, headers
ii cuda-cudart-9-0 9.0.176-1 amd64 CUDA Runtime native Libraries
ii cuda-cudart-dev-9-0 9.0.176-1 amd64 CUDA Runtime native dev links, headers
ii cuda-cufft-9-0 9.0.176-1 amd64 CUFFT native runtime libraries
ii cuda-cufft-dev-9-0 9.0.176-1 amd64 CUFFT native dev links, headers
ii cuda-curand-9-0 9.0.176-1 amd64 CURAND native runtime libraries
ii cuda-curand-dev-9-0 9.0.176-1 amd64 CURAND native dev links, headers
ii cuda-cusolver-9-0 9.0.176-1 amd64 CUDA solver native runtime libraries
ii cuda-cusolver-dev-9-0 9.0.176-1 amd64 CUDA solver native dev links, headers
ii cuda-cusparse-9-0 9.0.176-1 amd64 CUSPARSE native runtime libraries
ii cuda-cusparse-dev-9-0 9.0.176-1 amd64 CUSPARSE native dev links, headers
ii cuda-demo-suite-9-0 9.0.176-1 amd64 Demo suite for CUDA
ii cuda-documentation-9-0 9.0.176-1 amd64 CUDA documentation
ii cuda-driver-dev-9-0 9.0.176-1 amd64 CUDA Driver native dev stub library
ii cuda-drivers 390.12-1 amd64 CUDA Driver meta-package
ii cuda-libraries-9-0 9.0.176-1 amd64 CUDA Libraries 9.0 meta-package
ii cuda-libraries-dev-9-0 9.0.176-1 amd64 CUDA Libraries 9.0 development meta-package
ii cuda-license-9-0 9.0.176-1 amd64 CUDA licenses
ii cuda-misc-headers-9-0 9.0.176-1 amd64 CUDA miscellaneous headers
ii cuda-npp-9-0 9.0.176-1 amd64 NPP native runtime libraries
ii cuda-npp-dev-9-0 9.0.176-1 amd64 NPP native dev links, headers
ii cuda-nvgraph-9-0 9.0.176-1 amd64 NVGRAPH native runtime libraries
ii cuda-nvgraph-dev-9-0 9.0.176-1 amd64 NVGRAPH native dev links, headers
ii cuda-nvml-dev-9-0 9.0.176-1 amd64 NVML native dev links, headers
ii cuda-nvrtc-9-0 9.0.176-1 amd64 NVRTC native runtime libraries
ii cuda-nvrtc-dev-9-0 9.0.176-1 amd64 NVRTC native dev links, headers
ii cuda-repo-ubuntu1604 9.1.85-1 amd64 cuda repository configuration files
ii cuda-runtime-9-0 9.0.176-1 amd64 CUDA Runtime 9.0 meta-package
ii cuda-samples-9-0 9.0.176-1 amd64 CUDA example applications
ii cuda-toolkit-9-0 9.0.176-1 amd64 CUDA Toolkit 9.0 meta-package
ii cuda-visual-tools-9-0 9.0.176-1 amd64 CUDA visual tools
ii libcuda1-390 390.12-0ubuntu1 amd64 NVIDIA CUDA runtime library
ii libcudnn7 7.0.5.15-1+cuda9.0 amd64 cuDNN runtime libraries
ii libcudnn7-dev 7.0.5.15-1+cuda9.0 amd64 cuDNN development libraries and headers
# dpkg -l | grep nvidia
ii nvidia-390 390.12-0ubuntu1 amd64 NVIDIA binary driver - version 390.12
ii nvidia-390-dev 390.12-0ubuntu1 amd64 NVIDIA binary Xorg driver development files
ii nvidia-modprobe 390.12-0ubuntu1 amd64 Load the NVIDIA kernel driver and create device files
ii nvidia-opencl-icd-390 390.12-0ubuntu1 amd64 NVIDIA OpenCL ICD
ii nvidia-prime 0.8.2 amd64 Tools to enable NVIDIA's Prime
ii nvidia-settings 390.12-0ubuntu1 amd64 Tool for configuring the NVIDIA graphics driver
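Answer 2
You should be able to get CUDA working by following this answer by 平楚雄. If you still run into the login loop afterwards, there are some highly rated answers here that should help you resolve it.
Note: as with most things in life, if you try to install multiple versions or an installation fails, the Nvidia drivers can leave a pile of junk behind. You may need to purge everything and then reinstall the version that worked for you in the past to get the desired result.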
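Before purging, it helps to see exactly which Nvidia packages are installed. The sketch below only lists them; the purge command in the comment is one common way to remove them and is an assumption, not something this answer prescribes. list_nvidia_pkgs is a hypothetical helper name.

```shell
#!/bin/sh
# list_nvidia_pkgs: read `dpkg -l` output on stdin and print the names of
# installed ("ii") packages whose name starts with "nvidia-".
# (Hypothetical helper for illustration.)
list_nvidia_pkgs() {
    awk '$1 == "ii" && $2 ~ /^nvidia-/ { print $2 }'
}

# List what is currently installed.
dpkg -l 2>/dev/null | list_nvidia_pkgs

# To remove them (destructive; review the list first):
#   sudo apt-get purge $(dpkg -l | list_nvidia_pkgs)
```

The filter matches only package names beginning with nvidia-, so packages such as libcuda1-390 or the cuda-* set would need to be reviewed separately.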