I was trying to install the CUDA toolkit, so I went to the recommended developer.nvidia.com and installed CUDA Toolkit 10.2 using the following deb (network) instructions:
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo add-apt-repository "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"
sudo apt-get update
sudo apt-get -y install cuda
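Somewhere along the way I must have messed up by not disabling Secure Boot, because the installer prompted me for a password that would be needed later. After the reboot, the system never asked me for this password.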
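Since nvidia-smi returned "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver.", I tried a suggestion and blacklisted nvidiafb in /etc/modprobe.d/blacklist-framebuffer.conf. After running sudo update-initramfs -u, my boot screen froze on the next reboot. I managed to get it working again without the blacklist entry, and I am now using the open-source 440 driver.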
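For context, a password prompt during a driver install on Ubuntu is usually the Machine Owner Key (MOK) enrollment step used to sign third-party kernel modules under Secure Boot. If the enrollment screen never appears at reboot, the module may be refused by the kernel. That is an assumption about the cause here, not a confirmed diagnosis. A minimal sketch for checking both the Secure Boot state and whether the nvidia module is loaded (the helper name nvidia_module_loaded is made up for illustration):

```shell
#!/usr/bin/env bash
# Sketch: check Secure Boot state and whether the NVIDIA kernel module is loaded.

# Reads `lsmod` output on stdin; succeeds only if a module named exactly
# "nvidia" is listed (hypothetical helper, not part of any package).
nvidia_module_loaded() {
    awk '$1 == "nvidia" { found = 1 } END { exit !found }'
}

# mokutil may not be installed, so only call it when present.
if command -v mokutil >/dev/null 2>&1; then
    mokutil --sb-state || true
else
    echo "mokutil not installed; cannot query Secure Boot state"
fi

if command -v lsmod >/dev/null 2>&1 && lsmod | nvidia_module_loaded; then
    echo "nvidia kernel module is loaded"
else
    echo "nvidia kernel module is NOT loaded"
fi
```

If Secure Boot is enabled and the module is not loaded, that combination would be consistent with the nvidia-smi error above.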
I am on Ubuntu 18.04 with a GeForce GTX 950M.
Rather stupidly, I then ran sudo apt install nvidia-cuda-toolkit, which installed CUDA 9.1.85.
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85
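With two toolkits involved, it can help to see which nvcc is actually on the PATH and which release it reports. The sketch below assumes the usual layout, where Ubuntu's nvidia-cuda-toolkit package provides /usr/bin/nvcc and NVIDIA's repository packages install under /usr/local/cuda-10.2; the function name nvcc_release is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: report which nvcc is first on PATH and the release it claims.

# Reads `nvcc --version` output on stdin and prints the release number.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
    echo "nvcc on PATH: $(command -v nvcc)"
    echo "release: $(nvcc --version | nvcc_release)"
else
    echo "nvcc not found on PATH"
fi
```

Fed the output shown above, nvcc_release would print 9.1, which matches the Ubuntu package rather than the 10.2 install.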
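Now I am trying to uninstall CUDA 10.2, but it fails because of some shared dependencies (I do not remember the exact message). I tried several suggestions: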
sudo apt-get remove cuda-10.2
sudo apt --fix-broken install
sudo apt-get --purge remove cuda-10.2
sudo apt-get remove --dry-run cuda-10.2
sudo apt-get autoclean
sudo apt-get autoremove
sudo apt --fix-broken install
sudo apt-get -o Dpkg::Options::="--force-overwrite" install --fix-broken
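In the end, I cannot remove any CUDA packages at all because of a missing package. I could not find any information about this problem, hence the question here.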
dan@dann:~$ sudo apt-get --purge autoremove cuda*
[sudo] password for dan:
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package cuda-workspace
Edit: I should add that I have a folder named "cuda-workspace" containing a ".metadata" folder. I believe it was created when I installed CUDA 10.2.
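One possible explanation, offered as an assumption rather than a confirmed diagnosis: the command was run from the home directory, so the shell may have expanded the unquoted cuda* to the name of that folder before apt-get ever saw it. The sketch below reproduces the expansion in a throwaway directory (demo_dir is a made-up name) without touching apt at all.

```shell
#!/usr/bin/env bash
# Sketch: how an unquoted cuda* behaves when a matching entry exists
# in the current directory.

demo_dir=$(mktemp -d)
mkdir "$demo_dir/cuda-workspace"

# Unquoted: the shell expands the glob before the command runs.
unquoted=$(cd "$demo_dir" && echo cuda*)
# Quoted: the pattern is passed through literally.
quoted=$(cd "$demo_dir" && echo 'cuda*')

echo "unquoted: $unquoted"   # unquoted: cuda-workspace
echo "quoted:   $quoted"     # quoted:   cuda*

rm -rf "$demo_dir"
```

If this is what happened, quoting the pattern ('cuda*') or running the command from a directory with no matching entries would let apt-get receive the pattern itself.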
dan@dann:~$ whereis cuda-workspace
cuda-workspace:
dan@dann:~$ dpkg -S /home/dan/cuda-workspace
dpkg-query: no path found matching pattern /home/dan/cuda-workspace
dan@dann:~$ dpkg -l | grep -i cuda
rc cuda-cudart-10-2 10.2.89-1 amd64 CUDA Runtime native Libraries
rc cuda-cudart-dev-10-2 10.2.89-1 amd64 CUDA Runtime native dev links, headers
rc cuda-cufft-10-2 10.2.89-1 amd64 CUFFT native runtime libraries
rc cuda-cupti-10-2 10.2.89-1 amd64 CUDA profiling tools runtime libs.
rc cuda-curand-10-2 10.2.89-1 amd64 CURAND native runtime libraries
rc cuda-cusolver-10-2 10.2.89-1 amd64 CUDA solver native runtime libraries
rc cuda-cusparse-10-2 10.2.89-1 amd64 CUSPARSE native runtime libraries
rc cuda-npp-10-2 10.2.89-1 amd64 NPP native runtime libraries
rc cuda-nvcc-10-2 10.2.89-1 amd64 CUDA nvcc
rc cuda-nvgraph-10-2 10.2.89-1 amd64 NVGRAPH native runtime libraries
rc cuda-nvjpeg-10-2 10.2.89-1 amd64 NVJPEG native runtime libraries
rc cuda-nvprof-10-2 10.2.89-1 amd64 CUDA Profiler tools
rc cuda-nvrtc-10-2 10.2.89-1 amd64 NVRTC native runtime libraries
rc cuda-nvtx-10-2 10.2.89-1 amd64 NVIDIA Tools Extension
rc cuda-sanitizer-api-10-2 10.2.89-1 amd64 CUDA Sanitizer API
rc cuda-toolkit-10-2 10.2.89-1 amd64 CUDA Toolkit 10.2 meta-package
rc cuda-visual-tools-10-2 10.2.89-1 amd64 CUDA visual tools
ii libcudart9.1:amd64 9.1.85-3ubuntu1 amd64 NVIDIA CUDA Runtime Library
ii libnvrtc9.1:amd64 9.1.85-3ubuntu1 amd64 CUDA Runtime Compilation (NVIDIA NVRTC Library)
ii nvidia-cuda-dev 9.1.85-3ubuntu1 amd64 NVIDIA CUDA development files
ii nvidia-cuda-doc 9.1.85-3ubuntu1 all NVIDIA CUDA and OpenCL documentation
ii nvidia-cuda-gdb 9.1.85-3ubuntu1 amd64 NVIDIA CUDA Debugger (GDB)
ii nvidia-cuda-toolkit 9.1.85-3ubuntu1 amd64 NVIDIA CUDA development toolkit
ii nvidia-profiler 9.1.85-3ubuntu1 amd64 NVIDIA Profiler for CUDA and OpenCL
ii nvidia-visual-profiler 9.1.85-3ubuntu1 amd64 NVIDIA Visual Profiler for CUDA and OpenCL
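In this listing, the rc status means a package has been removed but its configuration files are still present, while ii means it is fully installed. So the 10.2 packages are already gone apart from leftover configuration, and the 9.1 packages from the Ubuntu archive are the ones currently installed. A minimal sketch for listing the residual entries (the function name rc_packages is hypothetical):

```shell
#!/usr/bin/env bash
# Sketch: list packages left in the "rc" state (removed, config files remain).

# Reads `dpkg -l` output on stdin and prints the names of rc packages.
rc_packages() {
    awk '$1 == "rc" { print $2 }'
}

# Listing only: nothing is changed on the system.
if command -v dpkg >/dev/null 2>&1; then
    dpkg -l | rc_packages | grep -i cuda || echo "no residual CUDA packages found"
else
    echo "dpkg not available on this system"
fi
```

Once the list looks right, the names could be piped into xargs -r sudo dpkg --purge to drop the leftover configuration; that step is deliberately left out of the sketch because it modifies the system.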
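Ultimately, I want to keep the driver and have a properly installed CUDA, confirm that everything works with nvcc --version and nvidia-smi, and try using PyTorch with CUDA (torch.cuda.is_available() still returns False).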
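The three checks mentioned above can be run together. This is only a sketch: report_tool is a made-up helper, and the script assumes python3 is the interpreter PyTorch was installed into.

```shell
#!/usr/bin/env bash
# Sketch: run the three verification steps in one pass.

# Prints "<name>: found" or "<name>: missing" depending on PATH lookup.
report_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

report_tool nvcc
report_tool nvidia-smi

# Ask PyTorch only if python3 can import torch; otherwise say so.
if command -v python3 >/dev/null 2>&1 && python3 -c 'import torch' >/dev/null 2>&1; then
    python3 -c 'import torch; print("torch.cuda.is_available():", torch.cuda.is_available())'
else
    echo "python3 with torch not available"
fi
```

Note that torch.cuda.is_available() needs a working NVIDIA driver and a PyTorch build with CUDA support, so it will stay False for as long as nvidia-smi cannot talk to the driver, whatever nvcc reports.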
I am a very obvious (and stupid) beginner... any help would be much appreciated! ;)