My computer (Ubuntu 18.04 LTS) has the following CPU and NVIDIA graphics card:
Intel i5 9600K
NVIDIA GeForce RTX2070
I installed CUDA and the NVIDIA driver as follows:
sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
sudo apt update
sudo apt install cuda cuda-drivers
sudo reboot
Then I ran:
nvidia-smi
NVIDIA-SMI couldn't find libnvidia-ml.so library in your system. Please make sure that the NVIDIA Display Driver is properly installed and present in your system.
Please also try adding directory that contains libnvidia-ml.so to your system PATH.
Then I looked for libnvidia-ml.so:
ls /usr/lib/nvidia
pre-install
Any suggestions for getting this to work? Where does nvidia-smi look for libnvidia-ml.so?
Answer 1
I solved the problem. First, remove all CUDA and NVIDIA driver packages:
sudo apt-get --purge remove "nvidia-*"
sudo apt-get --purge remove "cuda-*"
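Before reinstalling, it may help to confirm that nothing was left behind by the purge. This is only a minimal sketch: the helper name leftover_gpu_packages is made up for illustration, and it assumes the usual `dpkg -l` layout, where the first column is the status ("ii" means installed) and the second is the package name.

```shell
# Hypothetical helper: read `dpkg -l`-style lines on stdin and print the names of
# packages that are still installed ("ii") and whose name starts with nvidia- or cuda-.
leftover_gpu_packages() {
    awk '$1 == "ii" && $2 ~ /^(nvidia-|cuda-)/ { print $2 }'
}

# Only query dpkg where it exists (Debian/Ubuntu systems).
if command -v dpkg >/dev/null 2>&1; then
    dpkg -l | leftover_gpu_packages
fi
```

If this prints nothing, no installed package matching those two prefixes remains. Packages with other prefixes (for example libnvidia-*) are not covered by this sketch.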
Then install the driver automatically:
sudo ubuntu-drivers autoinstall
sudo reboot
Check that the driver was installed successfully:
nvidia-smi
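As an extra check, you can ask the dynamic linker whether it knows about libnvidia-ml.so, the library named in the original error message. A rough sketch, assuming `ldconfig -p` is available to list the linker cache; the function name lib_in_ldcache is hypothetical.

```shell
# Hypothetical helper: succeed if the dynamic linker cache lists a library
# whose entry contains the given name.
lib_in_ldcache() {
    # ldconfig usually lives in /sbin, which may not be on a normal user's PATH
    ldc=$(command -v ldconfig || echo /sbin/ldconfig)
    "$ldc" -p 2>/dev/null | grep -F -- "$1" >/dev/null
}

if lib_in_ldcache libnvidia-ml.so; then
    echo "libnvidia-ml.so is visible to the dynamic linker"
else
    echo "libnvidia-ml.so was not found in the linker cache"
fi
```

If the library is reported as missing even though the driver packages are installed, that points at the driver installation or the linker cache rather than at nvidia-smi itself.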
Then install CUDA 10.0 (I will be using tensorflow=1.13.1):
sudo apt install nvidia-driver-418
sudo apt-get install cuda-10.0
Install cuDNN:
echo "deb https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 /" | sudo tee /etc/apt/sources.list.d/nvidia-ml.list
sudo apt update
sudo apt install libcudnn7-dev=7.5.0.56-1+cuda10.0
Set up the paths:
sudo cp -a cuda/lib64/* /usr/lib/cuda/lib64/
sudo cp -a cuda/include/* /usr/lib/cuda/include/
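To double-check which cuDNN version ended up in place, one option is to read the version macros from cudnn.h. This is a sketch under assumptions: the helper name cudnn_version is made up, it expects the cuDNN 7 style header that defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL, and the two header paths in the loop are guesses that may differ on your system.

```shell
# Hypothetical helper: print MAJOR.MINOR.PATCHLEVEL from a cuDNN 7 style cudnn.h.
cudnn_version() {
    awk '
        /^#define CUDNN_MAJOR/      { maj = $3 }
        /^#define CUDNN_MINOR/      { min = $3 }
        /^#define CUDNN_PATCHLEVEL/ { pat = $3 }
        END { if (maj != "") print maj "." min "." pat }
    ' "$1"
}

# Assumed header locations; adjust to wherever cudnn.h lives on your machine.
for header in /usr/include/cudnn.h /usr/lib/cuda/include/cudnn.h; do
    if [ -f "$header" ]; then
        echo "$header: cuDNN $(cudnn_version "$header")"
    fi
done
```

With the libcudnn7-dev package pinned above, the reported version should start with 7.5.0.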