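I have spent several days on this problem and don't know how to proceed.
I have an ASUS Zenbook UX303UB that dual-boots Windows 10 and Ubuntu 16.04. It has a dedicated NVIDIA GeForce 940M GPU with 2GB of memory. I mainly use Ubuntu for programming. I want to try some deep learning tools such as tensorflow and theano, and for that I first need CUDA. CUDA 8.0rc appears to be the only release that officially supports Ubuntu 16.04.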
First, I went to the NVIDIA website and downloaded the CUDA 8.0 runfile. I followed their instructions and ran the installer from TTY1, which included blacklisting nouveau and adding
export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
to the end of my .bashrc file.
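One quick way to confirm that the two .bashrc lines took effect in the current shell is to look for the CUDA directories in each variable. The sketch below is only an illustration: the helper names `path_contains` and `report` are made up for this example, and the directories are the ones quoted in the post.

```shell
# Return 0 if the colon-separated list $1 contains the directory $2.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print whether the variable named $1 (with value $2) includes directory $3.
report() {
    if path_contains "$2" "$3"; then
        echo "$1 contains $3"
    else
        echo "$1 is missing $3"
    fi
}

# Directories taken from the export lines in the post.
report PATH "${PATH:-}" /usr/local/cuda-8.0/bin
report LD_LIBRARY_PATH "${LD_LIBRARY_PATH:-}" /usr/local/cuda-8.0/lib64
```

If either line reports "is missing", the .bashrc change has probably not been sourced in that shell yet (for example, in a TTY session opened before the edit).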
When I tried one of the test samples provided by NVIDIA,
$ cd NVIDIA_CUDA-8.0_Samples/5_Simulations/nbody
$ make
the output was:
>>> WARNING - libGLU.so not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
>>> WARNING - gl.h not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
>>> WARNING - glu.h not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
[@] /usr/local/cuda-8.0/bin/nvcc -ccbin g++ -I../../common/inc -m64 -ftz=true -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_60,code=compute_60 -o bodysystemcuda.o -c bodysystemcuda.cu
[@] /usr/local/cuda-8.0/bin/nvcc -ccbin g++ -I../../common/inc -m64 -ftz=true -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_60,code=compute_60 -o nbody.o -c nbody.cpp
[@] /usr/local/cuda-8.0/bin/nvcc -ccbin g++ -I../../common/inc -m64 -ftz=true -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_60,code=compute_60 -o render_particles.o -c render_particles.cpp
[@] /usr/local/cuda-8.0/bin/nvcc -ccbin g++ -m64 -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_60,code=compute_60 -o nbody bodysystemcuda.o nbody.o render_particles.o -L/usr/lib/nvidia-361 -lGL -lGLU -lX11 -lglut
[@] mkdir -p ../../bin/x86_64/linux/release
[@] cp nbody ../../bin/x86_64/linux/release
When I then ran
$ ./nbody -benchmark -numbodies=256000 -device=0
I got
bash: ./nbody: No such file or directory
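A possible reading of the output above, offered as an assumption rather than a confirmed diagnosis: the CUDA sample Makefiles appear to prefix commands with `[@]` when they only echo them instead of running them, which happens when a required dependency (here the OpenGL libraries and headers named in the warnings) is missing. That would explain why no `nbody` binary exists. A minimal sketch to check whether the build actually produced anything, using a hypothetical helper `is_built`:

```shell
# Return 0 if $1 exists as a regular file and is executable.
is_built() {
    [ -f "$1" ] && [ -x "$1" ]
}

# Both locations appear in the make output quoted in the post.
for candidate in ./nbody ../../bin/x86_64/linux/release/nbody; do
    if is_built "$candidate"; then
        echo "built: $candidate"
    else
        echo "missing: $candidate"
    fi
done
```

If both paths are reported missing, the compile and link steps were most likely never executed.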
While diagnosing this, I noticed two things:
1) nvidia-smi does not work (nvidia-smi: command not found).
2) Running cat /proc/driver/nvidia/version gives cat: /proc/driver/nvidia/version: No such file or directory
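Both symptoms are consistent with the NVIDIA kernel module not being loaded. A rough sketch for checking that directly by reading /proc/modules follows. The helper name `module_loaded` is hypothetical, and the prefix match is an assumption made because Ubuntu's driver packages of that era may name the module with a version suffix (for example nvidia_367) rather than plain nvidia.

```shell
# Return 0 if any module whose name starts with $1 is listed in the file $2
# (defaults to /proc/modules, where the first field is the module name).
module_loaded() {
    awk -v m="$1" 'index($1, m) == 1 { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}" 2>/dev/null
}

if module_loaded nvidia; then
    echo "an nvidia kernel module is loaded"
elif module_loaded nouveau; then
    echo "nouveau is loaded instead of nvidia"
else
    echo "neither nvidia nor nouveau is loaded"
fi
```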
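I then decided that something might be wrong with the NVIDIA driver, so I followed the question "Nvidia graphics driver and CUDA problems after apt-get upgrade".
Basically: purge all nvidia drivers, stop lightdm and switch to runlevel 3, then install the NVIDIA driver runfile from the NVIDIA website.
However, the installation hit an error and aborted.
I then rebooted, purged all nvidia drivers again, and ran sudo apt-get install nvidia-367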
Now I am back to square one. Further diagnostic information follows:
$ sudo nvidia-modprobe
sudo: nvidia-modprobe: command not found
$ uname -r
4.4.0-36-generic
$ dpkg -l | grep ii | grep -i linux-headers
ii linux-headers-4.4.0-31 4.4.0-31.50 all Header files related to Linux kernel version 4.4.0
ii linux-headers-4.4.0-31-generic 4.4.0-31.50 amd64 Linux kernel headers for version 4.4.0 on 64 bit x86 SMP
ii linux-headers-4.4.0-34 4.4.0-34.53 all Header files related to Linux kernel version 4.4.0
ii linux-headers-4.4.0-34-generic 4.4.0-34.53 amd64 Linux kernel headers for version 4.4.0 on 64 bit x86 SMP
ii linux-headers-4.4.0-36 4.4.0-36.55 all Header files related to Linux kernel version 4.4.0
ii linux-headers-4.4.0-36-generic 4.4.0-36.55 amd64 Linux kernel headers for version 4.4.0 on 64 bit x86 SMP
ii linux-headers-generic 4.4.0.36.38 amd64 Generic Linux kernel headers
$ dpkg -l | grep -i nvidia
ii bbswitch-dkms 0.8-3ubuntu1 amd64 Interface for toggling the power on NVIDIA Optimus video cards
ii libcuda1-367 367.44-0ubuntu0~gpu16.04.1 amd64 NVIDIA CUDA runtime library
ii nvidia-367 367.44-0ubuntu0~gpu16.04.1 amd64 NVIDIA binary driver - version 367.44
ii nvidia-opencl-icd-367 367.44-0ubuntu0~gpu16.04.1 amd64 NVIDIA OpenCL ICD
ii nvidia-prime 0.8.2 amd64 Tools to enable NVIDIA's Prime
ii nvidia-settings 370.23-0ubuntu0~gpu16.04.1 amd64 Tool for configuring the NVIDIA graphics driver
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Wed_May__4_21:01:56_CDT_2016
Cuda compilation tools, release 8.0, V8.0.26
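The listings above include headers for the running kernel (4.4.0-36-generic), which a kernel module build generally needs. As a sketch of how that comparison could be automated, the snippet below scans `dpkg -l` output for a matching headers package. The function name `has_headers_for` is made up for this example, and it assumes `dpkg -l` prints full package names when piped, as it does in the output quoted in the post.

```shell
# Read `dpkg -l` output on stdin; return 0 if an installed ("ii") package
# named linux-headers-$1 is present, where $1 is a kernel release string.
has_headers_for() {
    awk -v pkg="linux-headers-$1" \
        '$1 == "ii" && $2 == pkg { ok = 1 } END { exit !ok }'
}

kernel=$(uname -r)
if command -v dpkg >/dev/null 2>&1 && dpkg -l | has_headers_for "$kernel"; then
    echo "headers installed for $kernel"
else
    echo "no headers found for $kernel"
fi
```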
Any help would be greatly appreciated, as I have come close to breaking Ubuntu several times!
Answer 1
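I solved this by following this tutorial: http://kislayabhi.github.io/Installing_CUDA_with_Ubuntu/
Follow the steps roughly as written, but download and install the latest driver from the NVIDIA website instead of nvidia-367, which the tutorial recommends.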