Unable to use the NVIDIA GPU for computer vision tasks on Ubuntu 18.04 (GeForce GTX 1060)

2024-6-10

My machine is an ASUS ROG SCAR-GL703GM-EE033T with an NVIDIA GeForce GTX 1060. I installed Ubuntu 18.04 on it, along with the software needed to run deep learning applications on its GPU.

I installed:

CUDA 10.0
cuDNN 7.5.0
tensorflow-gpu 1.13.1

Everything seemed fine. For example, when I run the following in a Python terminal:

import tensorflow
tensorflow.test.is_gpu_available()

it prints the properties of the NVIDIA card followed by "True". So the installation looks correct.
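Beyond the boolean check above, it can help to list the devices TensorFlow actually sees. The following is only a minimal sketch, assuming TensorFlow 1.x, where `device_lib.list_local_devices()` returns entries with `name` and `device_type` fields. The helper names `gpu_device_names` and `list_local_devices` are my own and not part of TensorFlow.

```python
def gpu_device_names(devices):
    """Return the names of all entries whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


def list_local_devices():
    """Ask TensorFlow which devices it can see.

    TensorFlow is imported lazily so that gpu_device_names() can be
    used (and tested) on machines without TensorFlow installed.
    """
    from tensorflow.python.client import device_lib
    return device_lib.list_local_devices()
```

In a Python terminal, `gpu_device_names(list_local_devices())` should then return a non-empty list if TensorFlow can see the GPU. Note that this only says something about TensorFlow; it does not show whether another library is using the GPU.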

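However, when running a computer vision application that uses pytesseract, I was surprised by the long run time, which suggests the GPU is not really being used. To check this, I displayed the output of nvidia-smi in another terminal while the code was running:

[Screenshot: nvidia-smi output while the code is running]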
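Instead of reading a single nvidia-smi snapshot, the same check can be scripted. This is a rough sketch that assumes nvidia-smi is on the PATH and that the GPU reports both fields as plain numbers (some cards print "[Not Supported]", which this parser does not handle). The function names are hypothetical.

```python
import subprocess

# Ask nvidia-smi for GPU utilization (%) and used memory (MiB) as bare CSV.
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_query(text):
    """Parse 'utilization, memory' CSV lines into (util_percent, mem_mib) tuples."""
    rows = []
    for line in text.strip().splitlines():
        util, mem = (field.strip() for field in line.split(","))
        rows.append((int(util), int(mem)))
    return rows


def sample_gpu():
    """Run nvidia-smi once and return one tuple per GPU."""
    out = subprocess.check_output(QUERY, universal_newlines=True)
    return parse_gpu_query(out)
```

Calling `sample_gpu()` in a loop while the application runs gives a simple utilization trace. If utilization stays near zero for the whole run, the workload is most likely executing on the CPU.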

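As you can see, the NVIDIA driver is 418.67. That is not the version recommended on the NVIDIA website (selected here with GeForce / GeForce 10 Series / GeForce GTX 1060 / Linux 64-bit), but my attempt to install driver 430.67 failed with the following message: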

nvidia-installer log file '/var/log/nvidia-installer.log'
creation time: Thu Jul 11 17:56:00 2019
installer version: 430.34

PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin

nvidia-installer command line:
    ./nvidia-installer

Unable to load: nvidia-installer ncurses v6 user interface

Using: nvidia-installer ncurses user interface
-> Detected 12 CPUs online; setting concurrency level to 12.
ERROR: An NVIDIA kernel module 'nvidia-uvm' appears to already be loaded in your kernel.  This may be because it is in use (for example, by an X server, a CUDA program, or the NVIDIA Persistence Daemon), but this may also happen if your kernel was configured without support for module unloading.  Please be sure to exit any programs that may be using the GPU(s) before attempting to upgrade your driver.  If no GPU-based programs are running, you know that your kernel supports module unloading, and you still receive this message, then an error may have occured that has corrupted an NVIDIA kernel module's usage count, for which the simplest remedy is to reboot your computer.
ERROR: Installation has failed.  Please see the file '/var/log/nvidia-installer.log' for details.  You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.
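
The installer error above says the 'nvidia-uvm' kernel module is still loaded. One way to see which NVIDIA modules are loaded, and whether anything holds a reference to them, is to read /proc/modules. This is a minimal sketch, assuming the usual /proc/modules layout where the first field is the module name (written with underscores, e.g. nvidia_uvm) and the third field is the use count. The function names are hypothetical.

```python
def loaded_nvidia_modules(proc_modules_text):
    """Return {module_name: use_count} for modules whose name starts with 'nvidia'."""
    modules = {}
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # Expected layout: name, size, use count, dependents, state, address
        if len(fields) >= 3 and fields[0].startswith("nvidia"):
            modules[fields[0]] = int(fields[2])
    return modules


def read_proc_modules(path="/proc/modules"):
    """Read the kernel's list of loaded modules as text."""
    with open(path) as handle:
        return handle.read()
```

`loaded_nvidia_modules(read_proc_modules())` returns an empty dictionary when no NVIDIA module is loaded. A non-zero use count indicates that something (for example the X server) still depends on that module.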

I then tried driver 418, with which TensorFlow finally recognized the GPU as available, so I kept using it.

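Also, as the screenshot shows, no process is using the GPU. The screenshot also shows CUDA version 10.1, even though I removed every trace of a previous 10.1 installation before installing 10.0, which seemed to work best. Why does nvidia-smi report CUDA version 10.1?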
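
A possible explanation, stated here as an assumption rather than something confirmed in this post: the "CUDA Version" field in the nvidia-smi header reflects the highest CUDA version the installed driver supports, not the toolkit installed on disk, which `nvcc --version` reports. The sketch below extracts both numbers from the text of the two commands so they can be compared side by side. The function names are hypothetical, and the regular expressions assume the usual output wording of both tools.

```python
import re


def driver_cuda_version(nvidia_smi_text):
    """Extract the 'CUDA Version: X.Y' value from nvidia-smi header text."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", nvidia_smi_text)
    return match.group(1) if match else None


def toolkit_cuda_version(nvcc_text):
    """Extract the 'release X.Y' value from `nvcc --version` output."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_text)
    return match.group(1) if match else None
```

Feeding the output of `nvidia-smi` to the first function and the output of `nvcc --version` to the second shows whether the two versions differ. Under the assumption above, a mismatch such as 10.1 versus 10.0 would not by itself mean the old toolkit is still installed.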

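Do you know how, with my current configuration and installation, I can make the GPU permanently available for running my code? Do you see anything missing from my installation?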

Thank you for your help. I have tried to provide as much detail as possible!

Theo