Cannot get the NVidia GPU working on Ubuntu 18.04 (ASUS laptop)

I have tried everything I could find to get Ubuntu 18.04 on an ASUS TUF FX504GE to recognize its GTX 1050Ti. Nothing makes nvidia-smi print what it should.

I have already tried:

My setup:

  • Laptop: ASUS TUF Gaming FX504GE series, GTX 1050Ti

  • Kernel version: 5.3.0-26-generic

  • Ubuntu version: 18.04.03 LTS

  • Software & Updates: [screenshot]

  • About Ubuntu: [screenshot]

  • lshw video output:

$ sudo lshw -c video
  *-display UNCLAIMED       
       description: 3D controller
       product: GP107M [GeForce GTX 1050 Ti Mobile]
       vendor: NVIDIA Corporation
       physical id: 0
       bus info: pci@0000:01:00.0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress bus_master cap_list
       configuration: latency=0
       resources: memory:a3000000-a3ffffff memory:90000000-9fffffff memory:a0000000-a1ffffff ioport:4000(size=128) memory:a4000000-a407ffff
  *-display
       description: VGA compatible controller
       product: Intel Corporation
       vendor: Intel Corporation
       physical id: 2
       bus info: pci@0000:00:02.0
       version: 00
       width: 64 bits
       clock: 33MHz
       capabilities: pciexpress msi pm vga_controller bus_master cap_list rom
       configuration: driver=i915 latency=0
       resources: irq:143 memory:a2000000-a2ffffff memory:80000000-8fffffff ioport:5000(size=64) memory:c0000-dffff
  • Installed NVidia-related packages:
$ apt list --installed | grep -P 'nvidia|cuda'

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

libnvidia-cfg1-415/bionic,now 415.27-0ubuntu0~gpu18.04.2 amd64 [installed,automatic]
libnvidia-common-415/bionic,bionic,now 415.27-0ubuntu0~gpu18.04.2 all [installed,automatic]
libnvidia-compute-415/bionic,now 415.27-0ubuntu0~gpu18.04.2 amd64 [installed,automatic]
libnvidia-decode-415/bionic,now 415.27-0ubuntu0~gpu18.04.2 amd64 [installed,automatic]
libnvidia-encode-415/bionic,now 415.27-0ubuntu0~gpu18.04.2 amd64 [installed,automatic]
libnvidia-fbc1-415/bionic,now 415.27-0ubuntu0~gpu18.04.2 amd64 [installed,automatic]
libnvidia-gl-415/bionic,now 415.27-0ubuntu0~gpu18.04.2 amd64 [installed,automatic]
libnvidia-ifr1-415/bionic,now 415.27-0ubuntu0~gpu18.04.2 amd64 [installed,automatic]
nvidia-compute-utils-415/bionic,now 415.27-0ubuntu0~gpu18.04.2 amd64 [installed,automatic]
nvidia-dkms-415/bionic,now 415.27-0ubuntu0~gpu18.04.2 amd64 [installed,automatic]
nvidia-driver-415/bionic,now 415.27-0ubuntu0~gpu18.04.2 amd64 [installed]
nvidia-kernel-common-415/bionic,now 415.27-0ubuntu0~gpu18.04.2 amd64 [installed,automatic]
nvidia-kernel-source-415/bionic,now 415.27-0ubuntu0~gpu18.04.2 amd64 [installed,automatic]
nvidia-prime/bionic-updates,bionic-updates,now 0.8.8.2 all [installed,automatic]
nvidia-settings/unknown,now 440.33.01-0ubuntu1 amd64 [installed,automatic]
nvidia-utils-415/bionic,now 415.27-0ubuntu0~gpu18.04.2 amd64 [installed,automatic]
xserver-xorg-video-nvidia-415/bionic,now 415.27-0ubuntu0~gpu18.04.2 amd64 [installed,automatic]
  • Xorg log entries mentioning NVidia:
$ cat /var/log/Xorg.0.log | grep -i nvidia
[    13.485] (**) OutputClass "nvidia" ModulePath extended to "/usr/lib/x86_64-linux-gnu/nvidia/xorg,/usr/lib/xorg/modules"
[    13.485] (**) OutputClass "Nvidia Prime" ModulePath extended to "/x86_64-linux-gnu/nvidia/xorg,/usr/lib/x86_64-linux-gnu/nvidia/xorg,/usr/lib/xorg/modules"
[    13.485] (**) OutputClass "Nvidia Prime" setting /dev/dri/card1 as PrimaryGPU
[    13.493] (II) Applying OutputClass "nvidia" to /dev/dri/card1
[    13.493]    loading driver: nvidia
[    13.493] (II) Applying OutputClass "Nvidia Prime" to /dev/dri/card1
[    13.493]    loading driver: nvidia
[    13.729] (==) Matched nvidia as autoconfigured driver 0
[    13.729] (II) LoadModule: "nvidia"
[    13.729] (II) Loading /usr/lib/x86_64-linux-gnu/nvidia/xorg/nvidia_drv.so
[    13.740] (II) Module nvidia: vendor="NVIDIA Corporation"
[    13.748] (II) NVIDIA dlloader X Driver  440.26  Sun Oct 13 17:46:52 UTC 2019
[    13.748] (II) NVIDIA Unified Driver for all Supported NVIDIA GPUs
[    13.748] (II) NOUVEAU driver for NVIDIA chipset families :
[    13.759] (II) NVIDIA(0): Creating default Display subsection in Screen section
[    13.759] (==) NVIDIA(0): Depth 24, (==) framebuffer bpp 32
[    13.760] (==) NVIDIA(0): RGB weight 888
[    13.760] (==) NVIDIA(0): Default visual is TrueColor
[    13.760] (==) NVIDIA(0): Using gamma correction (1.0, 1.0, 1.0)
[    13.760] (II) Applying OutputClass "nvidia" options to /dev/dri/card1
[    13.760] (II) Applying OutputClass "Nvidia Prime" options to /dev/dri/card1
[    13.760] (**) NVIDIA(0): Option "AllowEmptyInitialConfiguration"
[    13.760] (**) NVIDIA(0): Option "IgnoreDisplayDevices" "CRT"
[    13.760] (**) NVIDIA(0): Enabling 2D acceleration
[    13.761] (II) Loading sub module "glxserver_nvidia"
[    13.761] (II) LoadModule: "glxserver_nvidia"
[    13.761] (II) Loading /usr/lib/x86_64-linux-gnu/nvidia/xorg/libglxserver_nvidia.so
[    13.819] (II) Module glxserver_nvidia: vendor="NVIDIA Corporation"
[    13.819] (II) NVIDIA GLX Module  440.26  Sun Oct 13 17:44:48 UTC 2019
[    13.821] (II) NVIDIA: The X server does not support PRIME Render Offload.
[    13.825] (II) NVIDIA(0): NVIDIA GPU GeForce GTX 1050 Ti (GP107-A) at PCI:1:0:0 (GPU-0)
[    13.825] (--) NVIDIA(0): Memory: 4194304 kBytes
[    13.825] (--) NVIDIA(0): VideoBIOS: 86.07.50.00.59
[    13.825] (II) NVIDIA(0): Detected PCI Express Link width: 16X
[    13.825] (II) NVIDIA(0): Validated MetaModes:
[    13.825] (II) NVIDIA(0):     "NULL"
[    13.825] (II) NVIDIA(0): Virtual screen size determined to be 640 x 480
[    13.825] (WW) NVIDIA(0): Unable to get display device for DPI computation.
[    13.825] (==) NVIDIA(0): DPI set to (75, 75); computed from built-in default
[    13.980] (II) NVIDIA: Using 24576.00 MB of virtual memory for indirect memory
[    13.980] (II) NVIDIA:     access.
[    13.997] (II) NVIDIA(0): Setting mode "NULL"
[    13.999] (==) NVIDIA(0): Disabling shared memory pixmaps
[    13.999] (==) NVIDIA(0): Backing store enabled
[    13.999] (==) NVIDIA(0): Silken mouse enabled
[    13.999] (==) NVIDIA(0): DPMS enabled
[    13.999] (WW) NVIDIA(0): Option "PrimaryGPU" is not used
[    13.999] (II) NVIDIA(0): [DRI2] Setup complete
[    13.999] (II) NVIDIA(0): [DRI2]   VDPAU driver: nvidia
[  1363.870] (II) NVIDIA(0): Setting mode "NULL"
[  3115.004] (II) NVIDIA(GPU-0): Deleting GPU-0
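In the lshw output above, the NVIDIA controller is marked UNCLAIMED, meaning no kernel driver is bound to it, while the Intel entry reports driver=i915. As a minimal sketch (the function name `unclaimed_displays` is hypothetical, not part of lshw), the same check can be scripted by filtering the lshw text:

```shell
# Hypothetical helper: reads `lshw -c video` output on stdin and prints the
# "product:" value of every display section that lshw marks as UNCLAIMED
# (no kernel driver bound to that device).
unclaimed_displays() {
    awk '
        /^[ \t]*\*-/ { unclaimed = ($0 ~ /UNCLAIMED/) }  # start of a new section
        unclaimed && /^[ \t]*product:/ {
            sub(/^[ \t]*product:[ \t]*/, "")
            print
        }
    '
}

# Usage (on the affected machine):
#   sudo lshw -c video | unclaimed_displays
```

An empty result would suggest that every display device has a driver bound; any printed product name is a device the kernel is not driving.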

Update

For the 1000th time I ran sudo apt purge *nvidia*, followed by sudo apt install nvidia-driver-440. On a friend's advice, I also installed Bumblebee.

After this, nvidia-smi finally shows something:

Mon Jan 27 13:22:32 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.48.02    Driver Version: 440.48.02    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 105...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   46C    P8    N/A /  N/A |      0MiB /  4042MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

It is already a big step forward, but it seems I still cannot use the GPU for processing... (things like glmark2 still run on the Intel GPU)...

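Answer 1

In the end, Bumblebee did the trick.

But getting it to work took a lot of trial and error. Here is what I did:

1. Purge everything NVidia-related

That is: sudo apt purge *nvidia*

Note that I did not use nvidia*, because that leaves behind things like libnvidia-whatever.so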
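One caveat with the purge command: an unquoted *nvidia* is first expanded by the shell against file names in the current directory, so quoting the pattern is the safer way to hand it to apt unchanged. The sketch below previews which installed packages such a pattern is meant to catch; `match_nvidia_packages` is a hypothetical helper name, not an apt feature.

```shell
# Hypothetical helper: reads package names (one per line) on stdin and prints
# those containing "nvidia" anywhere in the name. This is what *nvidia* is
# meant to select; nvidia* alone would miss names such as libnvidia-gl-415.
match_nvidia_packages() {
    grep -- 'nvidia' || true   # "|| true": finding nothing is not an error here
}

# Usage: preview the affected packages, then purge with the pattern quoted
# so the shell cannot expand it:
#   dpkg-query -W -f='${Package}\n' | match_nvidia_packages
#   sudo apt purge '*nvidia*'
```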

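2. Reboot

After this, the laptop would not boot properly: a black screen for a few seconds, then a sudden shutdown. To get past that, I had to edit the GRUB entry before selecting Ubuntu and set nouveau.modeset to 0.

How to do it:

  • On the GRUB screen, move the selection to Ubuntu and press e to open the text editor
  • Append the following to the line starting with linux (separated by a space): nouveau.modeset=0
  • Press Ctrl-x to continue booting

This was enough to let the laptop boot properly and carry on with the NVidia driver work.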
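An edit made at the GRUB menu applies to that one boot only. To confirm that the parameter actually reached the kernel, the running command line can be inspected. This is a small sketch; `has_kernel_param` is a hypothetical helper name, and the optional second argument exists only so the check can be pointed at a file other than /proc/cmdline.

```shell
# Hypothetical helper: succeeds if the given parameter appears as a whole,
# space-separated word on the kernel command line.
# $1 = parameter to look for, $2 = file to read (defaults to /proc/cmdline).
has_kernel_param() {
    param=$1
    file=${2:-/proc/cmdline}
    # One parameter per line, then require an exact fixed-string line match.
    tr ' ' '\n' < "$file" | grep -qxF -- "$param"
}

# Usage, after booting through the edited GRUB entry:
#   has_kernel_param nouveau.modeset=0 && echo "nouveau modesetting is disabled"
```

If the parameter should survive reboots, the usual Ubuntu route is to add it to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub and run sudo update-grub; the answer itself only describes the one-off edit.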

3. Install the NVidia driver

According to ubuntu-drivers devices (the command hangs for a moment before showing its output):

$ ubuntu-drivers devices
== /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 ==
modalias : pci:v000010DEd00001C8Csv00001043sd000018FEbc03sc02i00
vendor   : NVIDIA Corporation
model    : GP107M [GeForce GTX 1050 Ti Mobile]
driver   : nvidia-driver-440 - third-party free recommended
driver   : nvidia-driver-415 - third-party free
driver   : nvidia-driver-430 - third-party free
driver   : nvidia-driver-435 - distro non-free
driver   : nvidia-driver-390 - third-party free
driver   : nvidia-driver-410 - third-party free
driver   : xserver-xorg-video-nouveau - distro free builtin

== /sys/devices/pci0000:00/0000:00:14.3 ==
modalias : pci:v00008086d0000A370sv00008086sd00000034bc02sc80i00
vendor   : Intel Corporation
manual_install: True
driver   : backport-iwlwifi-dkms - distro free

nvidia-driver-440 is the recommended one, so I installed it: sudo apt install nvidia-driver-440
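Picking the recommended entry out of that listing can also be scripted. The following is a sketch that assumes the output format shown above, where each candidate is a line of the form "driver : package - ... recommended"; `recommended_driver` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `ubuntu-drivers devices` output on stdin and
# prints the package name from the first "driver" line tagged "recommended".
recommended_driver() {
    awk '$1 == "driver" && /recommended/ { print $3; exit }'
}

# Usage:
#   ubuntu-drivers devices | recommended_driver
#   sudo apt install "$(ubuntu-drivers devices | recommended_driver)"
```

On the listing above this selects nvidia-driver-440, the same package the answer installs by hand.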

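4. Install Bumblebee

I followed the Bumblebee setup instructions for 14.04 and newer.

After rebooting, nvidia-smi produced output (as described in the update to my question).

I then ran some tensorflow-gpu tests, and they ran on the GPU (as nvidia-smi showed)!

Thanks, everyone, for the help :)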
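For OpenGL programs such as the glmark2 mentioned in the question, Bumblebee normally uses the NVIDIA GPU only when the program is launched through its optirun wrapper. One way to see which GPU is rendering is to compare the renderer string that glxinfo (from the mesa-utils package) reports with and without optirun. This is a sketch; `opengl_renderer` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `glxinfo` output on stdin and prints only the
# value of the "OpenGL renderer string:" line.
opengl_renderer() {
    sed -n 's/^OpenGL renderer string: //p'
}

# Usage:
#   glxinfo | opengl_renderer            # renderer used by default
#   optirun glxinfo | opengl_renderer    # renderer when run through Bumblebee
#   optirun glmark2                      # run the benchmark on the NVIDIA GPU
```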

Answer 2

Have you tried going into the Software & Updates menu?

I installed my drivers during the Ubuntu installation by allowing third-party drivers, but I still had to go there and select the driver.

Hope this helps.
