Ubuntu GPU missing kernel driver

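I'm running multiple GPUs (6x GTX 1060), but two of them are not responding. I queried the drivers with lspci -nnk (output below): four GPUs show "Kernel driver in use: nvidia", while the other two have no "Kernel driver in use" line at all. I'm running Ubuntu 16.04 LTS on kernel 4.4.0-104-generic, with CUDA 8.0 and the nvidia-387 drivers (open source) installed for TensorFlow. Any ideas why the kernel driver isn't showing up?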

00:00.0 Host bridge [0600]: Intel Corporation Sky Lake Host Bridge/DRAM Registers [8086:190f] (rev 07)
    Subsystem: ASUSTeK Computer Inc. Skylake Host Bridge/DRAM Registers [1043:8694]
00:01.0 PCI bridge [0604]: Intel Corporation Sky Lake PCIe Controller (x16) [8086:1901] (rev 07)
    Kernel driver in use: pcieport
    Kernel modules: shpchp
00:14.0 USB controller [0c03]: Intel Corporation Device [8086:a2af]
    Subsystem: ASUSTeK Computer Inc. Device [1043:8694]
    Kernel driver in use: xhci_hcd
00:16.0 Communication controller [0780]: Intel Corporation Device [8086:a2ba]
    Subsystem: ASUSTeK Computer Inc. Device [1043:8694]
    Kernel driver in use: mei_me
    Kernel modules: mei_me
00:17.0 SATA controller [0106]: Intel Corporation Device [8086:a282]
    Subsystem: ASUSTeK Computer Inc. Device [1043:8694]
    Kernel driver in use: ahci
    Kernel modules: ahci
00:1c.0 PCI bridge [0604]: Intel Corporation Device [8086:a294] (rev f0)
    Kernel driver in use: pcieport
    Kernel modules: shpchp
00:1c.6 PCI bridge [0604]: Intel Corporation Device [8086:a296] (rev f0)
    Kernel driver in use: pcieport
    Kernel modules: shpchp
00:1c.7 PCI bridge [0604]: Intel Corporation Device [8086:a297] (rev f0)
    Kernel driver in use: pcieport
    Kernel modules: shpchp
00:1d.0 PCI bridge [0604]: Intel Corporation Device [8086:a298] (rev f0)
    Kernel driver in use: pcieport
    Kernel modules: shpchp
00:1d.2 PCI bridge [0604]: Intel Corporation Device [8086:a29a] (rev f0)
    Kernel driver in use: pcieport
    Kernel modules: shpchp
00:1d.3 PCI bridge [0604]: Intel Corporation Device [8086:a29b] (rev f0)
    Kernel driver in use: pcieport
    Kernel modules: shpchp
00:1f.0 ISA bridge [0601]: Intel Corporation Device [8086:a2c8]
    Subsystem: ASUSTeK Computer Inc. Device [1043:8694]
00:1f.2 Memory controller [0580]: Intel Corporation Device [8086:a2a1]
    Subsystem: ASUSTeK Computer Inc. Device [1043:8694]
00:1f.3 Audio device [0403]: Intel Corporation Device [8086:a2f0]
    Subsystem: ASUSTeK Computer Inc. Device [1043:8723]
    Kernel driver in use: snd_hda_intel
    Kernel modules: snd_hda_intel
00:1f.4 SMBus [0c05]: Intel Corporation Device [8086:a2a3]
    Subsystem: ASUSTeK Computer Inc. Device [1043:8694]
    Kernel modules: i2c_i801
00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (2) I219-V [8086:15b8]
    Subsystem: ASUSTeK Computer Inc. Ethernet Connection (2) I219-V [1043:8672]
    Kernel driver in use: e1000e
    Kernel modules: e1000e
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1c03] (rev a1)
    Subsystem: eVga.com. Corp. Device [3842:6163]
    Kernel driver in use: nvidia
    Kernel modules: nvidiafb, nouveau, nvidia_387_drm, nvidia_387
01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10f1] (rev a1)
    Subsystem: eVga.com. Corp. Device [3842:6163]
    Kernel driver in use: snd_hda_intel
    Kernel modules: snd_hda_intel
02:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1c03] (rev a1)
    Subsystem: eVga.com. Corp. Device [3842:6161]
    Kernel driver in use: nvidia
    Kernel modules: nvidiafb, nouveau, nvidia_387_drm, nvidia_387
02:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10f1] (rev a1)
    Subsystem: eVga.com. Corp. Device [3842:6161]
    Kernel driver in use: snd_hda_intel
    Kernel modules: snd_hda_intel
03:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1c03] (rev a1)
    Subsystem: eVga.com. Corp. Device [3842:6163]
    Kernel driver in use: nvidia
    Kernel modules: nvidiafb, nouveau, nvidia_387_drm, nvidia_387
03:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10f1] (rev a1)
    Subsystem: eVga.com. Corp. Device [3842:6163]
    Kernel driver in use: snd_hda_intel
    Kernel modules: snd_hda_intel
04:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1c03] (rev a1)
    Subsystem: eVga.com. Corp. Device [3842:6161]
    Kernel driver in use: nvidia
    Kernel modules: nvidiafb, nouveau, nvidia_387_drm, nvidia_387
04:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10f1] (rev a1)
    Subsystem: eVga.com. Corp. Device [3842:6161]
    Kernel driver in use: snd_hda_intel
    Kernel modules: snd_hda_intel
06:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1c03] (rev a1)
    Subsystem: eVga.com. Corp. Device [3842:6163]
    Kernel modules: nvidiafb, nouveau, nvidia_387_drm, nvidia_387
06:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10f1] (rev a1)
    Subsystem: eVga.com. Corp. Device [3842:6163]
    Kernel driver in use: snd_hda_intel
    Kernel modules: snd_hda_intel
07:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1c03] (rev a1)
    Subsystem: eVga.com. Corp. Device [3842:6163]
    Kernel modules: nvidiafb, nouveau, nvidia_387_drm, nvidia_387
07:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10f1] (rev a1)
    Subsystem: eVga.com. Corp. Device [3842:6163]
    Kernel driver in use: snd_hda_intel
    Kernel modules: snd_hda_intel

Answer 1

Update: the solution was to go into the BIOS, enable all PCIe lanes for use, and set "Above 4G Decoding" to Enabled.
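After changing the BIOS settings, it may help to re-run lspci -nnk and confirm that every GPU now has a driver bound. The following is a minimal Python sketch of that check, not a definitive tool: the function name `unbound_gpus` is hypothetical, the sample text is an excerpt of the output posted in the question, and the parser assumes the default lspci layout shown there (device headers at column 0, detail lines indented, NVIDIA vendor ID 10de).

```python
import re

# A device header starts at column 0 with a PCI address such as "06:00.0".
HEADER = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]) (.*)$")


def unbound_gpus(lspci_text):
    """Return PCI addresses of NVIDIA VGA devices with no bound kernel driver."""
    missing = []
    current = None       # address of the NVIDIA VGA device being scanned, if any
    has_driver = False
    for line in lspci_text.splitlines():
        match = HEADER.match(line)
        if match:
            # A new device begins: record the previous GPU if it had no driver.
            if current is not None and not has_driver:
                missing.append(current)
            address, description = match.groups()
            is_gpu = ("VGA compatible controller" in description
                      and "[10de:" in description)
            current = address if is_gpu else None
            has_driver = False
        elif "Kernel driver in use:" in line:
            has_driver = True
    # Handle the last device in the listing.
    if current is not None and not has_driver:
        missing.append(current)
    return missing


# Excerpt of the lspci -nnk output from the question.
SAMPLE = """\
04:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1c03] (rev a1)
    Subsystem: eVga.com. Corp. Device [3842:6161]
    Kernel driver in use: nvidia
    Kernel modules: nvidiafb, nouveau, nvidia_387_drm, nvidia_387
06:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1c03] (rev a1)
    Subsystem: eVga.com. Corp. Device [3842:6163]
    Kernel modules: nvidiafb, nouveau, nvidia_387_drm, nvidia_387
06:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10f1] (rev a1)
    Subsystem: eVga.com. Corp. Device [3842:6163]
    Kernel driver in use: snd_hda_intel
"""

print(unbound_gpus(SAMPLE))
```

To run it against a live system, pass the real output of lspci -nnk instead of `SAMPLE`. An empty list would suggest that all GPUs are now bound to a driver.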

Answer 2

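You need to get rid of the "nouveau" module/driver.

Assuming you are using GNOME, open "Software & Updates" -> "Additional Drivers" and switch to one of the "NVIDIA binary driver" options.

If that doesn't work for you, the only remaining option is the blacklist method documented here: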

http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
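Note that the "Kernel modules:" line in lspci only lists modules that could drive the device; it does not show whether nouveau is actually loaded. A rough Python sketch for checking that is shown below. The helper name `loaded_modules` is hypothetical, and the sample lines (including the sizes) are made-up illustrations of the /proc/modules format rather than output from the asker's machine.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])  # the first column is the module name
    return names


# Illustrative sample only; the real data lives in /proc/modules.
SAMPLE = """\
nvidia_uvm 671744 0 - Live 0x0000000000000000
nouveau 1543168 0 - Live 0x0000000000000000
"""

if "nouveau" in loaded_modules(SAMPLE):
    print("nouveau is loaded; blacklisting it may be worth trying")
```

On a real system, pass the contents of /proc/modules (for example `open("/proc/modules").read()`) instead of `SAMPLE`. If nouveau does not appear there, blacklisting it is unlikely to change anything.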
