Unable to start the X server on Ubuntu 20.04 with a GTX 1080 Ti and a K80

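I am currently running a dual-GPU setup with a GTX 1080 Ti and a Tesla K80, but the X server will not start. My plan is to use the 1080 Ti as the display GPU and the Tesla K80 for CUDA compute. Here is the output of lspci: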

00:00.0 Host bridge: Intel Corporation Device 9b33 (rev 05)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) (rev 05)
00:14.0 USB controller: Intel Corporation Device 43ed (rev 11)
00:14.2 RAM memory: Intel Corporation Device 43ef (rev 11)
00:14.3 Network controller: Intel Corporation Device 43f0 (rev 11)
00:15.0 Serial bus controller [0c80]: Intel Corporation Device 43e8 (rev 11)
00:15.1 Serial bus controller [0c80]: Intel Corporation Device 43e9 (rev 11)
00:16.0 Communication controller: Intel Corporation Device 43e0 (rev 11)
00:17.0 SATA controller: Intel Corporation Device 43d2 (rev 11)
00:1b.0 PCI bridge: Intel Corporation Device 43c0 (rev 11)
00:1b.4 PCI bridge: Intel Corporation Device 43c4 (rev 11)
00:1c.0 PCI bridge: Intel Corporation Device 43b8 (rev 11)
00:1c.7 PCI bridge: Intel Corporation Device 43bf (rev 11)
00:1d.0 PCI bridge: Intel Corporation Device 43b0 (rev 11)
00:1f.0 ISA bridge: Intel Corporation Device 4385 (rev 11)
00:1f.3 Audio device: Intel Corporation Device f0c8 (rev 11)
00:1f.4 SMBus: Intel Corporation Device 43a3 (rev 11)
00:1f.5 Serial bus controller [0c80]: Intel Corporation Device 43a4 (rev 11)
01:00.0 PCI bridge: PLX Technology, Inc. PEX 8747 48-Lane, 5-Port PCI Express Gen 3 (8.0 GT/s) Switch (rev ff)
02:08.0 PCI bridge: PLX Technology, Inc. PEX 8747 48-Lane, 5-Port PCI Express Gen 3 (8.0 GT/s) Switch (rev ff)
02:10.0 PCI bridge: PLX Technology, Inc. PEX 8747 48-Lane, 5-Port PCI Express Gen 3 (8.0 GT/s) Switch (rev ff)
03:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev ff)
04:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev ff)
05:00.0 Non-Volatile memory controller: Sandisk Corp Device 5006
06:00.0 VGA compatible controller: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] (rev a1)
06:00.1 Audio device: NVIDIA Corporation GP102 HDMI Audio Controller (rev a1)
08:00.0 Ethernet controller: Intel Corporation Device 15f3 (rev 03)
09:00.0 Non-Volatile memory controller: Sandisk Corp Device 5011 (rev 01)

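My current concern is that my NVIDIA driver only supports GTX-series devices. Is it possible to use the display driver for the GTX card alone, so that the X server can start?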

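The NVIDIA driver itself seems to be working, because nvidia-smi detects both cards. I also noticed in the lspci output that the K80 is identified as a "3D controller". Could that be the problem?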

Answer 1

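OK, I found a solution to my problem. In short, the X server failed to start because its automatic configuration could not find the correct display device, i.e. the GPU. This can be fixed by setting the device manually in the configuration file. Specifically: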

  1. Run sudo nvidia-xconfig. This generates the configuration file /etc/X11/xorg.conf.
  2. Next, run lspci and find the BusID of the GPU that the monitor is connected to. In my case it is this line:
06:00.0 VGA compatible controller: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] (rev a1)

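The BusID is 6:0:0. Note that lspci prints these numbers in hexadecimal, while xorg.conf expects decimal, so convert them if any of the values contains a letter or is above 9.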

  3. Next, open xorg.conf and find the following sections:
Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BusID          "PCI:6:0:0" # make sure the ID matches the VGA device ID
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

Make sure the BusID in the Device section matches the BusID of your GPU found in step 2.

  4. Reboot.

Now you are good to go ;).
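
If you would rather not work out the BusID by hand, the lookup in step 2 can be sketched in a few lines of Python. This is only a rough sketch, not part of the original fix: the function names lspci_to_xorg_busid and find_display_busids are made up for illustration, and it assumes lspci prints addresses in its default hexadecimal bus:device.function form and that the display card shows up as an NVIDIA "VGA compatible controller", as it does in the listing from the question.

```python
from typing import List


def lspci_to_xorg_busid(address: str) -> str:
    """Convert an lspci address such as '06:00.0' into an xorg.conf BusID.

    lspci prints bus, device and function in hexadecimal, while the
    BusID option in xorg.conf is written in decimal.
    """
    parts = address.split(":")
    # Ignore an optional leading PCI domain (e.g. '0000:06:00.0').
    bus = parts[-2]
    device, function = parts[-1].split(".")
    return "PCI:{}:{}:{}".format(int(bus, 16), int(device, 16), int(function, 16))


def find_display_busids(lspci_output: str) -> List[str]:
    """Return the BusID of every NVIDIA 'VGA compatible controller' line."""
    busids = []
    for line in lspci_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # The address is the first whitespace-separated field of each line.
        address, _, description = line.partition(" ")
        if description.startswith("VGA compatible controller") and "NVIDIA" in description:
            busids.append(lspci_to_xorg_busid(address))
    return busids


if __name__ == "__main__":
    # Two lines copied from the lspci output in the question.
    sample = (
        "03:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev ff)\n"
        "06:00.0 VGA compatible controller: NVIDIA Corporation GP102 "
        "[GeForce GTX 1080 Ti] (rev a1)\n"
    )
    print(find_display_busids(sample))  # prints ['PCI:6:0:0']
```

The K80 lines are skipped because they are reported as "3D controller" rather than "VGA compatible controller", which matches the choice of the 1080 Ti as the display device above. The resulting string is what goes into the BusID line of the Device section.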
