I built a multi-GPU PC running Ubuntu for personal deep learning research. Its specs:
- Intel i9-9900K CPU (LGA1151)
- Four Corsair Vengeance RGB Pro 16 GB RAM modules
- ASUS WS Z390 PRO motherboard
- Two ASUS GeForce RTX 2080 Ti Turbo GPUs
- SSD 1: Samsung 970 EVO M.2 1 TB
- SSD 2: HP EX950 M.2 1 TB (Windows 10 is installed here)
- Deepcool RF 120M 120mm fans (0.23 A) x5
- Lian Li 120mm fans x2
- Noctua NF-R8 redux-1800 PWM 80mm fan
- Lian Li PC-O11 Air case
- EVGA SuperNOVA T2 1600 power supply
- MasterLiquid ML360R RGB CPU cooler
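The second GPU arrived late, so I installed Ubuntu 18.04 with only one GPU in place. I set up the NVIDIA driver, CUDA, and TensorFlow (with GPU support), and everything worked fine.
After installing the second GPU, I tried booting Ubuntu. I get the desktop login prompt and enter my password, but the system then just hangs on a purple background with a cursor that can no longer be moved.
I can boot Ubuntu in safe (recovery) mode without a GUI and run commands from there.
nvidia-smi shows both GPUs. Everything works in Windows 10, and I can see the second GPU there without any problem.
Does anyone know what the problem might be? By the way, I'm new to Ubuntu.
How can I troubleshoot this so that Ubuntu boots with both GPUs installed?
Contents of /var/log/Xorg.0.log: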
[ 23.741] (--) Log file renamed from "/var/log/Xorg.pid-1764.log" to "/var/log/Xorg.0.log"
[ 23.741]
X.Org X Server 1.20.4
X Protocol Version 11, Revision 0
[ 23.741] Build Operating System: Linux 4.4.0-148-generic x86_64 Ubuntu
[ 23.741] Current Operating System: Linux anduril 5.0.0-29-generic #31~18.04.1-Ubuntu SMP Thu Sep 12 18:29:21 UTC 2019 x86_64
[ 23.741] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.0.0-29-generic root=UUID=cd06c7e7-bb97-489c-bb2d-eedf4a02076b ro quiet splash resume=UUID=f2d52073-433f-4803-bcd7-4e8e85b3a6d3 vt.handoff=1
[ 23.741] Build Date: 02 May 2019 08:06:54AM
[ 23.741] xorg-server-hwe-18.04 2:1.20.4-1ubuntu3~18.04.1 (For technical support please see http://www.ubuntu.com/support)
[ 23.741] Current version of pixman: 0.34.0
[ 23.741] Before reporting problems, check http://wiki.x.org
to make sure that you have the latest version.
[ 23.741] Markers: (--) probed, (**) from config file, (==) default setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
[ 23.741] (==) Log file: "/var/log/Xorg.0.log", Time: Mon Sep 30 01:22:05 2019
[ 23.741] (==) Using config file: "/etc/X11/xorg.conf"
[ 23.741] (==) Using system config directory "/usr/share/X11/xorg.conf.d"
[ 23.742] (==) ServerLayout "Layout0"
[ 23.742] (**) |-->Screen "Screen0" (0)
[ 23.742] (**) | |-->Monitor "Monitor0"
[ 23.743] (**) | |-->Device "Device0"
[ 23.743] (**) |-->Input Device "Keyboard0"
[ 23.743] (**) |-->Input Device "Mouse0"
[ 23.743] (==) Automatically adding devices
[ 23.743] (==) Automatically enabling devices
[ 23.743] (==) Automatically adding GPU devices
[ 23.743] (==) Automatically binding GPU devices
[ 23.743] (==) Max clients allowed: 256, resource mask: 0x1fffff
[ 23.743] (WW) The directory "/usr/share/fonts/X11/cyrillic" does not exist.
[ 23.743] Entry deleted from font path.
[ 23.743] (WW) The directory "/usr/share/fonts/X11/100dpi/" does not exist.
[ 23.743] Entry deleted from font path.
[ 23.743] (WW) The directory "/usr/share/fonts/X11/75dpi/" does not exist.
[ 23.743] Entry deleted from font path.
[ 23.743] (WW) The directory "/usr/share/fonts/X11/100dpi" does not exist.
[ 23.743] Entry deleted from font path.
[ 23.743] (WW) The directory "/usr/share/fonts/X11/75dpi" does not exist.
[ 23.743] Entry deleted from font path.
[ 23.743] (==) FontPath set to:
/usr/share/fonts/X11/misc,
/usr/share/fonts/X11/Type1,
built-ins
[ 23.743] (==) ModulePath set to "/usr/lib/xorg/modules"
[ 23.743] (WW) Hotplugging is on, devices using drivers 'kbd', 'mouse' or 'vmmouse' will be disabled.
[ 23.743] (WW) Disabling Keyboard0
[ 23.743] (WW) Disabling Mouse0
[ 23.743] (II) Loader magic: 0x56118df71020
[ 23.743] (II) Module ABI versions:
[ 23.743] X.Org ANSI C Emulation: 0.4
[ 23.743] X.Org Video Driver: 24.0
[ 23.743] X.Org XInput driver : 24.1
[ 23.743] X.Org Server Extension : 10.0
[ 23.743] (++) using VT number 2
[ 23.744] (II) systemd-logind: took control of session /org/freedesktop/login1/session/_32
[ 23.744] (II) xfree86: Adding drm device (/dev/dri/card1)
[ 23.745] (II) systemd-logind: got fd for /dev/dri/card1 226:1 fd 12 paused 0
[ 23.745] (II) xfree86: Adding drm device (/dev/dri/card2)
[ 23.745] (II) systemd-logind: got fd for /dev/dri/card2 226:2 fd 13 paused 0
[ 23.745] (II) xfree86: Adding drm device (/dev/dri/card0)
[ 23.745] (II) systemd-logind: got fd for /dev/dri/card0 226:0 fd 14 paused 0
[ 23.746] (**) OutputClass "nvidia" ModulePath extended to "/usr/lib/x86_64-linux-gnu/nvidia-430/xorg,/usr/lib/xorg/modules"
[ 23.746] (**) OutputClass "Nvidia Prime" ModulePath extended to "/x86_64-linux-gnu/nvidia/xorg,/usr/lib/x86_64-linux-gnu/nvidia-430/xorg,/usr/lib/xorg/modules"
[ 23.746] (**) OutputClass "nvidia" ModulePath extended to "/usr/lib/x86_64-linux-gnu/nvidia-430/xorg,/x86_64-linux-gnu/nvidia/xorg,/usr/lib/x86_64-linux-gnu/nvidia-430/xorg,/usr/lib/xorg/modules"
[ 23.746] (**) OutputClass "Nvidia Prime" ModulePath extended to "/x86_64-linux-gnu/nvidia/xorg,/usr/lib/x86_64-linux-gnu/nvidia-430/xorg,/x86_64-linux-gnu/nvidia/xorg,/usr/lib/x86_64-linux-gnu/nvidia-430/xorg,/usr/lib/xorg/modules"
[ 23.746] (**) OutputClass "Nvidia Prime" setting /dev/dri/card1 as PrimaryGPU
[ 23.747] (--) PCI: (0@0:2:0) 8086:3e98:1043:8694 rev 2, Mem @ 0xb3000000/16777216, 0x70000000/268435456, I/O @ 0x00006000/64, BIOS @ 0x????????/131072
[ 23.747] (--) PCI: (3@0:0:0) 10de:1e04:1043:8675 rev 161, Mem @ 0xb6000000/16777216, 0xa0000000/268435456, 0xb0000000/33554432, I/O @ 0x00004000/128, BIOS @ 0x????????/524288
[ 23.747] (--) PCI:*(4@0:0:0) 10de:1e04:1043:8675 rev 161, Mem @ 0xb4000000/16777216, 0x80000000/268435456, 0x90000000/33554432, I/O @ 0x00003000/128, BIOS @ 0x????????/524288
[ 23.747] (II) LoadModule: "glx"
[ 23.748] (II) Loading /usr/lib/xorg/modules/extensions/libglx.so
[ 23.749] (II) Module glx: vendor="X.Org Foundation"
[ 23.749] compiled for 1.20.4, module version = 1.0.0
[ 23.749] ABI class: X.Org Server Extension, version 10.0
[ 23.749] (II) LoadModule: "nvidia"
[ 23.749] (II) Loading /usr/lib/x86_64-linux-gnu/nvidia-430/xorg/nvidia_drv.so
[ 23.752] (II) Module nvidia: vendor="NVIDIA Corporation"
[ 23.752] compiled for 1.6.99.901, module version = 1.0.0
[ 23.752] Module class: X.Org Video Driver
[ 23.752] (II) NVIDIA dlloader X Driver 430.26 Tue Jun 4 17:52:10 CDT 2019
[ 23.752] (II) NVIDIA Unified Driver for all Supported NVIDIA GPUs
[ 23.752] (II) systemd-logind: releasing fd for 226:1
[ 23.753] (II) Loading sub module "fb"
[ 23.753] (II) LoadModule: "fb"
[ 23.753] (II) Loading /usr/lib/xorg/modules/libfb.so
[ 23.753] (II) Module fb: vendor="X.Org Foundation"
[ 23.753] compiled for 1.20.4, module version = 1.0.0
[ 23.753] ABI class: X.Org ANSI C Emulation, version 0.4
[ 23.753] (II) Loading sub module "wfb"
[ 23.753] (II) LoadModule: "wfb"
[ 23.753] (II) Loading /usr/lib/xorg/modules/libwfb.so
[ 23.754] (II) Module wfb: vendor="X.Org Foundation"
[ 23.754] compiled for 1.20.4, module version = 1.0.0
[ 23.754] ABI class: X.Org ANSI C Emulation, version 0.4
[ 23.754] (II) Loading sub module "ramdac"
[ 23.754] (II) LoadModule: "ramdac"
[ 23.754] (II) Module "ramdac" already built-in
[ 23.755] (II) systemd-logind: releasing fd for 226:2
[ 23.755] (**) NVIDIA(0): Depth 24, (--) framebuffer bpp 32
[ 23.755] (==) NVIDIA(0): RGB weight 888
[ 23.755] (==) NVIDIA(0): Default visual is TrueColor
[ 23.755] (==) NVIDIA(0): Using gamma correction (1.0, 1.0, 1.0)
[ 23.755] (II) Applying OutputClass "nvidia" options to /dev/dri/card1
[ 23.755] (II) Applying OutputClass "Nvidia Prime" options to /dev/dri/card1
[ 23.755] (**) NVIDIA(0): Option "AllowEmptyInitialConfiguration"
[ 23.755] (**) NVIDIA(0): Option "IgnoreDisplayDevices" "CRT"
[ 23.755] (**) NVIDIA(0): Enabling 2D acceleration
[ 23.755] (II) Loading sub module "glxserver_nvidia"
[ 23.755] (II) LoadModule: "glxserver_nvidia"
[ 23.755] (II) Loading /usr/lib/x86_64-linux-gnu/nvidia-430/xorg/libglxserver_nvidia.so
[ 23.771] (II) Module glxserver_nvidia: vendor="NVIDIA Corporation"
[ 23.771] compiled for 1.6.99.901, module version = 1.0.0
[ 23.771] Module class: X.Org Server Extension
[ 23.771] (II) NVIDIA GLX Module 430.26 Tue Jun 4 17:50:01 CDT 2019
[ 23.804] (--) NVIDIA(0): Valid display device(s) on GPU-0 at PCI:4:0:0
[ 23.804] (--) NVIDIA(0): DFP-0
[ 23.804] (--) NVIDIA(0): DFP-1
[ 23.804] (--) NVIDIA(0): DFP-2
[ 23.804] (--) NVIDIA(0): DFP-3
[ 23.804] (--) NVIDIA(0): DFP-4
[ 23.804] (--) NVIDIA(0): DFP-5
[ 23.806] (II) NVIDIA(0): NVIDIA GPU GeForce RTX 2080 Ti (TU102-A) at PCI:4:0:0 (GPU-0)
[ 23.806] (--) NVIDIA(0): Memory: 11534336 kBytes
[ 23.806] (--) NVIDIA(0): VideoBIOS: 90.02.17.00.b2
[ 23.806] (II) NVIDIA(0): Detected PCI Express Link width: 16X
[ 23.806] (--) NVIDIA(GPU-0): DFP-0: disconnected
[ 23.806] (--) NVIDIA(GPU-0): DFP-0: Internal TMDS
[ 23.806] (--) NVIDIA(GPU-0): DFP-0: 165.0 MHz maximum pixel clock
[ 23.806] (--) NVIDIA(GPU-0):
[ 23.806] (--) NVIDIA(GPU-0): DFP-1: disconnected
[ 23.806] (--) NVIDIA(GPU-0): DFP-1: Internal DisplayPort
[ 23.806] (--) NVIDIA(GPU-0): DFP-1: 2660.0 MHz maximum pixel clock
[ 23.806] (--) NVIDIA(GPU-0):
[ 23.806] (--) NVIDIA(GPU-0): DFP-2: disconnected
[ 23.806] (--) NVIDIA(GPU-0): DFP-2: Internal TMDS
[ 23.806] (--) NVIDIA(GPU-0): DFP-2: 165.0 MHz maximum pixel clock
[ 23.806] (--) NVIDIA(GPU-0):
[ 23.806] (--) NVIDIA(GPU-0): DFP-3: disconnected
[ 23.806] (--) NVIDIA(GPU-0): DFP-3: Internal DisplayPort
[ 23.806] (--) NVIDIA(GPU-0): DFP-3: 2660.0 MHz maximum pixel clock
[ 23.806] (--) NVIDIA(GPU-0):
[ 23.806] (--) NVIDIA(GPU-0): DFP-4: disconnected
[ 23.806] (--) NVIDIA(GPU-0): DFP-4: Internal TMDS
[ 23.806] (--) NVIDIA(GPU-0): DFP-4: 165.0 MHz maximum pixel clock
[ 23.806] (--) NVIDIA(GPU-0):
[ 23.806] (--) NVIDIA(GPU-0): DFP-5: disconnected
[ 23.806] (--) NVIDIA(GPU-0): DFP-5: Internal DisplayPort
[ 23.806] (--) NVIDIA(GPU-0): DFP-5: 2660.0 MHz maximum pixel clock
[ 23.806] (--) NVIDIA(GPU-0):
[ 23.806] (==) NVIDIA(0):
[ 23.806] (==) NVIDIA(0): No modes were requested; the default mode "nvidia-auto-select"
[ 23.806] (==) NVIDIA(0): will be used as the requested mode.
[ 23.806] (==) NVIDIA(0):
[ 23.806] (--) NVIDIA(0): No enabled display devices found; starting anyway because
[ 23.806] (--) NVIDIA(0): AllowEmptyInitialConfiguration is enabled
[ 23.807] (II) NVIDIA(0): Validated MetaModes:
[ 23.807] (II) NVIDIA(0): "NULL"
[ 23.807] (II) NVIDIA(0): Virtual screen size determined to be 640 x 480
[ 23.807] (WW) NVIDIA(0): Unable to get display device for DPI computation.
[ 23.807] (==) NVIDIA(0): DPI set to (75, 75); computed from built-in default
[ 23.834] (--) NVIDIA(0): Valid display device(s) on GPU-1 at PCI:3:0:0
[ 23.834] (--) NVIDIA(0): DFP-0
[ 23.834] (--) NVIDIA(0): DFP-1
[ 23.834] (--) NVIDIA(0): DFP-2
[ 23.834] (--) NVIDIA(0): DFP-3
[ 23.834] (--) NVIDIA(0): DFP-4
[ 23.834] (--) NVIDIA(0): DFP-5
[ 23.835] (--) NVIDIA(GPU-1): DFP-0: disconnected
[ 23.835] (--) NVIDIA(GPU-1): DFP-0: Internal TMDS
[ 23.835] (--) NVIDIA(GPU-1): DFP-0: 165.0 MHz maximum pixel clock
[ 23.835] (--) NVIDIA(GPU-1):
[ 23.835] (--) NVIDIA(GPU-1): DFP-1: disconnected
[ 23.835] (--) NVIDIA(GPU-1): DFP-1: Internal DisplayPort
[ 23.835] (--) NVIDIA(GPU-1): DFP-1: 2660.0 MHz maximum pixel clock
[ 23.835] (--) NVIDIA(GPU-1):
[ 23.835] (--) NVIDIA(GPU-1): DFP-2: disconnected
[ 23.835] (--) NVIDIA(GPU-1): DFP-2: Internal TMDS
[ 23.835] (--) NVIDIA(GPU-1): DFP-2: 165.0 MHz maximum pixel clock
[ 23.835] (--) NVIDIA(GPU-1):
[ 23.835] (--) NVIDIA(GPU-1): DFP-3: disconnected
[ 23.835] (--) NVIDIA(GPU-1): DFP-3: Internal DisplayPort
[ 23.835] (--) NVIDIA(GPU-1): DFP-3: 2660.0 MHz maximum pixel clock
[ 23.835] (--) NVIDIA(GPU-1):
[ 23.835] (--) NVIDIA(GPU-1): DFP-4: disconnected
[ 23.835] (--) NVIDIA(GPU-1): DFP-4: Internal TMDS
[ 23.835] (--) NVIDIA(GPU-1): DFP-4: 165.0 MHz maximum pixel clock
[ 23.835] (--) NVIDIA(GPU-1):
[ 23.835] (--) NVIDIA(GPU-1): DFP-5: disconnected
[ 23.835] (--) NVIDIA(GPU-1): DFP-5: Internal DisplayPort
[ 23.835] (--) NVIDIA(GPU-1): DFP-5: 2660.0 MHz maximum pixel clock
[ 23.835] (--) NVIDIA(GPU-1):
[ 23.878] (II) NVIDIA(GPU-1): NVIDIA GPU GeForce RTX 2080 Ti (TU102-A) at PCI:3:0:0 (GPU-1)
[ 23.878] (--) NVIDIA(GPU-1): Memory: 11534336 kBytes
[ 23.878] (--) NVIDIA(GPU-1): VideoBIOS: 90.02.17.00.b2
[ 23.878] (II) NVIDIA(GPU-1): Detected PCI Express Link width: 16X
[ 23.879] (II) NVIDIA: Using 24576.00 MB of virtual memory for indirect memory
[ 23.879] (II) NVIDIA: access.
[ 23.897] (II) NVIDIA(0): Setting mode "NULL"
[ 23.912] (==) NVIDIA(0): Disabling shared memory pixmaps
[ 23.912] (==) NVIDIA(0): Backing store enabled
[ 23.912] (==) NVIDIA(0): Silken mouse enabled
[ 23.913] (**) NVIDIA(0): DPMS enabled
[ 23.913] (WW) NVIDIA(0): Option "PrimaryGPU" is not used
[ 23.913] (II) Loading sub module "dri2"
[ 23.913] (II) LoadModule: "dri2"
[ 23.913] (II) Module "dri2" already built-in
[ 23.913] (II) NVIDIA(0): [DRI2] Setup complete
[ 23.913] (II) NVIDIA(0): [DRI2] VDPAU driver: nvidia
[ 23.913] (II) Initializing extension Generic Event Extension
[ 23.913] (II) Initializing extension SHAPE
[ 23.913] (II) Initializing extension MIT-SHM
[ 23.913] (II) Initializing extension XInputExtension
[ 23.913] (II) Initializing extension XTEST
[ 23.913] (II) Initializing extension BIG-REQUESTS
[ 23.913] (II) Initializing extension SYNC
[ 23.913] (II) Initializing extension XKEYBOARD
[ 23.914] (II) Initializing extension XC-MISC
[ 23.914] (II) Initializing extension SECURITY
[ 23.914] (II) Initializing extension XFIXES
[ 23.914] (II) Initializing extension RENDER
[ 23.914] (II) Initializing extension RANDR
[ 23.914] (II) Initializing extension COMPOSITE
[ 23.914] (II) Initializing extension DAMAGE
[ 23.914] (II) Initializing extension MIT-SCREEN-SAVER
[ 23.914] (II) Initializing extension DOUBLE-BUFFER
[ 23.914] (II) Initializing extension RECORD
[ 23.914] (II) Initializing extension DPMS
[ 23.914] (II) Initializing extension Present
[ 23.914] (II) Initializing extension DRI3
[ 23.914] (II) Initializing extension X-Resource
[ 23.914] (II) Initializing extension XVideo
[ 23.914] (II) Initializing extension XVideo-MotionCompensation
[ 23.914] (II) Initializing extension SELinux
[ 23.914] (II) SELinux: Disabled on system
[ 23.914] (II) Initializing extension GLX
[ 23.914] (II) Initializing extension GLX
[ 23.914] (II) Indirect GLX disabled.
[ 23.914] (II) GLX: Another vendor is already registered for screen 0
[ 23.914] (II) Initializing extension XFree86-VidModeExtension
[ 23.914] (II) Initializing extension XFree86-DGA
[ 23.914] (II) Initializing extension XFree86-DRI
[ 23.915] (II) Initializing extension DRI2
[ 23.915] (II) Initializing extension NV-GLX
[ 23.915] (II) Initializing extension NV-CONTROL
[ 23.940] (II) config/udev: Adding input device Power Button (/dev/input/event2)
[ 23.940] (**) Power Button: Applying InputClass "libinput keyboard catchall"
[ 23.940] (II) LoadModule: "libinput"
[ 23.940] (II) Loading /usr/lib/xorg/modules/input/libinput_drv.so
[ 23.941] (II) Module libinput: vendor="X.Org Foundation"
[ 23.941] compiled for 1.20.1, module version = 0.28.1
[ 23.941] Module class: X.Org XInput Driver
[ 23.941] ABI class: X.Org XInput driver, version 24.1
[ 23.941] (II) Using input driver 'libinput' for 'Power Button'
[ 23.941] (II) systemd-logind: got fd for /dev/input/event2 13:66 fd 46 paused 0
[ 23.941] (**) Power Button: always reports core events
[ 23.941] (**) Option "Device" "/dev/input/event2"
[ 23.941] (**) Option "_source" "server/udev"
[ 23.941] (II) event2 - Power Button: is tagged by udev as: Keyboard
[ 23.941] (II) event2 - Power Button: device is a keyboard
[ 23.941] (II) event2 - Power Button: device removed
[ 23.941] (**) Option "config_info" "udev:/sys/devices/LNXSYSTM:00/LNXPWRBN:00/input/input2/event2"
[ 23.941] (II) XINPUT: Adding extended input device "Power Button" (type: KEYBOARD, id 6)
[ 23.941] (**) Option "xkb_model" "pc105"
[ 23.941] (**) Option "xkb_layout" "us"
[ 23.941] (II) event2 - Power Button: is tagged by udev as: Keyboard
[ 23.941] (II) event2 - Power Button: device is a keyboard
[ 23.942] (II) config/udev: Adding input device Video Bus (/dev/input/event8)
[ 23.942] (**) Video Bus: Applying InputClass "libinput keyboard catchall"
[ 23.942] (II) Using input driver 'libinput' for 'Video Bus'
[ 23.942] (II) systemd-logind: got fd for /dev/input/event8 13:72 fd 49 paused 0
[ 23.942] (**) Video Bus: always reports core events
[ 23.942] (**) Option "Device" "/dev/input/event8"
[ 23.942] (**) Option "_source" "server/udev"
[ 23.942] (II) event8 - Video Bus: is tagged by udev as: Keyboard
[ 23.942] (II) event8 - Video Bus: device is a keyboard
[ 23.942] (II) event8 - Video Bus: device removed
[ 23.942] (**) Option "config_info" "udev:/sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input8/event8"
[ 23.942] (II) XINPUT: Adding extended input device "Video Bus" (type: KEYBOARD, id 7)
[ 23.942] (**) Option "xkb_model" "pc105"
[ 23.942] (**) Option "xkb_layout" "us"
[ 23.942] (II) event8 - Video Bus: is tagged by udev as: Keyboard
[ 23.942] (II) event8 - Video Bus: device is a keyboard
[ 23.943] (II) config/udev: Adding input device Power Button (/dev/input/event1)
[ 23.943] (**) Power Button: Applying InputClass "libinput keyboard catchall"
[ 23.943] (II) Using input driver 'libinput' for 'Power Button'
[ 23.943] (II) systemd-logind: got fd for /dev/input/event1 13:65 fd 50 paused 0
[ 23.943] (**) Power Button: always reports core events
[ 23.943] (**) Option "Device" "/dev/input/event1"
[ 23.943] (**) Option "_source" "server/udev"
[ 23.943] (II) event1 - Power Button: is tagged by udev as: Keyboard
[ 23.943] (II) event1 - Power Button: device is a keyboard
[ 23.943] (II) event1 - Power Button: device removed
[ 23.943] (**) Option "config_info" "udev:/sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input1/event1"
[ 23.943] (II) XINPUT: Adding extended input device "Power Button" (type: KEYBOARD, id 8)
[ 23.943] (**) Option "xkb_model" "pc105"
[ 23.943] (**) Option "xkb_layout" "us"
[ 23.943] (II) event1 - Power Button: is tagged by udev as: Keyboard
[ 23.943] (II) event1 - Power Button: device is a keyboard
[ 23.943] (II) config/udev: Adding input device Sleep Button (/dev/input/event0)
[ 23.943] (**) Sleep Button: Applying InputClass "libinput keyboard catchall"
[ 23.943] (II) Using input driver 'libinput' for 'Sleep Button'
[ 23.944] (II) systemd-logind: got fd for /dev/input/event0 13:64 fd 51 paused 0
[ 23.944] (**) Sleep Button: always reports core events
[ 23.944] (**) Option "Device" "/dev/input/event0"
[ 23.944] (**) Option "_source" "server/udev"
[ 23.944] (II) event0 - Sleep Button: is tagged by udev as: Keyboard
[ 23.944] (II) event0 - Sleep Button: device is a keyboard
[ 23.944] (II) event0 - Sleep Button: device removed
[ 23.944] (**) Option "config_info" "udev:/sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0E:00/input/input0/event0"
[ 23.944] (II) XINPUT: Adding extended input device "Sleep Button" (type: KEYBOARD, id 9)
[ 23.944] (**) Option "xkb_model" "pc105"
[ 23.944] (**) Option "xkb_layout" "us"
[ 23.944] (II) event0 - Sleep Button: is tagged by udev as: Keyboard
[ 23.944] (II) event0 - Sleep Button: device is a keyboard
[ 23.944] (II) config/udev: Adding input device HDA NVidia HDMI/DP,pcm=3 (/dev/input/event21)
[ 23.944] (II) No input driver specified, ignoring this device.
[ 23.944] (II) This device may have been added with another device file.
[ 23.944] (II) config/udev: Adding input device HDA NVidia HDMI/DP,pcm=7 (/dev/input/event22)
[ 23.944] (II) No input driver specified, ignoring this device.
[ 23.944] (II) This device may have been added with another device file.
[ 23.945] (II) config/udev: Adding input device HDA NVidia HDMI/DP,pcm=8 (/dev/input/event23)
[ 23.945] (II) No input driver specified, ignoring this device.
[ 23.945] (II) This device may have been added with another device file.
[ 23.945] (II) config/udev: Adding input device HDA NVidia HDMI/DP,pcm=9 (/dev/input/event24)
[ 23.945] (II) No input driver specified, ignoring this device.
[ 23.945] (II) This device may have been added with another device file.
[ 23.945] (II) config/udev: Adding input device HDA NVidia HDMI/DP,pcm=3 (/dev/input/event25)
[ 23.945] (II) No input driver specified, ignoring this device.
[ 23.945] (II) This device may have been added with another device file.
[ 23.945] (II) config/udev: Adding input device HDA NVidia HDMI/DP,pcm=7 (/dev/input/event26)
[ 23.945] (II) No input driver specified, ignoring this device.
[ 23.945] (II) This device may have been added with another device file.
[ 23.945] (II) config/udev: Adding input device HDA NVidia HDMI/DP,pcm=8 (/dev/input/event27)
[ 23.945] (II) No input driver specified, ignoring this device.
[ 23.945] (II) This device may have been added with another device file.
[ 23.945] (II) config/udev: Adding input device HDA NVidia HDMI/DP,pcm=9 (/dev/input/event28)
[ 23.945] (II) No input driver specified, ignoring this device.
[ 23.945] (II) This device may have been added with another device file.
... had to trim to fit under the 30000 char forum post limit
[ 23.954] (**) Option "xkb_model" "pc105"
[ 23.954] (**) Option "xkb_layout" "us"
[ 23.963] (--) NVIDIA(GPU-0): DFP-0: disconnected
[ 23.963] (--) NVIDIA(GPU-0): DFP-0: Internal TMDS
[ 23.963] (--) NVIDIA(GPU-0): DFP-0: 165.0 MHz maximum pixel clock
[ 23.963] (--) NVIDIA(GPU-0):
[ 23.963] (--) NVIDIA(GPU-0): DFP-1: disconnected
[ 23.963] (--) NVIDIA(GPU-0): DFP-1: Internal DisplayPort
[ 23.963] (--) NVIDIA(GPU-0): DFP-1: 2660.0 MHz maximum pixel clock
[ 23.963] (--) NVIDIA(GPU-0):
[ 23.963] (--) NVIDIA(GPU-0): DFP-2: disconnected
[ 23.963] (--) NVIDIA(GPU-0): DFP-2: Internal TMDS
[ 23.963] (--) NVIDIA(GPU-0): DFP-2: 165.0 MHz maximum pixel clock
[ 23.963] (--) NVIDIA(GPU-0):
[ 23.963] (--) NVIDIA(GPU-0): DFP-3: disconnected
[ 23.963] (--) NVIDIA(GPU-0): DFP-3: Internal DisplayPort
[ 23.963] (--) NVIDIA(GPU-0): DFP-3: 2660.0 MHz maximum pixel clock
[ 23.963] (--) NVIDIA(GPU-0):
[ 23.963] (--) NVIDIA(GPU-0): DFP-4: disconnected
[ 23.963] (--) NVIDIA(GPU-0): DFP-4: Internal TMDS
[ 23.963] (--) NVIDIA(GPU-0): DFP-4: 165.0 MHz maximum pixel clock
[ 23.963] (--) NVIDIA(GPU-0):
[ 23.963] (--) NVIDIA(GPU-0): DFP-5: disconnected
[ 23.963] (--) NVIDIA(GPU-0): DFP-5: Internal DisplayPort
[ 23.963] (--) NVIDIA(GPU-0): DFP-5: 2660.0 MHz maximum pixel clock
[ 23.963] (--) NVIDIA(GPU-0):
[ 23.965] (--) NVIDIA(GPU-0): DFP-0: disconnected
[ 23.965] (--) NVIDIA(GPU-0): DFP-0: Internal TMDS
[ 23.966] (--) NVIDIA(GPU-0): DFP-0: 165.0 MHz maximum pixel clock
[ 23.966] (--) NVIDIA(GPU-0):
[ 23.966] (--) NVIDIA(GPU-0): DFP-1: disconnected
[ 23.966] (--) NVIDIA(GPU-0): DFP-1: Internal DisplayPort
[ 23.966] (--) NVIDIA(GPU-0): DFP-1: 2660.0 MHz maximum pixel clock
[ 23.966] (--) NVIDIA(GPU-0):
[ 23.966] (--) NVIDIA(GPU-0): DFP-2: disconnected
[ 23.966] (--) NVIDIA(GPU-0): DFP-2: Internal TMDS
[ 23.966] (--) NVIDIA(GPU-0): DFP-2: 165.0 MHz maximum pixel clock
[ 23.966] (--) NVIDIA(GPU-0):
[ 23.966] (--) NVIDIA(GPU-0): DFP-3: disconnected
[ 23.966] (--) NVIDIA(GPU-0): DFP-3: Internal DisplayPort
[ 23.966] (--) NVIDIA(GPU-0): DFP-3: 2660.0 MHz maximum pixel clock
[ 23.966] (--) NVIDIA(GPU-0):
[ 23.966] (--) NVIDIA(GPU-0): DFP-4: disconnected
[ 23.966] (--) NVIDIA(GPU-0): DFP-4: Internal TMDS
[ 23.966] (--) NVIDIA(GPU-0): DFP-4: 165.0 MHz maximum pixel clock
[ 23.966] (--) NVIDIA(GPU-0):
[ 23.966] (--) NVIDIA(GPU-0): DFP-5: disconnected
[ 23.966] (--) NVIDIA(GPU-0): DFP-5: Internal DisplayPort
[ 23.966] (--) NVIDIA(GPU-0): DFP-5: 2660.0 MHz maximum pixel clock
[ 23.966] (--) NVIDIA(GPU-0):
[ 23.966] (--) NVIDIA(GPU-0): DFP-0: disconnected
[ 23.966] (--) NVIDIA(GPU-0): DFP-0: Internal TMDS
[ 23.966] (--) NVIDIA(GPU-0): DFP-0: 165.0 MHz maximum pixel clock
[ 23.966] (--) NVIDIA(GPU-0):
[ 23.966] (--) NVIDIA(GPU-0): DFP-1: disconnected
[ 23.966] (--) NVIDIA(GPU-0): DFP-1: Internal DisplayPort
[ 23.966] (--) NVIDIA(GPU-0): DFP-1: 2660.0 MHz maximum pixel clock
[ 23.966] (--) NVIDIA(GPU-0):
[ 23.966] (--) NVIDIA(GPU-0): DFP-2: disconnected
[ 23.966] (--) NVIDIA(GPU-0): DFP-2: Internal TMDS
[ 23.966] (--) NVIDIA(GPU-0): DFP-2: 165.0 MHz maximum pixel clock
[ 23.966] (--) NVIDIA(GPU-0):
[ 23.966] (--) NVIDIA(GPU-0): DFP-3: disconnected
[ 23.966] (--) NVIDIA(GPU-0): DFP-3: Internal DisplayPort
[ 23.966] (--) NVIDIA(GPU-0): DFP-3: 2660.0 MHz maximum pixel clock
[ 23.966] (--) NVIDIA(GPU-0):
[ 23.966] (--) NVIDIA(GPU-0): DFP-4: disconnected
[ 23.966] (--) NVIDIA(GPU-0): DFP-4: Internal TMDS
[ 23.966] (--) NVIDIA(GPU-0): DFP-4: 165.0 MHz maximum pixel clock
[ 23.966] (--) NVIDIA(GPU-0):
[ 23.966] (--) NVIDIA(GPU-0): DFP-5: disconnected
[ 23.966] (--) NVIDIA(GPU-0): DFP-5: Internal DisplayPort
[ 23.966] (--) NVIDIA(GPU-0): DFP-5: 2660.0 MHz maximum pixel clock
[ 23.966] (--) NVIDIA(GPU-0):
[ 23.973] (--) NVIDIA(GPU-0): DFP-0: disconnected
[ 23.973] (--) NVIDIA(GPU-0): DFP-0: Internal TMDS
[ 23.973] (--) NVIDIA(GPU-0): DFP-0: 165.0 MHz maximum pixel clock
[ 23.973] (--) NVIDIA(GPU-0):
[ 23.973] (--) NVIDIA(GPU-0): DFP-1: disconnected
[ 23.973] (--) NVIDIA(GPU-0): DFP-1: Internal DisplayPort
[ 23.973] (--) NVIDIA(GPU-0): DFP-1: 2660.0 MHz maximum pixel clock
[ 23.973] (--) NVIDIA(GPU-0):
[ 23.973] (--) NVIDIA(GPU-0): DFP-2: disconnected
[ 23.973] (--) NVIDIA(GPU-0): DFP-2: Internal TMDS
[ 23.973] (--) NVIDIA(GPU-0): DFP-2: 165.0 MHz maximum pixel clock
[ 23.973] (--) NVIDIA(GPU-0):
[ 23.973] (--) NVIDIA(GPU-0): DFP-3: disconnected
[ 23.973] (--) NVIDIA(GPU-0): DFP-3: Internal DisplayPort
[ 23.973] (--) NVIDIA(GPU-0): DFP-3: 2660.0 MHz maximum pixel clock
[ 23.973] (--) NVIDIA(GPU-0):
[ 23.973] (--) NVIDIA(GPU-0): DFP-4: disconnected
[ 23.973] (--) NVIDIA(GPU-0): DFP-4: Internal TMDS
[ 23.973] (--) NVIDIA(GPU-0): DFP-4: 165.0 MHz maximum pixel clock
[ 23.973] (--) NVIDIA(GPU-0):
[ 23.973] (--) NVIDIA(GPU-0): DFP-5: disconnected
[ 23.973] (--) NVIDIA(GPU-0): DFP-5: Internal DisplayPort
[ 23.973] (--) NVIDIA(GPU-0): DFP-5: 2660.0 MHz maximum pixel clock
[ 23.973] (--) NVIDIA(GPU-0):
[ 24.350] (--) NVIDIA(GPU-0): DFP-0: disconnected
[ 24.350] (--) NVIDIA(GPU-0): DFP-0: Internal TMDS
[ 24.350] (--) NVIDIA(GPU-0): DFP-0: 165.0 MHz maximum pixel clock
[ 24.350] (--) NVIDIA(GPU-0):
[ 24.350] (--) NVIDIA(GPU-0): DFP-1: disconnected
[ 24.350] (--) NVIDIA(GPU-0): DFP-1: Internal DisplayPort
[ 24.350] (--) NVIDIA(GPU-0): DFP-1: 2660.0 MHz maximum pixel clock
[ 24.350] (--) NVIDIA(GPU-0):
[ 24.350] (--) NVIDIA(GPU-0): DFP-2: disconnected
[ 24.350] (--) NVIDIA(GPU-0): DFP-2: Internal TMDS
[ 24.350] (--) NVIDIA(GPU-0): DFP-2: 165.0 MHz maximum pixel clock
[ 24.350] (--) NVIDIA(GPU-0):
[ 24.350] (--) NVIDIA(GPU-0): DFP-3: disconnected
[ 24.350] (--) NVIDIA(GPU-0): DFP-3: Internal DisplayPort
[ 24.350] (--) NVIDIA(GPU-0): DFP-3: 2660.0 MHz maximum pixel clock
[ 24.350] (--) NVIDIA(GPU-0):
[ 24.350] (--) NVIDIA(GPU-0): DFP-4: disconnected
[ 24.350] (--) NVIDIA(GPU-0): DFP-4: Internal TMDS
[ 24.350] (--) NVIDIA(GPU-0): DFP-4: 165.0 MHz maximum pixel clock
[ 24.350] (--) NVIDIA(GPU-0):
[ 24.350] (--) NVIDIA(GPU-0): DFP-5: disconnected
[ 24.350] (--) NVIDIA(GPU-0): DFP-5: Internal DisplayPort
[ 24.350] (--) NVIDIA(GPU-0): DFP-5: 2660.0 MHz maximum pixel clock
[ 24.350] (--) NVIDIA(GPU-0):
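A log this long is easier to read once it is condensed. The following is a minimal Python sketch, not part of the original post, that pulls three things out of an Xorg log: the PCI graphics devices, the connection state the NVIDIA driver reports for each display port, and the number of (WW)/(EE) lines. The function name `summarize`, the regular expressions, and the shortened sample lines are assumptions made for illustration. The sketch also assumes that the `(bus@domain:device:function)` numbers are decimal and that the `*` marks the device X treats as primary.

```python
import re
from collections import Counter

# Excerpt from the log above (PCI lines shortened), used only as sample input.
SAMPLE_LOG = """\
[ 23.743] (WW) Disabling Keyboard0
[ 23.747] (--) PCI: (0@0:2:0) 8086:3e98:1043:8694 rev 2, Mem @ 0xb3000000/16777216
[ 23.747] (--) PCI: (3@0:0:0) 10de:1e04:1043:8675 rev 161, Mem @ 0xb6000000/16777216
[ 23.747] (--) PCI:*(4@0:0:0) 10de:1e04:1043:8675 rev 161, Mem @ 0xb4000000/16777216
[ 23.806] (--) NVIDIA(GPU-0): DFP-0: disconnected
[ 23.806] (--) NVIDIA(GPU-0): DFP-1: disconnected
[ 23.806] (--) NVIDIA(0): No enabled display devices found; starting anyway because
[ 23.807] (WW) NVIDIA(0): Unable to get display device for DPI computation.
[ 23.835] (--) NVIDIA(GPU-1): DFP-0: disconnected
"""

# PCI vendor IDs seen in the log.
VENDORS = {"8086": "Intel", "10de": "NVIDIA"}

# Matches e.g. "PCI:*(4@0:0:0) 10de:1e04" -> primary flag, bus, domain, dev, func, IDs.
PCI_RE = re.compile(
    r"PCI:(\*| )\((\d+)@(\d+):(\d+):(\d+)\) ([0-9a-f]{4}):([0-9a-f]{4})"
)
# Matches e.g. "NVIDIA(GPU-0): DFP-1: disconnected".
DFP_RE = re.compile(r"NVIDIA\((GPU-\d+)\): (DFP-\d+): (connected|disconnected)")
# Matches the warning and error markers described in the log header.
MARK_RE = re.compile(r"\((WW|EE)\)")


def summarize(log_text):
    """Return (gpus, displays, marks) extracted from Xorg log text."""
    gpus = []          # one dict per PCI graphics device, in log order
    displays = {}      # (gpu, port) -> last reported connection state
    marks = Counter()  # counts of "WW" and "EE" lines
    for line in log_text.splitlines():
        m = PCI_RE.search(line)
        if m:
            primary, bus, _domain, dev, func, vendor, _device = m.groups()
            gpus.append({
                # Same bus:device:function form the driver prints, e.g. PCI:4:0:0.
                "bus_id": f"PCI:{int(bus)}:{int(dev)}:{int(func)}",
                "vendor": VENDORS.get(vendor, vendor),
                "primary": primary == "*",
            })
        m = DFP_RE.search(line)
        if m:
            gpu, port, state = m.groups()
            displays[(gpu, port)] = state
        m = MARK_RE.search(line)
        if m:
            marks[m.group(1)] += 1
    return gpus, displays, marks


if __name__ == "__main__":
    gpus, displays, marks = summarize(SAMPLE_LOG)
    for gpu in gpus:
        flag = " (primary)" if gpu["primary"] else ""
        print(f'{gpu["vendor"]} at {gpu["bus_id"]}{flag}')
    for (gpu, port), state in sorted(displays.items()):
        print(f"{gpu} {port}: {state}")
    print(f'warnings: {marks["WW"]}, errors: {marks["EE"]}')
```

To run it on the real file, pass the text of /var/log/Xorg.0.log to `summarize` instead of `SAMPLE_LOG`. On the log above, this kind of summary makes it easy to spot that X chose an NVIDIA card as primary while every port the driver reports is disconnected.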
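Answer 1
Unless you post the contents of /var/log/Xorg.0.log, nobody can really say much about your problem. The information provided is not enough to diagnose it.
From the general description, it matches this thread: https://devtalk.nvidia.com/default/topic/1003017/how-do-i-set-one-gpu-for-display-and-the-other-two-gpus-for-cuda-computing-/
However, that issue does not appear to have been resolved.
By the way, you can buy a used Tesla M40 12GB GPU on eBay for under $400, which is plenty for machine learning. Also, today's Intel CPU/motherboard combinations are poor value compared with AMD, especially in terms of I/O bandwidth. AM4 has PCIe 4.0, which offers tremendous I/O bandwidth.
You can buy a 16-core/32-thread Threadripper 1950X at 4 GHz for the same price as an 8-core/16-thread Intel CPU. By the way, these CPUs support OpenCL, but more importantly, ML tends to be a "scale-out" problem rather than a "scale-up" one. A TR workstation motherboard (or a high-end AM4 workstation motherboard) will give you the I/O expansion you need to add more GPUs as your needs grow.
It also buys you more room for memory expansion, up to 2 TiB.
With that many cores and clocks, you could also drop the dual-boot setup entirely and run a Win10 VM with a dedicated graphics card via VFIO, which gets you near-native performance. Level1Techs did a good video on this: https://www.youtube.com/watch?v=PLy1n7X2cAU
When you are not using the VM, the system can return that GPU to compute work.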
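Answer 2
Regenerating xorg.conf seems to have fixed the problem, and my desktop is back.
So:
sudo X -configure
sudo mv xorg.conf.new /etc/X11/xorg.conf
After rebooting, my desktop was back, at least when the monitor is connected to the motherboard's HDMI port (using Intel graphics). Before this, the desktop was blank after logging in through the GUI.
nvidia-smi also sees both NVIDIA GPUs.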
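Because the mv step above overwrites any existing /etc/X11/xorg.conf, it may be worth keeping a copy of the old file first so the change can be undone. The following is a minimal Python sketch of that idea, not something taken from the answer itself. The function name `install_with_backup` and the timestamped `.bak` naming are hypothetical, and the demo runs on temporary files rather than on the real config.

```python
import shutil
import tempfile
import time
from pathlib import Path


def install_with_backup(new_conf, target):
    """Move new_conf over target, keeping a timestamped copy of the old target.

    Returns the backup path, or None if target did not exist beforehand.
    """
    new_conf, target = Path(new_conf), Path(target)
    backup = None
    if target.exists():
        stamp = time.strftime("%Y%m%d-%H%M%S")
        # Hypothetical naming scheme, e.g. xorg.conf.20190930-012205.bak
        backup = target.with_name(f"{target.name}.{stamp}.bak")
        shutil.copy2(target, backup)  # copy2 also preserves timestamps/permissions
    shutil.move(str(new_conf), str(target))
    return backup


if __name__ == "__main__":
    # Demo on throwaway files; the real paths would be xorg.conf.new and
    # /etc/X11/xorg.conf, and writing there requires root privileges.
    with tempfile.TemporaryDirectory() as tmp:
        old = Path(tmp) / "xorg.conf"
        new = Path(tmp) / "xorg.conf.new"
        old.write_text("# old config\n")
        new.write_text("# regenerated config\n")
        saved = install_with_backup(new, old)
        print("backup written to", saved)
        print("active config now reads:", old.read_text().strip())
```

If the regenerated file makes things worse, copying the `.bak` file back over /etc/X11/xorg.conf from a text console restores the previous state.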