NVIDIA: A GPU exception occurred during X server initialization

My GPU is a GTX 870M. I did a fresh install of Ubuntu 18.04. All I did was:

sudo apt-get update
sudo apt-get upgrade
sudo ubuntu-drivers autoinstall
nvidia-xconfig
reboot

This installed the nvidia-390 driver. Now, whenever I try to start the X server with startx, it fails. I can still use Wayland. Here is what I tried (in recovery mode):

startx

Output:

X.Org X Server 1.20.1
X Protocol Version 11, Revision 0
Build Operating System: Linux 4.4.0-140-generic x86_64 Ubuntu
Current Operating System: Linux <censored>-PC 4.18.0-22-generic #23~18.04.1-Ubuntu SMP Thu Jun 6 08:37:25 UTC 2019 x86_64
Kernel command line: BOOT_IMAGE=/boot/vmlinuz-4.18.0-22-generic root=UUID=0d1d9304-4cd6-41f6-80b2-3562578a252e ro recovery nomodeset
Build Date: 27 November 2018  05:27:12PM
xorg-server-hwe-18.04 2:1.20.1-3ubuntu2.1~18.04.1 (For technical support please see http://www.ubuntu.com/support) 
Current version of pixman: 0.34.0
    Before reporting problems, check http://wiki.x.org
    to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
    (++) from command line, (!!) notice, (II) informational,
    (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Sat Jun 22 13:47:29 2019
(==) Using config file: "/etc/X11/xorg.conf"
(==) Using system config directory "/usr/share/X11/xorg.conf.d"
(EE) 
Fatal server error:
(EE) NVIDIA: A GPU exception occurred during X server initialization(EE) 
(EE) 
Please consult the The X.Org Foundation support 
     at http://wiki.x.org
 for help. 
(EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
(EE) 
(EE) Server terminated with error (1). Closing log file.
xinit: giving up
xinit: unable to connect to X server: Connection refused
xinit: server error

/var/log/Xorg.0.log: https://pastebin.com/ygxRKPpg

Two things in that log caught my attention:

[   119.994] (II) NVIDIA(0): Virtual screen size determined to be 640 x 480
[   119.994] (WW) NVIDIA(0): Unable to get display device for DPI computation.

[   119.994] (--) NVIDIA(0): Memory: 3145728 kBytes
[   119.994] (II) NVIDIA: Using 6144.00 MB of virtual memory for indirect memory

It looks as though my display device is not being detected correctly, and/or the X server is trying to use too much memory?
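
For reference, lines like these can be pulled out of a long X log by filtering on the (WW) and (EE) markers listed in the startx output. This is only a minimal sketch: the function name xorg_problems is made up for illustration, and the sample input is copied from the excerpt above.

```shell
#!/bin/sh
# Print only warning (WW) and error (EE) lines from an X server log.
# Usage: xorg_problems [logfile]   (reads stdin when no file is given)
# Note: xorg_problems is a hypothetical helper name, not an existing tool.
xorg_problems() {
    grep -E '\((WW|EE)\)' "$@"
}

# Example with two lines taken from the log excerpt above;
# only the (WW) line is printed.
printf '%s\n' \
    '[   119.994] (II) NVIDIA(0): Virtual screen size determined to be 640 x 480' \
    '[   119.994] (WW) NVIDIA(0): Unable to get display device for DPI computation.' \
    | xorg_problems
```

On a real system the same function would be called as xorg_problems /var/log/Xorg.0.log.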

dmesg output: https://pastebin.com/fcYMPrUB

Relevant part:

[  120.275346] NVRM: GPU at PCI:0000:01:00: GPU-c588f20e-6b26-3352-5b81-666db3c970a2
[  120.275348] NVRM: Xid (PCI:0000:01:00): 44, Ch 00000000, engmask 00000101, intr 10000000
[  120.793329] NVRM: Xid (PCI:0000:01:00): 31, Ch 00000008, engmask 00000111, intr 10000000

I looked up what these Xid codes mean: https://docs.nvidia.com/deploy/xid-errors/index.html

31 GPU memory page fault

44 Graphics Engine fault during context switch
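
The lookup above can be sketched as a small script that extracts the Xid code from each NVRM line and attaches the two descriptions quoted here. The function names xid_codes and describe_xid are hypothetical, only codes 31 and 44 are covered, and the sample input is copied from the dmesg excerpt.

```shell
#!/bin/sh
# Extract the numeric Xid code from each "NVRM: Xid" kernel log line on stdin.
# (xid_codes and describe_xid are illustrative names, not NVIDIA tools.)
xid_codes() {
    sed -n 's/.*NVRM: Xid ([^)]*): \([0-9][0-9]*\),.*/\1/p'
}

# Attach the two descriptions quoted above; any other code is left unexplained.
describe_xid() {
    case "$1" in
        31) echo "31: GPU memory page fault" ;;
        44) echo "44: Graphics Engine fault during context switch" ;;
        *)  echo "$1: not covered here, see the NVIDIA Xid documentation" ;;
    esac
}

# Example on the two Xid lines from the dmesg excerpt. On a live system the
# input would come from dmesg (which may require root) instead of printf.
printf '%s\n' \
    '[  120.275348] NVRM: Xid (PCI:0000:01:00): 44, Ch 00000000, engmask 00000101, intr 10000000' \
    '[  120.793329] NVRM: Xid (PCI:0000:01:00): 31, Ch 00000008, engmask 00000111, intr 10000000' \
    | xid_codes | while read -r code; do describe_xid "$code"; done
```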

nvidia-smi output:

Sat Jun 22 14:23:52 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.116                Driver Version: 390.116                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 870M    Off  | 00000000:01:00.0 N/A |                  N/A |
| N/A   83C    P0    N/A /  N/A |      0MiB /  3018MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
+-----------------------------------------------------------------------------+

Any help would be greatly appreciated. Thanks.

Answer 1

Your driver does not seem to be able to determine the monitor resolution, so you may need to set it manually. Have you tried this?

How do I set the correct monitor resolution with the Nvidia driver for a monitor that does not send EDID?

Answer 2

Here is what my dmesg looks like:

$ dmesg | grep -i nvidia
[    1.517472] nvidia: loading out-of-tree module taints kernel.
[    1.517477] nvidia: module license 'NVIDIA' taints kernel.
[    1.520410] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[    1.524609] nvidia-nvlink: Nvlink Core is being initialized, major device number 242
[    1.524802] nvidia 0000:01:00.0: enabling device (0006 -> 0007)
[    1.524981] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  384.130  Wed Mar 21 03:37:26 PDT 2018 (using threaded interrupts)
[    1.530574] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  384.130  Wed Mar 21 02:59:49 PDT 2018
[    1.531818] [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
[    1.531820] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 1
[    4.318800] nvidia-uvm: Loaded the UVM driver in 8 mode, major device number 240
[    4.864567] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input9
[    7.517965] nvidia-modeset: Allocated GPU:0 (GPU-30fab9bc-fe6f-ec05-e8e6-c151a1a96121) @ PCI:0000:01:00.0

Your dmesg has two extra lines:

[   16.317773] nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[   16.504557] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input13

Your dmesg is missing one line:

[    7.517965] nvidia-modeset: Allocated GPU:0 (GPU-30fab9bc-fe6f-ec05-e8e6-c151a1a96121) @ PCI:0000:01:00.0
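
A comparison like this can be made less error-prone by stripping the timestamps before diffing the two outputs. The following is a rough sketch only: strip_ts and only_in_first are made-up helper names, and the file names in the usage comment are placeholders. Lines containing driver versions or GPU UUIDs will still show up as differences between two machines.

```shell
#!/bin/sh
# Strip the leading "[   1.234567] " timestamp from dmesg lines on stdin.
# (strip_ts and only_in_first are hypothetical helpers for illustration.)
strip_ts() {
    sed 's/^\[ *[0-9][0-9.]*\] //'
}

# Print lines that appear in the first file but not in the second,
# ignoring timestamps. Usage: only_in_first a.txt b.txt
only_in_first() {
    tmp=$(mktemp) || return 1
    strip_ts < "$2" > "$tmp"
    # -F fixed strings, -x whole-line match, -v invert: keep unmatched lines
    strip_ts < "$1" | grep -Fxv -f "$tmp" || true
    rm -f "$tmp"
}

# Typical use (file names are placeholders):
#   dmesg | grep -i nvidia > mine.txt      # run on each machine
#   only_in_first theirs.txt mine.txt      # extra lines on the other system
#   only_in_first mine.txt theirs.txt      # lines the other system lacks
```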

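My system is a Skylake 6700HQ with an nVidia GTX 970M, so it is very close to yours. I have used the 384.130 driver since day one with great success and have never changed it. My only quirk is that Windows turns on the nVidia card's audio but Linux does not, so I had to apply a patch called nvhda to send HDMI sound to my TV.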
