request_mem_region fails because of overlapping memory

I know the Tesla K80 I am running is not in its preferred environment, but I have seen ordinary consumers use this card successfully, so I tried it myself: https://blog.thomasjungblut.com/random/running-tesla-k80/

I followed along, using a slightly newer Ubuntu release (Ubuntu 22.04.2 LTS) and the driver from here, but the card's memory is not being assigned:

/sbin/lspci -d "10de:*" -v -xxx

03:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
    Subsystem: NVIDIA Corporation Device 106c
    Flags: fast devsel, IRQ 16
    Memory at <unassigned> (64-bit, prefetchable) [disabled]
    Memory at <unassigned> (64-bit, prefetchable) [disabled]
    Capabilities: [60] Power Management version 3
    Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
    Capabilities: [78] Express Endpoint, MSI 00
    Capabilities: [100] Virtual Channel
    Capabilities: [128] Power Budgeting <?>
    Capabilities: [420] Advanced Error Reporting
    Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
    Capabilities: [900] #19
    Kernel modules: nouveau, nvidia_drm, nvidia

I think this is what is taking the memory away:

[    0.193803] pnp 00:00: disabling [mem 0xfed40000-0xfed44fff] because it overlaps 0000:03:00.0 BAR 1 [mem 0x00000000-0x3ffffffff 64bit pref]
[    0.193803] pnp 00:00: disabling [mem 0xfed40000-0xfed44fff disabled] because it overlaps 0000:04:00.0 BAR 1 [mem 0x00000000-0x3ffffffff 64bit pref]
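To make the numbers in those two messages easier to read, here is a minimal Python sketch that pulls the BAR range out of such a line and works out its size. The function name `bar_info` and the regular expression are my own for illustration, and treating a start address of zero as "not assigned yet" is an assumption based on the `<unassigned>` entries in the lspci output above.

```python
import re

# Matches the PCI device and BAR range in kernel messages such as
# "... overlaps 0000:03:00.0 BAR 1 [mem 0x00000000-0x3ffffffff 64bit pref]"
BAR_RE = re.compile(
    r"(?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) BAR (?P<bar>\d+) "
    r"\[mem 0x(?P<start>[0-9a-f]+)-0x(?P<end>[0-9a-f]+)"
)


def bar_info(line):
    """Return device, BAR index, start address and size for one dmesg line.

    Returns None when the line does not contain a BAR range.
    """
    m = BAR_RE.search(line)
    if m is None:
        return None
    start = int(m.group("start"), 16)
    end = int(m.group("end"), 16)
    size = end - start + 1  # the range is inclusive on both ends
    return {
        "device": m.group("dev"),
        "bar": int(m.group("bar")),
        "start": start,
        "size_bytes": size,
        "size_gib": size / 2**30,
        # Assumption: a start address of 0 means no address was assigned.
        "starts_at_zero": start == 0,
    }


if __name__ == "__main__":
    sample = (
        "[    0.193803] pnp 00:00: disabling [mem 0xfed40000-0xfed44fff] "
        "because it overlaps 0000:03:00.0 BAR 1 "
        "[mem 0x00000000-0x3ffffffff 64bit pref]"
    )
    print(bar_info(sample))
```

For the line shown, the range 0x00000000-0x3ffffffff works out to 16 GiB starting at address 0, for each of the two devices.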

Another symptom of the problem is that my tty2 is flooded with these messages:

[  606.073710] NVRM: The NVIDIA probe routine failed for 2 device(s).
[  606.073711] NVRM: None of the NVIDIA devices were initialized.
[  606.073942] nvidia-nvlink: Unregistered the Nvlink Core, major device number 511
[  606.553951] nvidia-nvlink: Nvlink Core is being initialized, major device number 511
[  606.553958] NVRM: request_mem_region failed for 0M @ 0x0. This can
               NVRM: occur when a driver such as rivatv is loaded and claims
               NVRM: ownership of the device's registers.
[  606.554683] nvidia: probe of 0000:03:00.0 failed with error -1

So the only other graphics driver installed (nouveau) has not been blacklisted, while the Intel i915 driver has been blacklisted, which gave me these errors:

[   16.326014] NVRM: GPU 0000:03:00.0: RmInitAdapter failed! (0x22:0xffff:667)
[   16.326053] NVRM: GPU 0000:03:00.0: rm_init_adapter failed, device minor number 0
[   16.444268] NVRM: GPU 0000:04:00.0: RmInitAdapter failed! (0x22:0xffff:667)
[   16.444304] NVRM: GPU 0000:04:00.0: rm_init_adapter failed, device minor number 1
[   16.562258] NVRM: GPU 0000:04:00.0: RmInitAdapter failed! (0x22:0xffff:667)
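Since it is easy to lose track of which modules are actually blacklisted, here is a small Python sketch for listing them. The function name `blacklisted_modules` is hypothetical, and the sketch only looks at `blacklist <module>` lines in /etc/modprobe.d/*.conf; it does not cover `install` overrides or a `modprobe.blacklist=` kernel parameter.

```python
import glob


def blacklisted_modules(conf_text):
    """Collect module names from 'blacklist <module>' lines of modprobe.d text."""
    modules = set()
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Empty lines and lines starting with '#' are ignored.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) == 2 and parts[0] == "blacklist":
            modules.add(parts[1])
    return modules


if __name__ == "__main__":
    found = set()
    for path in sorted(glob.glob("/etc/modprobe.d/*.conf")):
        with open(path, encoding="utf-8", errors="replace") as fh:
            found |= blacklisted_modules(fh.read())
    # Report the two drivers discussed in this post.
    for name in ("nouveau", "i915"):
        state = "blacklisted" if name in found else "not blacklisted"
        print(f"{name}: {state}")
```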

Since the Nvidia Tesla K80 has no video output, I would prefer to keep the integrated graphics driver installed.

~> mokutil --sb-state
SecureBoot disabled
Platform is in Setup Mode

Here is more information about my system, from nvidia-bug-report.sh
