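I know I'm running the Tesla K80 outside its preferred environment, but I've seen ordinary consumers use the card successfully, so I tried it myself. https://blog.thomasjungblut.com/random/running-tesla-k80/
I followed that guide, using a slightly newer Ubuntu release (Ubuntu 22.04.2 LTS) and the driver from here, but the card's memory is not being assigned: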
/sbin/lspci -d "10de:*" -v -xxx
03:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
Subsystem: NVIDIA Corporation Device 106c
Flags: fast devsel, IRQ 16
Memory at <unassigned> (64-bit, prefetchable) [disabled]
Memory at <unassigned> (64-bit, prefetchable) [disabled]
Capabilities: [60] Power Management version 3
Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [78] Express Endpoint, MSI 00
Capabilities: [100] Virtual Channel
Capabilities: [128] Power Budgeting <?>
Capabilities: [420] Advanced Error Reporting
Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
Capabilities: [900] #19
Kernel modules: nouveau, nvidia_drm, nvidia
I think this is what is taking the memory away:
[ 0.193803] pnp 00:00: disabling [mem 0xfed40000-0xfed44fff] because it overlaps 0000:03:00.0 BAR 1 [mem 0x00000000-0x3ffffffff 64bit pref]
[ 0.193803] pnp 00:00: disabling [mem 0xfed40000-0xfed44fff disabled] because it overlaps 0000:04:00.0 BAR 1 [mem 0x00000000-0x3ffffffff 64bit pref]
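For scale, here is a minimal sketch of the arithmetic on the BAR 1 range shown in those two lines. The function name `bar_size_gib` is only illustrative, and the code assumes bash and the inclusive start/end addresses exactly as printed in dmesg.

```shell
#!/usr/bin/env bash
# Size of a PCI BAR in GiB, given the inclusive start and end addresses
# as they appear in the dmesg line (hex strings such as 0x3ffffffff).
bar_size_gib() {
    local start=$1 end=$2
    # Bash arithmetic understands the 0x prefix, so no conversion is needed.
    echo $(( (end - start + 1) / (1024 * 1024 * 1024) ))
}

# BAR 1 range reported for 0000:03:00.0 and 0000:04:00.0
bar_size_gib 0x00000000 0x3ffffffff
```

So each of the two GPUs on the card is asking for a 16 GiB prefetchable 64-bit window, and the range starting at 0x00000000 matches the `<unassigned>` entries in the lspci output.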
Another symptom of the problem is that my tty2 is flooded with these messages:
[ 606.073710] NVRM: The NVIDIA probe routine failed for 2 device(s).
[ 606.073711] NVRM: None of the NVIDIA devices were initialized.
[ 606.073942] nvidia-nvlink: Unregistered the Nvlink Core, major device number 511
[ 606.553951] nvidia-nvlink: Nvlink Core is being initialized, major device number 511
[ 606.553958] NVRM: request_mem_region failed for 0M @ 0x0. This can
NVRM: occur when a driver such as rivatv is loaded and claims
NVRM: ownership of the device's registers.
[ 606.554683] nvidia: probe of 0000:03:00.0 failed with error -1
So the only other graphics driver installed (nouveau) was not blacklisted, while the Intel i915 driver was blacklisted, and that leaves me with these errors:
[ 16.326014] NVRM: GPU 0000:03:00.0: RmInitAdapter failed! (0x22:0xffff:667)
[ 16.326053] NVRM: GPU 0000:03:00.0: rm_init_adapter failed, device minor number 0
[ 16.444268] NVRM: GPU 0000:04:00.0: RmInitAdapter failed! (0x22:0xffff:667)
[ 16.444304] NVRM: GPU 0000:04:00.0: rm_init_adapter failed, device minor number 1
[ 16.562258] NVRM: GPU 0000:04:00.0: RmInitAdapter failed! (0x22:0xffff:667)
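To double-check which of these modules are actually blacklisted, something like the following sketch could be used. The helper name `is_blacklisted` is hypothetical. It only looks for plain `blacklist <module>` lines, and it assumes bash, GNU grep, and the usual `/etc/modprobe.d` location. Other locations such as `/usr/lib/modprobe.d`, or options on the kernel command line, are not covered.

```shell
#!/usr/bin/env bash
# Report whether a kernel module has a "blacklist <module>" line in a
# modprobe.d-style directory.
# Usage: is_blacklisted MODULE [DIR]   (DIR defaults to /etc/modprobe.d)
is_blacklisted() {
    local module=$1 dir=${2:-/etc/modprobe.d}
    # -r: search all files in DIR, -q: quiet, -s: ignore missing/unreadable files
    grep -rqsE "^[[:space:]]*blacklist[[:space:]]+${module}[[:space:]]*$" "$dir"
}

for m in nouveau i915; do
    if is_blacklisted "$m"; then
        echo "$m: blacklisted"
    else
        echo "$m: not blacklisted"
    fi
done
```

Passing a second argument points the check at another directory, for example `is_blacklisted nouveau /usr/lib/modprobe.d`.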
Since the Nvidia Tesla K80 has no video output, I would prefer to keep the integrated graphics driver installed.
~> mokutil --sb-state
SecureBoot disabled
Platform is in Setup Mode
Here is more information about my system, from nvidia-bug-report.sh