QEMU/KVM virt-manager 无法通过 GPU 直通:组不可行

QEMU/KVM virt-manager 无法通过 GPU 直通:组不可行

我的电脑规格:

  • 全新安装 Ubuntu 22.04.2 LTS
  • AMD 锐龙 5 5600x
  • 32GB 内存
  • 2TB NVME
  • Radeon rx 6500 xt 和 Radeon rx 5700
  • MSI B450 Pro Carbon 主板

尝试过的指南:

已进行手动更改:

  • /etc/default/grub 文件,变量 GRUB_CMDLINE_LINUX_DEFAULT
  • /etc/modprobe.d/vfio.conf 创建和填充

因此,我尝试通过 KVM 和 virt-manager 在 Windows 11 虚拟机上主要运行 Ubuntu,并将我的 Radeon rx 6500 xt 专用于 Windows VM。错误总是大致相同(此实例是通过将 /etc/default/grub 行更改为 生成的GRUB_CMDLINE_LINUX_DEFAULT="quiet splash amd_iommu=on kvm.ignore_msrs=1 vfio-pci.ids=1002:743f,1002:ab28"):

“无法完成安装:'内部错误:qemu 意外关闭了监视器:2023-06-16T04:57:33.859580Z qemu-system-x86_64:-device vfio-pci,host=0000:27:00.0:组 15 不可行

请确保 iommu_group 内的所有设备都绑定到它们的 vfio 总线驱动程序。'”

以下是详细信息:

Unable to complete install: 'internal error: qemu unexpectedly closed the monitor: 2023-06-16T04:57:33.859580Z qemu-system-x86_64: -device vfio-pci,host=0000:27:00.0,id=hostdev0,bus=pci.4,addr=0x0: vfio 0000:27:00.0: group 15 is not viable
Please ensure all devices within the iommu_group are bound to their vfio bus driver.'

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 72, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/createvm.py", line 2008, in _do_async_install
    installer.start_install(guest, meter=meter)
  File "/usr/share/virt-manager/virtinst/install/installer.py", line 695, in start_install
    domain = self._create_guest(
  File "/usr/share/virt-manager/virtinst/install/installer.py", line 637, in _create_guest
    domain = self.conn.createXML(initial_xml or final_xml, 0)
  File "/usr/lib/python3/dist-packages/libvirt.py", line 4400, in createXML
    raise libvirtError('virDomainCreateXML() failed')
libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: 2023-06-16T04:57:33.859580Z qemu-system-x86_64: -device vfio-pci,host=0000:27:00.0,id=hostdev0,bus=pci.4,addr=0x0: vfio 0000:27:00.0: group 15 is not viable
Please ensure all devices within the iommu_group are bound to their vfio bus driver.

为了提供相关的输出,首先这是我在 Hueber 指南中找到的按 IOMMU 组列出设备的脚本:

#!/bin/bash
shopt -s nullglob
for d in /sys/kernel/iommu_groups/*/devices/*; do
    n=${d#*/iommu_groups/*}; n=${n%%/*}
    printf 'IOMMU Group %s ' "$n"
    lspci -nns "${d##*/}"
done;

输出结果如下:

IOMMU Group 0 00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU Group 10 00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU Group 11 00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B] [1022:1484]
IOMMU Group 12 00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 61)
IOMMU Group 12 00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU Group 13 00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 0 [1022:1440]
IOMMU Group 13 00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 1 [1022:1441]
IOMMU Group 13 00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 2 [1022:1442]
IOMMU Group 13 00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 3 [1022:1443]
IOMMU Group 13 00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 4 [1022:1444]
IOMMU Group 13 00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 5 [1022:1445]
IOMMU Group 13 00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 6 [1022:1446]
IOMMU Group 13 00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 7 [1022:1447]
IOMMU Group 14 01:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO [144d:a80a]
IOMMU Group 15 03:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller [1022:43d5] (rev 01)
IOMMU Group 15 03:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller [1022:43c8] (rev 01)
IOMMU Group 15 03:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Bridge [1022:43c6] (rev 01)
IOMMU Group 15 20:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
IOMMU Group 15 20:01.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
IOMMU Group 15 20:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
IOMMU Group 15 21:00.0 Network controller [0280]: Intel Corporation Wireless-AC 9260 [8086:2526] (rev 29)
IOMMU Group 15 22:00.0 Ethernet controller [0200]: Intel Corporation I211 Gigabit Network Connection [8086:1539] (rev 03)
IOMMU Group 15 25:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch [1002:1478] (rev c1)
IOMMU Group 15 26:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch [1002:1479]
IOMMU Group 15 27:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:743f] (rev c1)
IOMMU Group 15 27:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 HDMI Audio [Radeon RX 6800/6800 XT / 6900 XT] [1002:ab28]
IOMMU Group 16 28:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch [1002:1478] (rev c4)
IOMMU Group 17 29:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch [1002:1479]
IOMMU Group 18 2a:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT] [1002:731f] (rev c4)
IOMMU Group 19 2a:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 HDMI Audio [1002:ab38]
IOMMU Group 1 00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge [1022:1483]
IOMMU Group 20 2b:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Function [1022:148a]
IOMMU Group 21 2c:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP [1022:1485]
IOMMU Group 22 2c:00.1 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Cryptographic Coprocessor PSPCPP [1022:1486]
IOMMU Group 23 2c:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller [1022:149c]
IOMMU Group 24 2c:00.4 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse HD Audio Controller [1022:1487]
IOMMU Group 2 00:01.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge [1022:1483]
IOMMU Group 3 00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU Group 4 00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU Group 5 00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge [1022:1483]
IOMMU Group 6 00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU Group 7 00:05.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU Group 8 00:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU Group 9 00:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B] [1022:1484]

至于设备正在运行的内核驱动程序,正在运行sudo -i && lspci -k

00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Root Complex
    Subsystem: Micro-Star International Co., Ltd. [MSI] Starship/Matisse Root Complex
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Starship/Matisse IOMMU
    Subsystem: Micro-Star International Co., Ltd. [MSI] Starship/Matisse IOMMU
00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
    Kernel driver in use: pcieport
00:01.3 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
    Kernel driver in use: pcieport
00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
    Kernel driver in use: pcieport
00:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:05.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:07.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:07.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
    Kernel driver in use: pcieport
00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
    Kernel driver in use: pcieport
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 61)
    Subsystem: Micro-Star International Co., Ltd. [MSI] FCH SMBus Controller
    Kernel driver in use: piix4_smbus
    Kernel modules: i2c_piix4, sp5100_tco
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
    Subsystem: Micro-Star International Co., Ltd. [MSI] FCH LPC Bridge
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 3
    Kernel driver in use: k10temp
    Kernel modules: k10temp
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 5
00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 6
00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 7
01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO
    Subsystem: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO
    Kernel driver in use: nvme
    Kernel modules: nvme
03:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller (rev 01)
    Subsystem: ASMedia Technology Inc. 400 Series Chipset USB 3.1 XHCI Controller
    Kernel driver in use: xhci_hcd
    Kernel modules: xhci_pci
03:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller (rev 01)
    Subsystem: ASMedia Technology Inc. 400 Series Chipset SATA Controller
    Kernel driver in use: ahci
    Kernel modules: ahci
03:00.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Bridge (rev 01)
    Kernel driver in use: pcieport
20:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port (rev 01)
    Kernel driver in use: pcieport
20:01.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port (rev 01)
    Kernel driver in use: pcieport
20:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port (rev 01)
    Kernel driver in use: pcieport
21:00.0 Network controller: Intel Corporation Wireless-AC 9260 (rev 29)
    DeviceName: Broadcom 5762
    Subsystem: Intel Corporation Wireless-AC 9260
    Kernel driver in use: iwlwifi
    Kernel modules: iwlwifi
22:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)
    Subsystem: Micro-Star International Co., Ltd. [MSI] I211 Gigabit Network Connection
    Kernel driver in use: igb
    Kernel modules: igb
25:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch (rev c1)
    Kernel driver in use: pcieport
26:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch
    Kernel driver in use: pcieport
27:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 743f (rev c1)
    Subsystem: Micro-Star International Co., Ltd. [MSI] Device 5080
    Kernel driver in use: vfio-pci
    Kernel modules: amdgpu
27:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 HDMI Audio [Radeon RX 6800/6800 XT / 6900 XT]
    Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 HDMI Audio [Radeon RX 6800/6800 XT / 6900 XT]
    Kernel driver in use: vfio-pci
    Kernel modules: snd_hda_intel
28:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch (rev c4)
    Kernel driver in use: pcieport
29:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch
    Kernel driver in use: pcieport
2a:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT] (rev c4)
    Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Reference RX 5700 XT
    Kernel driver in use: amdgpu
    Kernel modules: amdgpu
2a:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 HDMI Audio
    Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 HDMI Audio
    Kernel driver in use: snd_hda_intel
    Kernel modules: snd_hda_intel
2b:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Function
    Subsystem: Micro-Star International Co., Ltd. [MSI] Starship/Matisse PCIe Dummy Function
2c:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP
    Subsystem: Micro-Star International Co., Ltd. [MSI] Starship/Matisse Reserved SPP
2c:00.1 Encryption controller: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Cryptographic Coprocessor PSPCPP
    Subsystem: Micro-Star International Co., Ltd. [MSI] Starship/Matisse Cryptographic Coprocessor PSPCPP
    Kernel driver in use: ccp
    Kernel modules: ccp
2c:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller
    Subsystem: Micro-Star International Co., Ltd. [MSI] Matisse USB 3.0 Host Controller
    Kernel driver in use: xhci_hcd
    Kernel modules: xhci_pci
2c:00.4 Audio device: Advanced Micro Devices, Inc. [AMD] Starship/Matisse HD Audio Controller
    Subsystem: Micro-Star International Co., Ltd. [MSI] Starship/Matisse HD Audio Controller
    Kernel driver in use: snd_hda_intel
    Kernel modules: snd_hda_intel

最后,AMD-Vi 状态来自dmesg |grep AMD-Vi(我确保按照 Mental Outlaw 在 BIOS 中启用所有内容):

[    0.632427] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.634569] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.634571] AMD-Vi: Extended features (0x58f77ef22294a5a): PPR NX GT IA PC GA_vAPIC
[    0.634575] AMD-Vi: Interrupt remapping enabled
[    4.653384] AMD-Vi: AMD IOMMUv2 loaded and initialized

我要说的是,我尝试过通过所有 IOMMU 组 15,但重新启动后它禁用了我的键盘和 LAN 适配器以及我的大多数其他 USB 外围设备,所以我立即恢复了尝试。

我希望这足够了,我愿意根据要求提供更多信息。但是我能做什么来解决这个问题呢?我需要大多数其他第 15 组设备才能让我的主机可用,并且一些轻量级浏览似乎无法将 GPU 强制放入新的空 IOMMU 组中。

一如往常,我感谢一切帮助!

相关内容