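I followed https://mathiashueber.com/windows-virtual-machine-gpu-passthrough-ubuntu/. However, there is one thing I did not follow: I kept nouveau instead of the official driver, because when I did what the guide says, I only got a black screen after rebooting. I also want to use nouveau on the host rather than a proprietary and possibly insecure driver.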
I have a Ryzen 7 2700X on a Gigabyte B450M motherboard. I have a GTX 1660 that I want in the virtual machine and a GT 730 that I want to use on the host.
AMD-Vi is working:
lz@z:~$ dmesg |grep AMD-Vi
[ 0.327637] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[ 0.330500] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 0.330501] pci 0000:00:00.2: AMD-Vi: Extended features (0xf77ef22294ada):
[ 0.330504] AMD-Vi: Interrupt remapping enabled
[ 0.330505] AMD-Vi: Virtual APIC enabled
[ 0.330572] AMD-Vi: Lazy IO/TLB flushing enabled
Here are my IOMMU groups:
IOMMU Group 0 00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 10 00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
IOMMU Group 11 00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 59)
IOMMU Group 11 00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU Group 12 00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0 [1022:1460]
IOMMU Group 12 00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1 [1022:1461]
IOMMU Group 12 00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2 [1022:1462]
IOMMU Group 12 00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3 [1022:1463]
IOMMU Group 12 00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4 [1022:1464]
IOMMU Group 12 00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5 [1022:1465]
IOMMU Group 12 00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6 [1022:1466]
IOMMU Group 12 00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7 [1022:1467]
IOMMU Group 13 01:00.0 Non-Volatile memory controller [0108]: Kingston Technology Company, Inc. Device [2646:2263] (rev 03)
IOMMU Group 14 02:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller [1022:43d5] (rev 01)
IOMMU Group 14 02:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller [1022:43c8] (rev 01)
IOMMU Group 14 02:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Bridge [1022:43c6] (rev 01)
IOMMU Group 14 03:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
IOMMU Group 14 03:01.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
IOMMU Group 14 03:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
IOMMU Group 14 05:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 0c)
IOMMU Group 14 06:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF108 [GeForce GT 730] [10de:0f02] (rev a1)
IOMMU Group 14 06:00.1 Audio device [0403]: NVIDIA Corporation GF108 High Definition Audio Controller [10de:0bea] (rev a1)
>>>>>>>>>>>>>>> IOMMU Group 15 07:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU116 [GeForce GTX 1660] [10de:2184] (rev a1)
>>>>>>>>>>>>>>> IOMMU Group 15 07:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:1aeb] (rev a1)
>>>>>>>>>>>>>>> IOMMU Group 15 07:00.2 USB controller [0c03]: NVIDIA Corporation Device [10de:1aec] (rev a1)
>>>>>>>>>>>>>>> IOMMU Group 15 07:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device [10de:1aed] (rev a1)
IOMMU Group 16 08:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Raven/Raven2 PCIe Dummy Function [1022:145a]
IOMMU Group 17 08:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor [1022:1456]
IOMMU Group 18 08:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Zeppelin USB 3.0 Host controller [1022:145f]
IOMMU Group 19 09:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Renoir PCIe Dummy Function [1022:1455]
IOMMU Group 1 00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU Group 20 09:00.2 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
IOMMU Group 21 09:00.3 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) HD Audio Controller [1022:1457]
IOMMU Group 2 00:01.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU Group 3 00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 4 00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 5 00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU Group 6 00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 7 00:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 8 00:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
IOMMU Group 9 00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
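You can see that my GTX 1660 is in group 15, together with other things that I don't care about and that can go into the virtual machine as well, for example the USB controller.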
So I have 10de:2184 (GTX 1660) and 10de:1aeb (GTX audio). Do I need to save the IDs of the other things in group 15? I am going to try to use all of them, so I also saved 10de:1aec (USB) and 10de:1aed (serial bus).
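A minimal sketch of how the ID list for a whole group could be collected from sysfs instead of being copied by hand. It assumes the usual layout /sys/kernel/iommu_groups/<group>/devices/<address>/{vendor,device}; the function name group_ids and the optional second argument (an alternative sysfs root for dry runs) are made up for this illustration.

```shell
#!/bin/sh
# Print a comma-separated vendor:device list for every PCI function in one IOMMU group.
# $1 = group number, $2 = sysfs root (hypothetical override, defaults to /sys).
group_ids() {
    group="$1"
    root="${2:-/sys}"
    ids=""
    for dev in "$root/kernel/iommu_groups/$group/devices"/*; do
        [ -e "$dev/vendor" ] || continue
        # sysfs stores the IDs as 0x10de and 0x2184; strip the 0x prefix
        vendor=$(cat "$dev/vendor"); vendor=${vendor#0x}
        device=$(cat "$dev/device"); device=${device#0x}
        ids="${ids:+$ids,}$vendor:$device"
    done
    printf '%s\n' "$ids"
}
```

Something like `echo "options vfio-pci ids=$(group_ids 15)"` would then produce a line in the format used in /etc/modprobe.d/vfio.conf.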
lz@z:~$ cat /etc/initramfs-tools/modules
# List of modules that you want to include in your initramfs.
# They will be loaded at boot time in the order below.
#
# Syntax: module_name [args ...]
#
# You must run update-initramfs(8) to effect this change.
#
# Examples:
#
# raid1
# sd_mod
vfio
vfio_iommu_type1
vfio_virqfd
vfio_pci ids=10de:2184,10de:1aeb,10de:1aec,10de:1aed
and
lz@z:~$ cat /etc/modules
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
vfio
vfio_iommu_type1
vfio_pci ids=10de:2184,10de:1aeb,10de:1aec,10de:1aed
and
lz@z:~$ cat /etc/modprobe.d/vfio.conf
options vfio-pci ids=10de:2184,10de:1aeb,10de:1aec,10de:1aed
and
lz@z:~$ cat /etc/modprobe.d/kvm.conf
options kvm ignore_msrs=1
Now look at my lspci output after rebooting:
lz@z:~$ lspci -nnv
00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Root Complex [1022:1450]
Subsystem: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Root Complex [1022:1450]
Flags: fast devsel
00:00.2 IOMMU [0806]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) I/O Memory Management Unit [1022:1451]
Subsystem: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) I/O Memory Management Unit [1022:1451]
Flags: fast devsel, IRQ 25
Capabilities: <access denied>
00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
Flags: fast devsel
00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453] (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 26
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
I/O behind bridge: None
Memory behind bridge: f7600000-f76fffff [size=1M]
Prefetchable memory behind bridge: None
Capabilities: <access denied>
Kernel driver in use: pcieport
00:01.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453] (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 27
Bus: primary=00, secondary=02, subordinate=06, sec-latency=0
I/O behind bridge: 0000d000-0000efff [size=8K]
Memory behind bridge: f4000000-f53fffff [size=20M]
Prefetchable memory behind bridge: 00000000e8000000-00000000f21fffff [size=162M]
Capabilities: <access denied>
Kernel driver in use: pcieport
00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
Flags: fast devsel
00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
Flags: fast devsel
00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453] (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 28
Bus: primary=00, secondary=07, subordinate=07, sec-latency=0
I/O behind bridge: 0000f000-0000ffff [size=4K]
Memory behind bridge: f6000000-f70fffff [size=17M]
Prefetchable memory behind bridge: 00000000d0000000-00000000e20fffff [size=289M]
Capabilities: <access denied>
Kernel driver in use: pcieport
00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
Flags: fast devsel
00:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
Flags: fast devsel
00:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454] (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 29
Bus: primary=00, secondary=08, subordinate=08, sec-latency=0
I/O behind bridge: None
Memory behind bridge: f7200000-f74fffff [size=3M]
Prefetchable memory behind bridge: None
Capabilities: <access denied>
Kernel driver in use: pcieport
00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
Flags: fast devsel
00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454] (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 31
Bus: primary=00, secondary=09, subordinate=09, sec-latency=0
I/O behind bridge: None
Memory behind bridge: f7500000-f75fffff [size=1M]
Prefetchable memory behind bridge: None
Capabilities: <access denied>
Kernel driver in use: pcieport
00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 59)
Subsystem: Gigabyte Technology Co., Ltd FCH SMBus Controller [1458:5001]
Flags: 66MHz, medium devsel
Kernel modules: i2c_piix4, sp5100_tco
00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
Subsystem: Gigabyte Technology Co., Ltd FCH LPC Bridge [1458:5001]
Flags: bus master, 66MHz, medium devsel, latency 0
00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0 [1022:1460]
Flags: fast devsel
00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1 [1022:1461]
Flags: fast devsel
00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2 [1022:1462]
Flags: fast devsel
00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3 [1022:1463]
Flags: fast devsel
Kernel driver in use: k10temp
Kernel modules: k10temp
00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4 [1022:1464]
Flags: fast devsel
00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5 [1022:1465]
Flags: fast devsel
00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6 [1022:1466]
Flags: fast devsel
00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7 [1022:1467]
Flags: fast devsel
01:00.0 Non-Volatile memory controller [0108]: Kingston Technology Company, Inc. Device [2646:2263] (rev 03) (prog-if 02 [NVM Express])
Subsystem: Kingston Technology Company, Inc. Device [2646:2263]
Flags: bus master, fast devsel, latency 0, IRQ 60, NUMA node 0
Memory at f7600000 (64-bit, non-prefetchable) [size=16K]
Capabilities: <access denied>
Kernel driver in use: nvme
Kernel modules: nvme
02:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller [1022:43d5] (rev 01) (prog-if 30 [XHCI])
Subsystem: ASMedia Technology Inc. 400 Series Chipset USB 3.1 XHCI Controller [1b21:1142]
Flags: bus master, fast devsel, latency 0, IRQ 30
Memory at f53a0000 (64-bit, non-prefetchable) [size=32K]
Capabilities: <access denied>
Kernel driver in use: xhci_hcd
02:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller [1022:43c8] (rev 01) (prog-if 01 [AHCI 1.0])
Subsystem: ASMedia Technology Inc. 400 Series Chipset SATA Controller [1b21:1062]
Flags: bus master, fast devsel, latency 0, IRQ 59
Memory at f5380000 (32-bit, non-prefetchable) [size=128K]
Expansion ROM at f5300000 [disabled] [size=512K]
Capabilities: <access denied>
Kernel driver in use: ahci
Kernel modules: ahci
02:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Bridge [1022:43c6] (rev 01) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 33
Bus: primary=02, secondary=03, subordinate=06, sec-latency=0
I/O behind bridge: 0000d000-0000efff [size=8K]
Memory behind bridge: f4000000-f52fffff [size=19M]
Prefetchable memory behind bridge: 00000000e8000000-00000000f21fffff [size=162M]
Capabilities: <access denied>
Kernel driver in use: pcieport
03:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01) (prog-if 00 [Normal decode])
DeviceName: Broadcom 5762
Flags: bus master, fast devsel, latency 0, IRQ 34
Bus: primary=03, secondary=04, subordinate=04, sec-latency=0
I/O behind bridge: None
Memory behind bridge: None
Prefetchable memory behind bridge: None
Capabilities: <access denied>
Kernel driver in use: pcieport
03:01.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 36
Bus: primary=03, secondary=05, subordinate=05, sec-latency=0
I/O behind bridge: 0000e000-0000efff [size=4K]
Memory behind bridge: f5200000-f52fffff [size=1M]
Prefetchable memory behind bridge: 00000000f2100000-00000000f21fffff [size=1M]
Capabilities: <access denied>
Kernel driver in use: pcieport
03:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 37
Bus: primary=03, secondary=06, subordinate=06, sec-latency=0
I/O behind bridge: 0000d000-0000dfff [size=4K]
Memory behind bridge: f4000000-f50fffff [size=17M]
Prefetchable memory behind bridge: 00000000e8000000-00000000f1ffffff [size=160M]
Capabilities: <access denied>
Kernel driver in use: pcieport
05:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 0c)
Subsystem: Gigabyte Technology Co., Ltd Onboard Ethernet [1458:e000]
Flags: bus master, fast devsel, latency 0, IRQ 35
I/O ports at e000 [size=256]
Memory at f5200000 (64-bit, non-prefetchable) [size=4K]
Memory at f2100000 (64-bit, prefetchable) [size=16K]
Capabilities: <access denied>
Kernel driver in use: r8169
Kernel modules: r8169
06:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF108 [GeForce GT 730] [10de:0f02] (rev a1) (prog-if 00 [VGA controller])
Subsystem: NVIDIA Corporation GF108 [GeForce GT 730] [10de:0825]
Flags: bus master, fast devsel, latency 0, IRQ 86
Memory at f4000000 (32-bit, non-prefetchable) [size=16M]
Memory at e8000000 (64-bit, prefetchable) [size=128M]
Memory at f0000000 (64-bit, prefetchable) [size=32M]
I/O ports at d000 [size=128]
Expansion ROM at f5000000 [disabled] [size=512K]
Capabilities: <access denied>
Kernel driver in use: nouveau
Kernel modules: nvidiafb, nouveau
06:00.1 Audio device [0403]: NVIDIA Corporation GF108 High Definition Audio Controller [10de:0bea] (rev a1)
Subsystem: NVIDIA Corporation GF108 High Definition Audio Controller [10de:0825]
Flags: bus master, fast devsel, latency 0, IRQ 35
Memory at f5080000 (32-bit, non-prefetchable) [size=16K]
Capabilities: <access denied>
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel
>>>>>>>>>>>>>>>> 07:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU116 [GeForce GTX 1660] [10de:2184] (rev a1) (prog-if 00 [VGA controller])
Subsystem: NVIDIA Corporation TU116 [GeForce GTX 1660] [10de:1324]
Flags: bus master, fast devsel, latency 0, IRQ 11
Memory at f6000000 (32-bit, non-prefetchable) [size=16M]
Memory at d0000000 (64-bit, prefetchable) [size=256M]
Memory at e0000000 (64-bit, prefetchable) [size=32M]
I/O ports at f000 [size=128]
Expansion ROM at 000c0000 [disabled] [size=128K]
Capabilities: <access denied>
Kernel driver in use: vfio-pci
Kernel modules: nvidiafb, nouveau
>>>>>>>>>>>>>>>> 07:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:1aeb] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:1324]
Flags: bus master, fast devsel, latency 0, IRQ 83
Memory at f7080000 (32-bit, non-prefetchable) [size=16K]
Capabilities: <access denied>
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel
>>>>>>>>>>>>>>>> 07:00.2 USB controller [0c03]: NVIDIA Corporation Device [10de:1aec] (rev a1) (prog-if 30 [XHCI])
Subsystem: NVIDIA Corporation Device [10de:1324]
Flags: fast devsel, IRQ 47
Memory at e2000000 (64-bit, prefetchable) [size=256K]
Memory at e2040000 (64-bit, prefetchable) [size=64K]
Capabilities: <access denied>
Kernel driver in use: xhci_hcd
>>>>>>>>>>>>>>>> 07:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device [10de:1aed] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:1324]
Flags: bus master, fast devsel, latency 0, IRQ 58
Memory at f7084000 (32-bit, non-prefetchable) [size=4K]
Capabilities: <access denied>
Kernel driver in use: nvidia-gpu
Kernel modules: i2c_nvidia_gpu
08:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Raven/Raven2 PCIe Dummy Function [1022:145a]
Subsystem: Advanced Micro Devices, Inc. [AMD] Zeppelin/Raven/Raven2 PCIe Dummy Function [1022:145a]
Flags: fast devsel
Capabilities: <access denied>
08:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor [1022:1456]
Subsystem: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor [1022:1456]
Flags: bus master, fast devsel, latency 0, IRQ 80
Memory at f7300000 (32-bit, non-prefetchable) [size=1M]
Memory at f7400000 (32-bit, non-prefetchable) [size=8K]
Capabilities: <access denied>
Kernel driver in use: ccp
Kernel modules: ccp
08:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Zeppelin USB 3.0 Host controller [1022:145f] (prog-if 30 [XHCI])
Subsystem: Gigabyte Technology Co., Ltd Zeppelin USB 3.0 Host controller [1458:5007]
Flags: bus master, fast devsel, latency 0, IRQ 48
Memory at f7200000 (64-bit, non-prefetchable) [size=1M]
Capabilities: <access denied>
Kernel driver in use: xhci_hcd
09:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Renoir PCIe Dummy Function [1022:1455]
Subsystem: Advanced Micro Devices, Inc. [AMD] Zeppelin/Renoir PCIe Dummy Function [1022:1455]
Flags: fast devsel
Capabilities: <access denied>
09:00.2 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51) (prog-if 01 [AHCI 1.0])
Subsystem: Gigabyte Technology Co., Ltd FCH SATA Controller [AHCI mode] [1458:b002]
Flags: bus master, fast devsel, latency 0, IRQ 63
Memory at f7508000 (32-bit, non-prefetchable) [size=4K]
Capabilities: <access denied>
Kernel driver in use: ahci
Kernel modules: ahci
09:00.3 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) HD Audio Controller [1022:1457]
Subsystem: Gigabyte Technology Co., Ltd Family 17h (Models 00h-0fh) HD Audio Controller [1458:a182]
Flags: bus master, fast devsel, latency 0, IRQ 85
Memory at f7500000 (32-bit, non-prefetchable) [size=32K]
Capabilities: <access denied>
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel
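I highlighted the devices in group 15. Only the NVIDIA GTX 1660 is using vfio-pci; the others are being used by other kernel modules. Is this the source of the problem? To pass through the GTX I have to pass through all of group 15, but these other devices are claimed by other drivers instead of vfio-pci.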
Unable to complete install: 'internal error: qemu unexpectedly closed the monitor: 2020-02-19T22:48:02.001713Z qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.x2apic [bit 21]
2020-02-19T22:48:02.002255Z qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.x2apic [bit 21]
2020-02-19T22:48:02.002845Z qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.x2apic [bit 21]
2020-02-19T22:48:02.003340Z qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.x2apic [bit 21]
2020-02-19T22:48:02.003842Z qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.x2apic [bit 21]
2020-02-19T22:48:02.024485Z qemu-system-x86_64: -device vfio-pci,host=07:00.0,id=hostdev0,bus=pci.4,addr=0x0: vfio 0000:07:00.0: group 15 is not viable
Please ensure all devices within the iommu_group are bound to their vfio bus driver.'
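Look at this part:
Please ensure all devices within the iommu_group are bound to their vfio bus driver.
This confirms what I thought: not all of the devices are being held by vfio-pci, even though I explicitly told it to take them.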
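The check that QEMU complains about can be approximated from sysfs before starting the VM. The sketch below assumes each device directory under /sys/kernel/iommu_groups/<group>/devices/ carries a driver symlink while a driver is bound; the function name not_on_vfio and the optional sysfs-root argument are hypothetical.

```shell
#!/bin/sh
# Print "address driver" for each device in an IOMMU group that is not bound to vfio-pci.
# $1 = group number, $2 = sysfs root (hypothetical override, defaults to /sys).
not_on_vfio() {
    group="$1"
    root="${2:-/sys}"
    for dev in "$root/kernel/iommu_groups/$group/devices"/*; do
        [ -d "$dev" ] || continue
        if [ -L "$dev/driver" ]; then
            # the symlink target ends in the driver name, e.g. .../drivers/snd_hda_intel
            drv=$(basename "$(readlink "$dev/driver")")
        else
            drv="none"
        fi
        [ "$drv" = "vfio-pci" ] || printf '%s %s\n' "$(basename "$dev")" "$drv"
    done
}
```

With the state shown in the lspci output above, `not_on_vfio 15` would be expected to list 07:00.1, 07:00.2 and 07:00.3 together with their current drivers.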
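I think the guide handles this in the following section, but for the nvidia driver:
In order to alter the load sequence in favour of vfio_pci before the nvidia driver, create a file in the modprobe.d folder via sudo nano /etc/modprobe.d/nvidia.conf and add the following lines:
softdep nouveau pre: vfio-pci
softdep nvidia pre: vfio-pci
softdep nvidia* pre: vfio-pci
Is there a way to do the same for nouveau?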
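The same softdep idea could in principle be extended to the modules that grabbed the other functions of group 15. This is only a sketch: the helper name softdep_lines and the file name in the usage note are made up, and softdep can only reorder loadable modules. The lspci entry for 07:00.2 shows no "Kernel modules:" line, which suggests xhci is built into this kernel and would not be affected.

```shell
#!/bin/sh
# Emit one "softdep <module> pre: vfio-pci" line per module name given as argument.
softdep_lines() {
    for mod in "$@"; do
        printf 'softdep %s pre: vfio-pci\n' "$mod"
    done
}

# Example (module names taken from the "Kernel modules:" lines of 07:00.x):
#   softdep_lines nouveau snd_hda_intel i2c_nvidia_gpu
```

The output could be written to a file such as /etc/modprobe.d/vfio-softdep.conf (hypothetical name), followed by regenerating the initramfs so the ordering applies at boot.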
Answer 1
I found that there is a way to manually unbind the kernel module from a specific PCI device, so I wrote a small script:
echo -n "0000:07:00.1" > /sys/bus/pci/drivers/snd_hda_intel/unbind
echo -n "0000:07:00.1" > /sys/bus/pci/drivers/vfio-pci/bind
echo -n "0000:07:00.2" > /sys/bus/pci/drivers/xhci_hcd/unbind
echo -n "0000:07:00.2" > /sys/bus/pci/drivers/vfio-pci/bind
echo -n "0000:07:00.3" > /sys/bus/pci/drivers/nvidia-gpu/unbind
echo -n "0000:07:00.3" > /sys/bus/pci/drivers/vfio-pci/bind
Because of the line echo -n "0000:07:00.3" > /sys/bus/pci/drivers/nvidia-gpu/unbind, it hangs for a while (about 2 minutes), but when it finishes, this is the output of lspci -nnv:
07:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU116 [GeForce GTX 1660] [10de:2184] (rev a1) (prog-if 00 [VGA controller])
Subsystem: NVIDIA Corporation TU116 [GeForce GTX 1660] [10de:1324]
Flags: bus master, fast devsel, latency 0, IRQ 11
Memory at f6000000 (32-bit, non-prefetchable) [size=16M]
Memory at d0000000 (64-bit, prefetchable) [size=256M]
Memory at e0000000 (64-bit, prefetchable) [size=32M]
I/O ports at f000 [size=128]
Expansion ROM at 000c0000 [disabled] [size=128K]
Capabilities: <access denied>
Kernel driver in use: vfio-pci
Kernel modules: nvidiafb, nouveau
07:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:1aeb] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:1324]
Flags: fast devsel, IRQ 83
Memory at f7080000 (32-bit, non-prefetchable) [size=16K]
Capabilities: <access denied>
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel
07:00.2 USB controller [0c03]: NVIDIA Corporation Device [10de:1aec] (rev a1) (prog-if 30 [XHCI])
Subsystem: NVIDIA Corporation Device [10de:1324]
Flags: fast devsel, IRQ 46
Memory at e2000000 (64-bit, prefetchable) [size=256K]
Memory at e2040000 (64-bit, prefetchable) [size=64K]
Capabilities: <access denied>
Kernel driver in use: vfio-pci
07:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device [10de:1aed] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:1324]
Flags: fast devsel, IRQ 58
Memory at f7084000 (32-bit, non-prefetchable) [size=4K]
Capabilities: <access denied>
Kernel driver in use: vfio-pci
Kernel modules: i2c_nvidia_gpu
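As you can see, they all use vfio-pci now. I then simply added the GPU in virt-manager and it worked. However, I'm still investigating why the whole Ubuntu host froze permanently during the Windows 10 installation.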
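The unbind/bind pairs of the script can be generalized so that the name of the current driver does not have to be hard-coded. The sketch below uses the kernel's per-device driver_override file and /sys/bus/pci/drivers_probe; the function name rebind_to_vfio and the optional sysfs-root argument are hypothetical. It assumes the vfio-pci module is already loaded and that it runs as root, and it has not been tried on this hardware.

```shell
#!/bin/sh
# Rebind one PCI function to vfio-pci without naming the driver that currently holds it.
# $1 = PCI address such as 0000:07:00.1, $2 = sysfs root (hypothetical override, defaults to /sys).
rebind_to_vfio() {
    addr="$1"
    root="${2:-/sys}"
    dev="$root/bus/pci/devices/$addr"
    [ -d "$dev" ] || { echo "no such device: $addr" >&2; return 1; }
    # Detach whatever driver currently holds the function, if any
    if [ -e "$dev/driver/unbind" ]; then
        printf '%s' "$addr" > "$dev/driver/unbind"
    fi
    # Restrict driver matching to vfio-pci, then ask the PCI core to probe the device again
    printf '%s' "vfio-pci" > "$dev/driver_override"
    printf '%s' "$addr" > "$root/bus/pci/drivers_probe"
}
```

A loop such as `for f in 1 2 3; do rebind_to_vfio "0000:07:00.$f"; done` would cover the same three functions as the script. The unbind step is the same kernel operation, so a slow unbind of 07:00.3 would presumably still be slow.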
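Update:
Manually unbinding does release the GPU, but if you had to unbind it, the GPU's Linux driver has already touched the GPU, so the GPU now knows it was on Linux. When you bind it to the VM and start the VM, the GPU's Windows driver reads the GPU state, sees that someone (Linux) has messed with it before, and refuses to work, because NVIDIA sucks.
Don't unbind manually; or at least try it, but it probably won't work. Instead, make sure the Linux driver never touches the GPU.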
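Answer 2
"I highlighted the devices in group 15. Only the NVIDIA GTX 1660 is used by vfio-pci; the other devices are used by other kernel modules. Is this the source of the problem? To pass through the GTX I have to pass through everything in group 15, but these other things are used by other drivers, not vfio-pci."
Possibly yes, but at least three of them should be taken by vfio-pci.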
On my machine with an RTX 2070 installed, these are:
- VGA compatible controller,
- Audio device,
- Serial bus controller
lspci -knn
GPU slot 1 GT 710
0b:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK208 [GeForce GT 710B] [10de:128b] (rev a1)
Subsystem: Micro-Star International Co., Ltd. [MSI] GK208B [GeForce GT 710] [1462:8c93]
Kernel driver in use: nvidia
Kernel modules: nvidia
0b:00.1 Audio device [0403]: NVIDIA Corporation GK208 HDMI/DP Audio Controller [10de:0e0f] (rev a1)
Subsystem: Micro-Star International Co., Ltd. [MSI] GK208 HDMI/DP Audio Controller [1462:8c93]
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel
GPU slot 2 RTX 2070
0c:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1e84] (rev a1)
Subsystem: Gigabyte Technology Co., Ltd Device [1458:4008]
Kernel driver in use: vfio-pci
Kernel modules: nvidia
0c:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10f8] (rev a1)
Subsystem: Gigabyte Technology Co., Ltd Device [1458:4008]
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel
0c:00.2 USB controller [0c03]: NVIDIA Corporation Device [10de:1ad8] (rev a1)
Subsystem: Gigabyte Technology Co., Ltd Device [1458:4008]
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
0c:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device [10de:1ad9] (rev a1)
Subsystem: Gigabyte Technology Co., Ltd Device [1458:4008]
Kernel driver in use: vfio-pci
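I also followed the instructions at https://mathiashueber.com/ when setting up my machine.
I have two GPUs installed in my machine. In the first slot (the one used by my Linux host) I put a low-power NVIDIA card. In the second slot I installed the RTX 2070 that should be passed through to the VM.
I installed the virtualization software [and other tools such as firmware-linux or a newer kernel from Debian buster backports]: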
sudo apt install ovmf virt-manager qemu-kvm
Activate IOMMU (VT-x/VT-d etc. in the BIOS) and add the following line to GRUB:
GRUB_CMDLINE_LINUX_DEFAULT="amd_iommu=on iommu=pt kvm_amd.npt=1 kvm_amd.avic=1 kvm.ignore_msrs=1 video=vesafb:off,efifb:off disable_idle_d3=1"
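On Debian and Ubuntu this variable lives in /etc/default/grub, and the change only takes effect after `sudo update-grub` and a reboot. A small sketch for verifying that the parameters actually reached the running kernel; the function name missing_params is made up for this example.

```shell
#!/bin/sh
# Print every wanted kernel parameter that does not appear in the given command line.
# $1 = kernel command line string, remaining arguments = parameters to look for.
missing_params() {
    cmdline="$1"
    shift
    for want in "$@"; do
        case " $cmdline " in
            *" $want "*) ;;                 # present as a whole word
            *) printf '%s\n' "$want" ;;     # absent, report it
        esac
    done
}

# Usage on a live system:
#   missing_params "$(cat /proc/cmdline)" amd_iommu=on iommu=pt kvm.ignore_msrs=1
```

No output means all listed parameters are active.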
Made sure that my GPU is in its own group:
IOMMU Group 29 0c:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1e84] (rev a1)
IOMMU Group 29 0c:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10f8] (rev a1)
IOMMU Group 29 0c:00.2 USB controller [0c03]: NVIDIA Corporation Device [10de:1ad8] (rev a1)
IOMMU Group 29 0c:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device [10de:1ad9] (rev a1)
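Whether a group contains only the functions of a single card can also be checked mechanically. A minimal sketch, assuming the standard /sys/kernel/iommu_groups layout; the function name group_is_single_slot and the optional sysfs-root argument are hypothetical.

```shell
#!/bin/sh
# Succeed when every device in the given IOMMU group sits in the same PCI slot,
# i.e. the addresses differ only in the function number after the dot.
# $1 = group number, $2 = sysfs root (hypothetical override, defaults to /sys).
group_is_single_slot() {
    group="$1"
    root="${2:-/sys}"
    slots=$(
        for dev in "$root/kernel/iommu_groups/$group/devices"/*; do
            [ -e "$dev" ] || continue
            name=${dev##*/}               # e.g. 0000:0c:00.1
            printf '%s\n' "${name%.*}"    # e.g. 0000:0c:00
        done | sort -u | wc -l
    )
    [ "$slots" -eq 1 ]
}
```

For example, `group_is_single_slot 29 && echo "group 29 is isolated"`. A group that also holds chipset devices, like group 14 in the question, would fail this check.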
- Then I added the IDs that I want vfio-pci to bind at boot
sudo nano /etc/initramfs-tools/modules
vfio_pci ids=10de:1e84,10de:10f8,10de:1ad8,10de:1ad9
sudo update-initramfs -u -k all
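- After that, reboot and check again with
lspci -knn
whether it worked. As you can see from the listing above, it did not work for ID 10de:1ad8. Luckily, that is not a problem: my Win10 VM runs fine even though the 0c:00.2 USB controller is not taken by vfio-pci.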
The software versions and kernel I use are:
qemu-system-x86_64 --version
QEMU emulator version 5.0.0 (Debian 1:5.0-14~bpo10+1)
Copyright (c) 2003-2020 Fabrice Bellard and the QEMU Project developers
uname -r
5.7.0-0.bpo.2-amd64
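The whole topic of successfully passing through a GPU is quite complex. Many different problems can occur.
Here are some of my experiences:
I remember that with Lubuntu 16.04 on a Gigabyte GA-P55-UD7 I could not bind my GTX 970 to pci-stub at boot, so I had to do it manually with bind/unbind commands, just like you. (See "Blacklisting Nvidia gpu for qemu/kvm passthrough".)
With my new machine, a ROG STRIX X570-F GAMING running Debian buster (as shown above), I am able to boot with my primary card (GT 710) taken by the nvidia driver and my RTX 2070 taken by vfio-pci during boot.
With another machine, an ASRockRack EPYC3251D4I-2T, also running Debian buster, I had big problems when trying to pass my GTX 970 through to a Windows guest. To work around them I had to copy a script and run it in the background (see https://www.reddit.com/r/Amd/comments/7gp1z7/threadripper_kvm_gpu_passthru_testers_needed/).
As for why you are running into these problems, I can't tell you. Maybe outdated software? The distribution you use and how it handles module loading? BIOS/motherboard firmware that the manufacturer did not program properly? Perhaps a BIOS update is available?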