I am trying to pass my Tesla K40m GPU accelerator through to a qemu-kvm virtual machine using the VFIO driver on Ubuntu 14.04.2.
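I installed all the drivers and CUDA libraries in the VM and compiled all the sample programs successfully, but when I run them they never finish. For example, here is the log from running deviceQuery: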
deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "Tesla K40m"
  //INFO ABOUT IT
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 7.0, CUDA Runtime Version = 7.0, NumDevs = 1, Device0 = Tesla K40m
Result = PASS
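After that it hangs, and the only way out is to press Ctrl+C. I installed everything on the host as well, and there the same program completes without any problems. Any help would be appreciated.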
dmesg on the VM says only:
[ 1475.225692] nvidia 0000:00:08.0: irq 51 for MSI/MSI-X
dmesg on the host:
kernel: [ 2897.503162] vfio-pci 0000:02:00.0: irq 324 for MSI/MSI-X
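In addition, any call that touches the PCI device takes far too long. For example, I ran nvidia-smi on both the virtual machine (VM) and the host system and traced it with strace. Here is the output from the VM: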
+------------------------------------------------------+
| NVIDIA-SMI 346.59     Driver Version: 346.59         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K40m          Off  | 0000:00:06.0     Off |                    0 |
| N/A   54C    P0    64W / 235W |     55MiB / 11519MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 98.67    4.688353      275785        17           open
  1.08    0.051337        3020        17           close
  0.23    0.010722         104       103           ioctl
  0.01    0.000261          22        12           read
  0.00    0.000235           9        26           mmap
  0.00    0.000177          15        12           write
  0.00    0.000127          16         8           munmap
  0.00    0.000107          11        10           mprotect
  0.00    0.000094          19         5         1 stat
  0.00    0.000070           5        15           fstat
  0.00    0.000055           8         7         7 access
  0.00    0.000030          30         1           execve
  0.00    0.000018           5         4           fcntl
  0.00    0.000015           8         2         1 futex
  0.00    0.000013           4         3           brk
  0.00    0.000007           4         2           rt_sigaction
  0.00    0.000006           6         1           getrlimit
  0.00    0.000005           5         1           lseek
  0.00    0.000004           4         1           set_robust_list
  0.00    0.000003           3         1           rt_sigprocmask
  0.00    0.000003           3         1           arch_prctl
  0.00    0.000003           3         1           set_tid_address
------ ----------- ----------- --------- --------- ----------------
100.00    4.751645                   250         9 total
And here is the output when I run nvidia-smi on the host (after first detaching the GPU from the VM):
+------------------------------------------------------+
| NVIDIA-SMI 346.59     Driver Version: 346.59         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K40m          Off  | 0000:02:00.0     Off |                    0 |
| N/A   48C    P0    64W / 235W |     55MiB / 11519MiB |     60%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 82.25    0.571723       33631        17           open
 15.70    0.109104        6418        17           close
  1.76    0.012264         119       103           ioctl
  0.10    0.000664          44        15           read
  0.05    0.000370          14        26           mmap
  0.02    0.000155          16        10           mprotect
  0.02    0.000152          22         7         7 access
  0.02    0.000134           9        15           fstat
  0.01    0.000100          13         8           munmap
  0.01    0.000078          26         3           brk
  0.01    0.000070           6        12           write
  0.01    0.000069          17         4           fcntl
  0.01    0.000062          62         1           execve
  0.00    0.000029           6         5         1 stat
  0.00    0.000021          11         2           rt_sigaction
  0.00    0.000021          11         2         1 futex
  0.00    0.000010          10         1           rt_sigprocmask
  0.00    0.000010          10         1           getrlimit
  0.00    0.000010          10         1           arch_prctl
  0.00    0.000010          10         1           set_tid_address
  0.00    0.000009           9         1           set_robust_list
  0.00    0.000000           0         1           lseek
------ ----------- ----------- --------- --------- ----------------
100.00    0.695065                   253         9 total
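
The two strace summaries above can also be compared mechanically. What follows is only a minimal sketch in Python: the function names `parse_strace_summary` and `slowdown` are made up for illustration, the parser assumes the `strace -c` column layout shown above, and the embedded samples contain just a few rows copied from the two tables.

```python
import re

# One data row of `strace -c`: % time, seconds, usecs/call, calls, [errors], syscall.
# The errors column is optional because strace leaves it blank when there are none.
ROW = re.compile(
    r"^\s*(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+)\s+(\d+)(?:\s+(\d+))?\s+([a-z_0-9]+)\s*$"
)

def parse_strace_summary(text):
    """Return {syscall: (seconds, calls)} parsed from `strace -c` output."""
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m and m.group(6) != "total":  # skip the summary line
            rows[m.group(6)] = (float(m.group(2)), int(m.group(4)))
    return rows

def slowdown(vm, host):
    """Per-syscall ratio of VM time to host time, largest ratio first."""
    ratios = {
        name: vm[name][0] / host[name][0]
        for name in vm
        if name in host and host[name][0] > 0
    }
    return sorted(ratios.items(), key=lambda kv: kv[1], reverse=True)

# A few rows copied from the VM and host tables in this post.
VM_SAMPLE = """
 98.67    4.688353      275785        17           open
  1.08    0.051337        3020        17           close
  0.23    0.010722         104       103           ioctl
  0.00    0.000094          19         5         1 stat
100.00    4.751645                   250         9 total
"""
HOST_SAMPLE = """
 82.25    0.571723       33631        17           open
 15.70    0.109104        6418        17           close
  1.76    0.012264         119       103           ioctl
  0.00    0.000029           6         5         1 stat
100.00    0.695065                   253         9 total
"""

if __name__ == "__main__":
    vm = parse_strace_summary(VM_SAMPLE)
    host = parse_strace_summary(HOST_SAMPLE)
    for name, ratio in slowdown(vm, host):
        print("{:<8} {:.2f}x".format(name, ratio))
```

On these numbers the same 17 `open` calls take 4.688353 s in the VM against 0.571723 s on the host, roughly 8.2 times longer, while `close` and `ioctl` are no slower in the VM than on the host.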
As you can see, the "open" calls take far longer in the VM (about 4.69 s in total versus 0.57 s on the host). I have no idea why.
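
One way to narrow this down would be to time `open` on individual files directly, without nvidia-smi in between. This is a rough sketch only: `time_open` is a hypothetical helper, and the paths `/dev/nvidiactl` and `/dev/nvidia0` are an assumption about which device nodes the NVIDIA driver created on this system. The strace output above does not show which 17 paths were actually opened.

```python
import os
import time

def time_open(path, flags=os.O_RDONLY):
    """Return the seconds spent inside open(2) for path; the fd is closed right away."""
    start = time.perf_counter()
    fd = os.open(path, flags)
    elapsed = time.perf_counter() - start
    os.close(fd)
    return elapsed

if __name__ == "__main__":
    # Assumed device nodes; adjust to whatever exists under /dev on the machine.
    for node in ("/dev/nvidiactl", "/dev/nvidia0"):
        try:
            print("{} {:.6f}s".format(node, time_open(node, os.O_RDWR)))
        except OSError as exc:
            print("{} could not be opened: {}".format(node, exc))
```

Running it once in the VM and once on the host should show whether the delay is tied to the GPU device nodes or to opening files in general. `strace -T -e trace=open nvidia-smi` gives a similar per-call view, since `-T` prints the time spent in each system call.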
Can anyone help me? Sorry for the long post.
Update 1: I forgot to mention the hardware. The server is a SuperServer 8047R, motherboard: Super X9QRi-F+, RAM: 128GB.