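I installed OpenStack with DevStack in a single-node configuration on Ubuntu 22.04 (Jammy) LTS. I followed this tutorial to set up GPU passthrough on my OpenStack: https://superuser.openinfra.dev/articles/a-comprehensive-guide-to-configuring-gpu-passthrough-in-openstack-for-high-performance-computing/
I currently have a GTX 1630 (I know it is not suited to HPC, but this is a test setup that I plan to build on later). VT-d is enabled in my BIOS.
My grub and initramfs configuration is exactly the same as in the article; I only put the correct vendor ID and product ID in the configuration files. Here are some details about the hardware and drivers: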
$ sudo lspci -nn | grep NVIDIA
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1f83] (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10fa] (rev a1)
$ sudo lspci -s 01:00.0 -k
01:00.0 VGA compatible controller: NVIDIA Corporation Device 1f83 (rev a1)
Subsystem: NVIDIA Corporation Device 169c
Kernel driver in use: vfio-pci
Kernel modules: nvidiafb, nouveau
$ sudo lspci -s 01:00.1 -k
01:00.1 Audio device: NVIDIA Corporation Device 10fa (rev a1)
Subsystem: NVIDIA Corporation Device 169c
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel
Here is what I added to my nova configuration:
...
[pci]
device_spec = { "vendor_id":"10de", "product_id":"1f83" }
alias: { "vendor_id":"10de", "product_id":"1f83", "device_type":"type-PCI", "name":"geforce-gtx-1630" }
[filter_scheduler]
enabled_filters = PciPassthroughFilter
available_filters = nova.scheduler.filters.all_filters
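To rule out a simple typo in the IDs, the values above can be cross-checked against the `lspci -nn` output with a short script. This is only a minimal sketch: the helper names `ids_from_lspci` and `ids_match` are hypothetical, the strings are copied from this post, and the comparison is a plain equality check, not nova's own matching logic. It also says nothing about whether the running nova services actually loaded these settings.

```python
import json
import re

# Values copied verbatim from the lspci output and the [pci] section above.
LSPCI_LINE = (
    "01:00.0 VGA compatible controller [0300]: "
    "NVIDIA Corporation Device [10de:1f83] (rev a1)"
)
DEVICE_SPEC = '{ "vendor_id":"10de", "product_id":"1f83" }'
ALIAS = (
    '{ "vendor_id":"10de", "product_id":"1f83", '
    '"device_type":"type-PCI", "name":"geforce-gtx-1630" }'
)


def ids_from_lspci(line):
    """Return (vendor_id, product_id) from one `lspci -nn` line (hypothetical helper)."""
    # Only the vendor:product pair has the form [xxxx:xxxx]; the class code
    # such as [0300] has no colon and is therefore skipped.
    pairs = re.findall(r"\[([0-9a-f]{4}):([0-9a-f]{4})\]", line)
    if not pairs:
        raise ValueError("no vendor:product pair found in line")
    return pairs[-1]


def ids_match(json_text, vendor_id, product_id):
    """Check that a JSON config value names the same vendor and product IDs."""
    entry = json.loads(json_text)
    return entry.get("vendor_id") == vendor_id and entry.get("product_id") == product_id


vendor, product = ids_from_lspci(LSPCI_LINE)
print("device_spec matches lspci:", ids_match(DEVICE_SPEC, vendor, product))
print("alias matches lspci:", ids_match(ALIAS, vendor, product))
```

With the values from this post both checks print `True`, so the IDs themselves appear to be consistent.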
Here are the image and flavor I am using to try to launch the instance:
$ openstack image show bc01668d-6716-4c19-8b20-f9e60f98a4dc
+------------------+-------------------------------------------------------------------------------------------------------+
| Field | Value |
+------------------+-------------------------------------------------------------------------------------------------------+
| checksum | fd981e3a7528b5911631886a03fa5693 |
| container_format | bare |
| created_at | 2023-11-20T07:06:27Z |
| disk_format | qcow2 |
| file | /v2/images/bc01668d-6716-4c19-8b20-f9e60f98a4dc/file |
| id | bc01668d-6716-4c19-8b20-f9e60f98a4dc |
| min_disk | 0 |
| min_ram | 0 |
| name | Ubuntu 20.04 LTS (Focal Fossa) |
| owner | 6209bfb566b749fe943f27521b7519ea |
| properties | img_hide_hypervisor_id='true', os_hash_algo='sha512', os_hash_value='48059a837a24997117c48456c985d9b0 |
| | d9c4fb89b2a0b81d6e9e9589f02216d925a8b3d18848acb89e0fb7cbccacc1fbb08d95115d444cca5cbb093cfa37e830', |
| | os_hidden='False' |
| protected | False |
| schema | /v2/schemas/image |
| size | 620167168 |
| status | active |
| tags | |
| updated_at | 2023-11-20T07:07:07Z |
| virtual_size | 2361393152 |
| visibility | private |
+------------------+-------------------------------------------------------------------------------------------------------+
$ openstack flavor show gpu_flavor
+----------------------------+--------------------------------------------+
| Field | Value |
+----------------------------+--------------------------------------------+
| OS-FLV-DISABLED:disabled | False |
| OS-FLV-EXT-DATA:ephemeral | 0 |
| access_project_ids | None |
| description | None |
| disk | 25 |
| id | 025649e0-3904-4322-b43a-d5d55c780198 |
| name | gpu_flavor |
| os-flavor-access:is_public | True |
| properties | pci_passthrough:alias='geforce-gtx-1630:1' |
| ram | 4096 |
| rxtx_factor | 1.0 |
| swap | 0 |
| vcpus | 2 |
+----------------------------+--------------------------------------------+
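The flavor property `pci_passthrough:alias` takes the form `alias_name:count`, so the name in it has to be the same string as the `name` field of the alias in the nova configuration. A minimal sketch of that check, using the values from this post, is shown here. The function `parse_alias_request` is a hypothetical helper written for illustration, not part of nova or the openstack client.

```python
# Values copied from the flavor properties and the nova alias shown in this post.
FLAVOR_PROPERTY = "geforce-gtx-1630:1"
ALIAS_NAME = "geforce-gtx-1630"


def parse_alias_request(value):
    """Split 'name:count[,name:count...]' into a list of (name, count) tuples."""
    requests = []
    for item in value.split(","):
        # Split on the last colon so the count is always the final field.
        name, _, count = item.strip().rpartition(":")
        requests.append((name, int(count)))
    return requests


requests = parse_alias_request(FLAVOR_PROPERTY)
print(requests)
print("flavor refers to the configured alias:",
      all(name == ALIAS_NAME for name, _ in requests))
```

For this flavor the request is one device under the alias `geforce-gtx-1630`, which is the same name as in the alias definition, so a name mismatch does not seem to be the problem.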
Here are the nova scheduler logs from when I try to launch the instance:
$ journalctl -xeu [email protected] | grep -i "Nov 20 09:08:57"
Hint: You are currently not seeing messages from other users and the system.
Users in groups 'adm', 'systemd-journal' can see all messages.
Pass -q to turn off this notice.
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG nova.scheduler.manager [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] Starting to schedule for instances: ['41bdc44a-3e8c-475a-9e65-4331666abd75'] {{(pid=127748) select_destinations /opt/stack/nova/nova/scheduler/manager.py:175}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG nova.scheduler.request_filter [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] compute_status_filter request filter added forbidden trait COMPUTE_STATUS_DISABLED {{(pid=127748) compute_status_filter /opt/stack/nova/nova/scheduler/request_filter.py:253}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG nova.scheduler.request_filter [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] Request filter 'compute_status_filter' took 0.0 seconds {{(pid=127748) wrapper /opt/stack/nova/nova/scheduler/request_filter.py:46}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG nova.scheduler.request_filter [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] Request filter 'accelerators_filter' took 0.0 seconds {{(pid=127748) wrapper /opt/stack/nova/nova/scheduler/request_filter.py:46}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG nova.scheduler.request_filter [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] Request filter 'remote_managed_ports_filter' took 0.0 seconds {{(pid=127748) wrapper /opt/stack/nova/nova/scheduler/request_filter.py:46}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG nova.scheduler.request_filter [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] ephemeral_encryption_filter skipped {{(pid=127748) ephemeral_encryption_filter /opt/stack/nova/nova/scheduler/request_filter.py:410}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG oslo_concurrency.lockutils [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] Acquiring lock "13ea0d9f-bae6-45df-9ee2-6fbc0a0080ad" by "nova.context.set_target_cell.<locals>.get_or_set_cached_cell_and_set_connections" {{(pid=127748) inner /opt/stack/data/venv/lib/python3.10/site-packages/oslo_concurrency/lockutils.py:404}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG oslo_concurrency.lockutils [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] Lock "13ea0d9f-bae6-45df-9ee2-6fbc0a0080ad" acquired by "nova.context.set_target_cell.<locals>.get_or_set_cached_cell_and_set_connections" :: waited 0.000s {{(pid=127748) inner /opt/stack/data/venv/lib/python3.10/site-packages/oslo_concurrency/lockutils.py:409}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG oslo_concurrency.lockutils [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] Lock "13ea0d9f-bae6-45df-9ee2-6fbc0a0080ad" "released" by "nova.context.set_target_cell.<locals>.get_or_set_cached_cell_and_set_connections" :: held 0.000s {{(pid=127748) inner /opt/stack/data/venv/lib/python3.10/site-packages/oslo_concurrency/lockutils.py:423}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG oslo_concurrency.lockutils [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] Acquiring lock "13ea0d9f-bae6-45df-9ee2-6fbc0a0080ad" by "nova.context.set_target_cell.<locals>.get_or_set_cached_cell_and_set_connections" {{(pid=127748) inner /opt/stack/data/venv/lib/python3.10/site-packages/oslo_concurrency/lockutils.py:404}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG oslo_concurrency.lockutils [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] Lock "13ea0d9f-bae6-45df-9ee2-6fbc0a0080ad" acquired by "nova.context.set_target_cell.<locals>.get_or_set_cached_cell_and_set_connections" :: waited 0.000s {{(pid=127748) inner /opt/stack/data/venv/lib/python3.10/site-packages/oslo_concurrency/lockutils.py:409}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG oslo_concurrency.lockutils [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] Lock "13ea0d9f-bae6-45df-9ee2-6fbc0a0080ad" "released" by "nova.context.set_target_cell.<locals>.get_or_set_cached_cell_and_set_connections" :: held 0.000s {{(pid=127748) inner /opt/stack/data/venv/lib/python3.10/site-packages/oslo_concurrency/lockutils.py:423}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG oslo_concurrency.lockutils [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] Acquiring lock "('itopenstack', 'itopenstack')" by "nova.scheduler.host_manager.HostState.update.<locals>._locked_update" {{(pid=127748) inner /opt/stack/data/venv/lib/python3.10/site-packages/oslo_concurrency/lockutils.py:404}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG oslo_concurrency.lockutils [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] Lock "('itopenstack', 'itopenstack')" acquired by "nova.scheduler.host_manager.HostState.update.<locals>._locked_update" :: waited 0.000s {{(pid=127748) inner /opt/stack/data/venv/lib/python3.10/site-packages/oslo_concurrency/lockutils.py:409}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG nova.scheduler.host_manager [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] Update host state from compute node: ComputeNode(cpu_allocation_ratio=4.0,cpu_info='{"arch": "x86_64", "model": "Skylake-Client-noTSX-IBRS", "vendor": "Intel", "topology": {"cells": 1, "sockets": 1, "cores": 6, "threads": 1}, "features": ["smx", "abm", "mce", "sse4.2", "vmx", "lm", "msr", "mpx", "xtpr", "tm2", "ht", "fma", "pat", "de", "adx", "tsc", "tsc-deadline", "clflushopt", "est", "dtes64", "popcnt", "arch-capabilities", "apic", "pclmuldq", "tsc_adjust", "rsba", "acpi", "vme", "movbe", "md-clear", "bmi1", "avx", "pni", "f16c", "pse36", "xsavec", "pge", "xsaves", "cx16", "ss", "sse4.1", "cx8", "xgetbv1", "smep", "nx", "mtrr", "lahf_lm", "x2apic", "avx2", "pdpe1gb", "ds_cpl", "arat", "spec-ctrl", "cmov", "pcid", "xsaveopt", "fsgsbase", "invpcid", "pae", "ssbd", "sse2", "fxsr", "stibp", "bmi2", "rdtscp", "invtsc", "rdseed", "mmx", "pse", "monitor", "syscall", "xsave", "ds", "ssse3", "intel-pt", "smap", "pbe", "fpu", "3dnowprefetch", "erms", "aes", "rdrand", "tm", "sse", "pdcm", "mca", "clflush", "sep"]}',created_at=2023-11-20T06:55:11Z,current_workload=0,deleted=False,deleted_at=None,disk_allocation_ratio=1.0,disk_available_least=79,free_disk_gb=97,free_ram_mb=7292,host='itopenstack',host_ip=192.168.1.10,hypervisor_hostname='itopenstack',hypervisor_type='QEMU',hypervisor_version=6002000,id=1,local_gb=97,local_gb_used=0,mapped=1,memory_mb=7804,memory_mb_used=512,metrics='[]',numa_topology='{"nova_object.name": "NUMATopology", "nova_object.namespace": "nova", "nova_object.version": "1.2", "nova_object.data": {"cells": [{"nova_object.name": "NUMACell", "nova_object.namespace": "nova", "nova_object.version": "1.5", "nova_object.data": {"id": 0, "cpuset": [0, 1, 2, 3, 4, 5], "pcpuset": [0, 1, 2, 3, 4, 5], "memory": 7804, "cpu_usage": 0, "memory_usage": 0, "pinned_cpus": [], "siblings": [[2], [5], [4], [1], [0], [3]], 
"mempages": [{"nova_object.name": "NUMAPagesTopology", "nova_object.namespace": "nova", "nova_object.version": "1.1", "nova_object.data": {"size_kb": 4, "total": 1997843, "used": 0, "reserved": 0}, "nova_object.changes": ["used", "size_kb", "reserved", "total"]}, {"nova_object.name": "NUMAPagesTopology", "nova_object.namespace": "nova", "nova_object.version": "1.1", "nova_object.data": {"size_kb": 2048, "total": 0, "used": 0, "reserved": 0}, "nova_object.changes": ["used", "size_kb", "reserved", "total"]}, {"nova_object.name": "NUMAPagesTopology", "nova_object.namespace": "nova", "nova_object.version": "1.1", "nova_object.data": {"size_kb": 1048576, "total": 0, "used": 0, "reserved": 0}, "nova_object.changes": ["used", "size_kb", "reserved", "total"]}], "network_metadata": {"nova_object.name": "NetworkMetadata", "nova_object.namespace": "nova", "nova_object.version": "1.0", "nova_object.data": {"physnets": [], "tunneled": false}, "nova_object.changes": ["physnets", "tunneled"]}, "socket": 0}, "nova_object.changes": ["id", "memory_usage", "cpu_usage", "pcpuset", "socket", "siblings", "mempages", "network_metadata", "memory", "pinned_cpus", "cpuset"]}]}, "nova_object.changes": ["cells"]}',pci_device_pools=PciDevicePoolList,ram_allocation_ratio=1.0,running_vms=0,service_id=3,stats={failed_builds='0'},supported_hv_specs=[HVSpec,HVSpec,HVSpec,HVSpec,HVSpec,HVSpec,HVSpec,HVSpec,HVSpec,HVSpec,HVSpec,HVSpec,HVSpec,HVSpec,HVSpec,HVSpec,HVSpec,HVSpec,HVSpec,HVSpec,HVSpec,HVSpec,HVSpec,HVSpec,HVSpec],updated_at=2023-11-20T08:25:01Z,uuid=afec2be8-dc5c-41ef-8d6a-e1729719bece,vcpus=6,vcpus_used=0) {{(pid=127748) _locked_update /opt/stack/nova/nova/scheduler/host_manager.py:169}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG nova.scheduler.host_manager [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] Update host state with aggregates: [] {{(pid=127748) _locked_update /opt/stack/nova/nova/scheduler/host_manager.py:172}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG nova.scheduler.host_manager [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] Update host state with service dict: {'id': 3, 'uuid': 'fba4961f-4107-45bf-9408-7a2af793beb7', 'host': 'itopenstack', 'binary': 'nova-compute', 'topic': 'compute', 'report_count': 73, 'disabled': False, 'disabled_reason': None, 'last_seen_up': datetime.datetime(2023, 11, 20, 9, 7, 6, tzinfo=datetime.timezone.utc), 'forced_down': False, 'version': 66, 'created_at': datetime.datetime(2023, 11, 20, 6, 55, 11, tzinfo=datetime.timezone.utc), 'updated_at': datetime.datetime(2023, 11, 20, 9, 7, 6, tzinfo=datetime.timezone.utc), 'deleted_at': None, 'deleted': False} {{(pid=127748) _locked_update /opt/stack/nova/nova/scheduler/host_manager.py:175}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG nova.scheduler.host_manager [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] Update host state with instances: [] {{(pid=127748) _locked_update /opt/stack/nova/nova/scheduler/host_manager.py:178}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG oslo_concurrency.lockutils [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] Lock "('itopenstack', 'itopenstack')" "released" by "nova.scheduler.host_manager.HostState.update.<locals>._locked_update" :: held 0.002s {{(pid=127748) inner /opt/stack/data/venv/lib/python3.10/site-packages/oslo_concurrency/lockutils.py:423}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG nova.filters [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] Starting with 1 host(s) {{(pid=127748) get_filtered_objects /opt/stack/nova/nova/filters.py:70}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG nova.scheduler.filters [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] PciPassthroughFilter tries allocation candidate: {'allocations': {'afec2be8-dc5c-41ef-8d6a-e1729719bece': {'resources': {'DISK_GB': 25, 'MEMORY_MB': 4096, 'VCPU': 2}}}, 'mappings': {'': ['afec2be8-dc5c-41ef-8d6a-e1729719bece']}} {{(pid=127748) filter_candidates /opt/stack/nova/nova/scheduler/filters/__init__.py:77}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG nova.pci.stats [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] Not enough PCI devices left to satisfy request {{(pid=127748) _filter_pools /opt/stack/nova/nova/pci/stats.py:654}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG nova.scheduler.filters [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] PciPassthroughFilter rejected allocation candidate: {'allocations': {'afec2be8-dc5c-41ef-8d6a-e1729719bece': {'resources': {'DISK_GB': 25, 'MEMORY_MB': 4096, 'VCPU': 2}}}, 'mappings': {'': ['afec2be8-dc5c-41ef-8d6a-e1729719bece']}} {{(pid=127748) filter_candidates /opt/stack/nova/nova/scheduler/filters/__init__.py:88}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG nova.scheduler.filters.pci_passthrough_filter [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] (itopenstack, itopenstack) ram: 7292MB disk: 80896MB io_ops: 0 instances: 0, allocation_candidates: 0 doesn't have the required PCI devices (InstancePCIRequests(instance_uuid=<?>,requests=[InstancePCIRequest])) {{(pid=127748) host_passes /opt/stack/nova/nova/scheduler/filters/pci_passthrough_filter.py:68}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: INFO nova.filters [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] Filter PciPassthroughFilter returned 0 hosts
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG nova.filters [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] Filtering removed all hosts for the request with instance ID '41bdc44a-3e8c-475a-9e65-4331666abd75'. Filter results: [('PciPassthroughFilter', None)] {{(pid=127748) get_filtered_objects /opt/stack/nova/nova/filters.py:114}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: INFO nova.filters [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] Filtering removed all hosts for the request with instance ID '41bdc44a-3e8c-475a-9e65-4331666abd75'. Filter results: ['PciPassthroughFilter: (start: 1, end: 0)']
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG nova.scheduler.manager [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] Filtered [] {{(pid=127748) _get_sorted_hosts /opt/stack/nova/nova/scheduler/manager.py:708}}
Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG nova.scheduler.manager [None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] There are 0 hosts available but 1 instances requested to build. {{(pid=127748) _ensure_sufficient_hosts /opt/stack/nova/nova/scheduler/manager.py:527}}
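Since most of the DEBUG output above is lock and host-state noise, the lines that matter can be pulled out with a small filter. The sketch below is only an illustration: `filter_results` and `pci_messages` are hypothetical helpers, and the two sample lines are copied from the log above. In practice the lines would be read from the saved journal output.

```python
import re

# Two lines copied from the scheduler log above.
LOG_LINES = [
    "Nov 20 09:08:57 itopenstack nova-scheduler[127748]: DEBUG nova.pci.stats "
    "[None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] "
    "Not enough PCI devices left to satisfy request "
    "{{(pid=127748) _filter_pools /opt/stack/nova/nova/pci/stats.py:654}}",
    "Nov 20 09:08:57 itopenstack nova-scheduler[127748]: INFO nova.filters "
    "[None req-20b0843e-0997-4c83-9abc-b20fd73706a4 demo admin] "
    "Filter PciPassthroughFilter returned 0 hosts",
]

FILTER_RESULT = re.compile(r"Filter (\w+) returned (\d+) hosts")


def filter_results(lines):
    """Return (filter_name, remaining_hosts) for each 'Filter X returned N hosts' line."""
    results = []
    for line in lines:
        match = FILTER_RESULT.search(line)
        if match:
            results.append((match.group(1), int(match.group(2))))
    return results


def pci_messages(lines):
    """Return the lines logged by the nova.pci.stats module."""
    return [line for line in lines if " nova.pci.stats " in line]


print(filter_results(LOG_LINES))
print(len(pci_messages(LOG_LINES)), "message(s) from nova.pci.stats")
```

On the sample lines this reports that `PciPassthroughFilter` left 0 hosts, together with the single `nova.pci.stats` message "Not enough PCI devices left to satisfy request", which is the part of the log I cannot explain.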
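I don't understand where the error comes from. This is my first post on StackExchange, and I have tried to provide as much detail as possible. If I need to provide any other details, please let me know. I would appreciate any suggestions, thank you.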