Running Ubuntu 22.04 (Jammy) with the HWE kernel on an OpenStack KVM guest VM triggers the following message in /var/log/kern.log:
------------[ cut here ]------------
XSAVE consistency problem, dumping leaves
WARNING: CPU: 0 PID: 0 at arch/x86/kernel/fpu/xstate.c:606 paranoid_xstate_size_valid+0x138/0x158
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.19.0-35-generic #36~22.04.1-Ubuntu
RIP: 0010:paranoid_xstate_size_valid+0x138/0x158
Code: 10 41 0f b6 f4 48 c7 c7 70 b5 44 a8 e8 24 11 31 fe 41 80 e4 01 75 15 48 c7 c7 f8 02 be a7 c6 05 a4 0c f4 ff 01 e8 93 45 9f fe <0f> 0b e8 78 f7 ff ff 44 39 eb 0f 94 c0 5b 41 5c 41 5d 41 5e 41 5f
RSP: 0000:ffffffffa8403db8 EFLAGS: 00010046 ORIG_RAX: 0000000000000000
RAX: 0000000000000000 RBX: 0000000000000a00 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffffffa8403de0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000007
FS: 0000000000000000(0000) GS:ffffffffa8811000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff88800008b000 CR3: 00000002566d6000 CR4: 00000000000406a0
Call Trace:
<TASK>
fpu__init_system_xstate+0x41f/0x607
fpu__init_system+0x155/0x196
early_identify_cpu.constprop.0+0xf8/0x130
early_cpu_init+0x90/0xa3
setup_arch+0x49/0x8ab
start_kernel+0x6c/0x4e4
x86_64_start_reservations+0x24/0x2c
x86_64_start_kernel+0xee/0x103
secondary_startup_64_no_verify+0xe5/0xeb
</TASK>
---[ end trace 0000000000000000 ]---
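As a result, the AVX feature is not enabled in the kernel, so programs that require it fail to run (for example, MongoDB needs the AVX instruction set). The problem does not reproduce when HWE is disabled and the guest is downgraded to the GA kernel (5.15.110-0515110-generic).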
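One way to confirm the symptom from inside the guest is to check whether the kernel still lists the `avx` flag in /proc/cpuinfo. Below is a minimal Python sketch, assuming an x86 Linux guest with Python 3.9 or later; the helper names `cpu_flags` and `has_flag` are made up for illustration and are not part of any existing tool.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature flags of the first CPU found in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # On x86 the feature list is on the line whose key is "flags".
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_flag(flag: str, cpuinfo_path: str = "/proc/cpuinfo") -> bool:
    """Check whether the running kernel exposes the given CPU feature flag."""
    return flag in cpu_flags(Path(cpuinfo_path).read_text())
```

Calling `has_flag("xsave")` and `has_flag("avx")` under both the 5.19 HWE kernel and the 5.15 kernel should make the difference visible: if the diagnosis above is right, both would be expected to return `False` on 5.19 and `True` on 5.15.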
I suspect a regression in the 5.19 kernel, which may be stricter about the XSAVE CPUID features emulated by QEMU. The full diagnosis performed so far is here: https://github.com/orange-cloudfoundry/paas-templates/issues/1960#issuecomment-1534484042
Should I open a bug at https://bugs.launchpad.net/ubuntu/+source/linux-meta-hwe-5.19 so that the Ubuntu kernel maintainers can help me collect further symptoms?
Note: I do not have direct access to the KVM hypervisor host.