CPU stalls in Debian Bullseye

My development machine frequently suffers from CPU stalls.

The error occurs randomly. It is more common at boot, but it can happen at any time, especially when the CPU is under heavy load.

Here is one instance of the error, as reported by dmesg:

[    6.239299] virtio_ring: module verification failed: signature and/or required key missing - tainting kernel
[   27.246141] rcu: INFO: rcu_preempt self-detected stall on CPU
[   27.247073] rcu:     2-...!: (5250 ticks this GP) idle=693/1/0x4000000000000000 softirq=403/403 fqs=15 
[   27.248277]  (t=5251 jiffies g=381 q=2544)
[   27.248290] rcu: rcu_preempt kthread starved for 5014 jiffies! g381 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=3
[   27.249569] rcu:     Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[   27.250733] rcu: RCU grace-period kthread stack dump:
[   27.251460] task:rcu_preempt     state:R  running task     stack:    0 pid:   14 ppid:     2 flags:0x00004000
[   27.251506] Call Trace:
[   27.251526]  <TASK>
[   27.251588]  __schedule+0x302/0x9b0
[   27.251690]  schedule+0x4e/0xc0
[   27.251692]  schedule_timeout+0x88/0x150
[   27.251711]  ? __bpf_trace_tick_stop+0x10/0x10
[   27.251721]  rcu_gp_fqs_loop+0xfc/0x380
[   27.251769]  ? rcu_gp_init+0x550/0x550
[   27.251771]  rcu_gp_kthread+0xa7/0x130
[   27.251773]  kthread+0x16b/0x190
[   27.251797]  ? set_kthread_struct+0x40/0x40
[   27.251799]  ret_from_fork+0x22/0x30
[   27.251840]  </TASK>
[   27.251842] rcu: Stack dump where RCU GP kthread last ran:
[   27.252622] Sending NMI from CPU 2 to CPUs 3:
[   27.252660] NMI backtrace for cpu 3
[   27.252662] CPU: 3 PID: 132 Comm: systemd-udevd Tainted: G            E     5.16.0snappy-dirty #15
[   27.252664] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[   27.252665] RIP: 0010:strnlen+0x17/0x30
[   27.252685] Code: 38 00 75 f7 48 29 f8 c3 31 c0 c3 0f 1f 84 00 00 00 00 00 48 8d 14 37 48 89 f8 48 85 f6 75 0b eb 19 48 83 c0 01 48 39 c2 74 09 <80> 38 00 75 f2 48 29 f8 c3 48 89 d0 48 29 f8 c3 31 c0 c3 66 0f 1f
[   27.252686] RSP: 0018:ffffc90000463d38 EFLAGS: 00010202
[   27.252688] RAX: ffffffffc0020298 RBX: 0000000000000007 RCX: ffffc9000079d3d0
[   27.252689] RDX: ffffffffc00202d0 RSI: 0000000000000038 RDI: ffffffffc0020298
[   27.252690] RBP: 0000000000000001 R08: ffffffffc0039de8 R09: ffffffffc0037000
[   27.252691] R10: 0000000000000000 R11: 0000000000000003 R12: ffffffffc0020280
[   27.252691] R13: ffffffffc00341d8 R14: ffffffffc0020298 R15: 000000000000002d
[   27.252692] FS:  00007f10066628c0(0000) GS:ffff88817bd80000(0000) knlGS:0000000000000000
[   27.252693] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   27.252694] CR2: 00007f100665acd7 CR3: 000000017bcd0002 CR4: 0000000000370ee0
[   27.252756] Call Trace:
[   27.252794]  <TASK>
[   27.252794]  find_module_all+0x59/0xa0
[   32.217835] watchdog: BUG: soft lockup - CPU#0 stuck for 27s! [systemd-udevd:134]
[   32.218993] Modules linked in: scsi_common(E) i2c_smbus(E)
[   32.221866] watchdog: BUG: soft lockup - CPU#1 stuck for 27s! [systemd-udevd:145]
[   32.223178] Modules linked in: scsi_common(E) i2c_smbus(E) virtio(E+)
[  246.844676]  virtio(E+)
[  246.844967]  virtio_ring(E)
[  246.845011]  virtio_ring(E)
[  246.845161] CPU: 0 PID: 134 Comm: systemd-udevd Tainted: G            E     5.16.0snappy-dirty #15
[  246.845162] CPU: 1 PID: 145 Comm: systemd-udevd Tainted: G            E     5.16.0snappy-dirty #15
[  246.845225] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[  246.845228] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[  246.845135]  load_module+0xe08/0x2720
[  246.845322] RIP: 0010:sysfs_kf_seq_show+0x19/0xf0
[  246.845325] RIP: 0010:sysfs_kf_seq_show+0x19/0xf0
[  246.845333]  ? kernel_read_file_from_fd+0x51/0x90
[  246.845365]  __do_sys_finit_module+0xae/0x110
[  246.845387] Code: 5d 41 5c c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 55 53 48 8b 77 70 48 89 fb 48 8b 06 48 8b 40 08 4c 8b 40 60 <49> 8b 68 28 48 85 ed 74 04 48 8b 6d 08 48 83 7d 00 00 0f 84 aa 00
[  246.845389] Code: 5d 41 5c c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 55 53 48 8b 77 70 48 89 fb 48 8b 06 48 8b 40 08 4c 8b 40 60 <49> 8b 68 28 48 85 ed 74 04 48 8b 6d 08 48 83 7d 00 00 0f 84 aa 00
[  246.845390] RSP: 0018:ffffc900004c3dd8 EFLAGS: 00010203
[  246.845391] RSP: 0018:ffffc90000473dd8 EFLAGS: 00010203
[  246.845392] RAX: ffff88814c42b500 RBX: ffff8881537de258 RCX: 0000000000000001
[  246.845398] RDX: ffff88810035c300 RSI: ffff88810035c300 RDI: ffff8881537de258
[  246.845393] RAX: ffff88814c42b500 RBX: ffff888161c9d690 RCX: 0000000000000001
[  246.845399] RBP: 0000000000000000 R08: ffffffffc00202d0 R09: ffff888161c0c000
[  246.845368]  do_syscall_64+0x3b/0xc0
[  246.845400] RDX: ffff8881008f2900 RSI: ffff8881008f2900 RDI: ffff888161c9d690
[  246.845401] R10: 0000000000020000 R11: 0000000000000000 R12: ffffc900004c3e80
[  246.845401] RBP: 0000000000000000 R08: ffffffffc00202d0 R09: ffff8881008bc000
[  246.845402] R10: 0000000000020000 R11: 0000000000000000 R12: ffffc90000473e80
[  246.845402] R13: ffffc900004c3e58 R14: ffff8881537de280 R15: 0000000000000001
[  246.845403] R13: ffffc90000473e58 R14: ffff888161c9d6b8 R15: 0000000000000001
[  246.845411] FS:  00007f10066628c0(0000) GS:ffff88817bc00000(0000) knlGS:0000000000000000
[  246.845411] FS:  00007f10066628c0(0000) GS:ffff88817bc80000(0000) knlGS:0000000000000000
[  246.845413] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  246.845413] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  246.845414] CR2: 00007f1006661b0f CR3: 000000017bce6004 CR4: 0000000000370ee0
[  246.845414] CR2: 00007f1006661b0f CR3: 000000017bcd4006 CR4: 0000000000370ef0
[  246.845419]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  246.845471] Call Trace:
[  246.845472] Call Trace:
[  246.845471] RIP: 0033:0x7f1006b1b9b9
[  246.845529]  <TASK>
[  246.845530]  <TASK>
[  246.845540]  seq_read_iter+0x11c/0x450
[  246.845541]  seq_read_iter+0x11c/0x450
[  246.845575]  new_sync_read+0x118/0x1a0
[  246.845575]  new_sync_read+0x118/0x1a0
[  246.845597] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a7 54 0c 00 f7 d8 64 89 01 48
[  246.845599] RSP: 002b:00007ffe5105cc18 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[  246.845601] RAX: ffffffffffffffda RBX: 000055d52bbf3ea0 RCX: 00007f1006b1b9b9
[  246.845602] RDX: 0000000000000000 RSI: 00007f1006ca6e2d RDI: 0000000000000005
[  246.845603] RBP: 0000000000020000 R08: 0000000000000000 R09: 000055d52bbe0290
[  246.845604] R10: 0000000000000005 R11: 0000000000000246 R12: 00007f1006ca6e2d
[  246.845604] R13: 0000000000000000 R14: 000055d52bbdeee0 R15: 000055d52bbf3ea0
[  246.845604]  vfs_read+0xf2/0x190
[  246.845604]  vfs_read+0xf2/0x190
[  246.845606]  </TASK>
[  246.845619]  ksys_read+0x5f/0xe0
[  246.845619]  ksys_read+0x5f/0xe0
[  246.845622]  do_syscall_64+0x3b/0xc0
[  246.845622]  do_syscall_64+0x3b/0xc0
[  246.845624]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  246.845630]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  246.845633] RIP: 0033:0x7f1006bfb04e
[  246.845634] RIP: 0033:0x7f1006bfb04e
[  246.845648] Code: 0f 1f 40 00 48 8b 15 79 9f 00 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb ba 0f 1f 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28
[  246.845649] Code: 0f 1f 40 00 48 8b 15 79 9f 00 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb ba 0f 1f 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28
[  246.845650] RSP: 002b:00007ffe5105bb28 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[  246.845652] RSP: 002b:00007ffe5105bb28 EFLAGS: 00000246
[  246.845653] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1006bfb04e
[  246.845653]  ORIG_RAX: 0000000000000000
[  246.845654] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1006bfb04e
[  246.845654] RDX: 000000000000001f RSI: 00007ffe5105bbf0 RDI: 000000000000000c
[  246.845655] RBP: 00007ffe5105bbf0 R08: 00000000ffffffff R09: 00007ffe5105b9f0
[  246.845655] RDX: 000000000000001f RSI: 00007ffe5105bbf0 RDI: 0000000000000005
[  246.845656] RBP: 00007ffe5105bbf0 R08: 00000000ffffffff R09: 00007ffe5105b9f0
[  246.845656] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c
[  246.845657] R13: 00007ffe5105bbf0 R14: 000000000000001f R15: 00007ffe5105bbf0
[  246.845658] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000005
[  246.845659] R13: 00007ffe5105bbf0 R14: 000000000000001f R15: 00007ffe5105bbf0
[  246.845660]  </TASK>
[  246.845661]  </TASK>
[  246.845664] NMI backtrace for cpu 2
[  246.845674] CPU: 2 PID: 135 Comm: systemd-udevd Tainted: G            EL    5.16.0snappy-dirty #15
[  246.845677] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[  246.845686] Call Trace:
[  246.845689] INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 219592.992 msecs
[  246.845783]  <IRQ>
[  246.845801]  dump_stack_lvl+0x48/0x5e
[  246.845825]  nmi_cpu_backtrace.cold+0x30/0x77
[  246.845828]  ? lapic_can_unplug_cpu+0x80/0x80
[  246.845830]  nmi_trigger_cpumask_backtrace+0x104/0x130
[  246.845869]  rcu_dump_cpu_stacks+0xd8/0x100
[  246.845917]  rcu_sched_clock_irq.cold+0x60/0x2f3
[  246.845966]  ? sched_slice+0x74/0x130
[  246.845974]  ? perf_event_task_tick+0x6c/0x3d0
[  246.845994]  update_process_times+0x93/0xc0
[  246.846017]  tick_sched_handle+0x22/0x60
[  246.846039]  tick_sched_timer+0x84/0xb0
[  246.846041]  ? can_stop_idle_tick+0xd0/0xd0
[  246.846043]  __hrtimer_run_queues+0x12a/0x2c0
[  246.846053]  hrtimer_interrupt+0x106/0x220
[  246.846062]  __sysvec_apic_timer_interrupt+0x7f/0x160
[  246.846082]  sysvec_apic_timer_interrupt+0x9d/0xd0
[  246.846100]  </IRQ>
[  246.846101]  <TASK>
[  246.846101]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[  246.846143] RIP: 0010:bus_register+0x45/0x260
[  246.846186] Code: 4b b7 00 53 e8 dc 49 c8 ff 48 85 c0 0f 84 22 02 00 00 48 89 a8 28 01 00 00 48 89 c3 48 8d b8 f0 00 00 00 48 c7 c2 d0 02 0a 83 <48> 89 85 a0 00 00 00 48 c7 c6 ed 2e 17 82 4c 8d 6b 18 e8 d4 69 ab
[  246.846188] RSP: 0018:ffffc9000047bdc8 EFLAGS: 00010286
[  246.846189] RAX: ffff888101784200 RBX: ffff888101784200 RCX: 0000000000000000
[  246.846190] RDX: ffffffff830a02d0 RSI: ffffffff81639854 RDI: ffff8881017842f0
[  246.846191] RBP: ffffffffc0020000 R08: 0000000000000200 R09: ffff888101784200
[  246.846192] R10: ffff88817bc3c100 R11: 0000000000000000 R12: ffff88817bc3aba0
[  246.846193] R13: ffffc9000047be88 R14: 0000000000000005 R15: 0000000000000000
[  246.846195]  ? bus_register+0x24/0x260
[  246.846238]  ? bus_register+0x24/0x260
[  246.846241]  ? virtio_dev_probe+0x240/0x240 [virtio]
[  246.846278]  virtio_init+0x11/0x1c [virtio]
[  246.846303]  do_one_initcall+0x44/0x200
[  246.846361]  ? kmem_cache_alloc_trace+0x2f1/0x3f0
[  246.846369]  do_init_module+0x5c/0x240
[  246.846372]  __do_sys_finit_module+0xae/0x110
[  246.846374]  do_syscall_64+0x3b/0xc0
[  246.846376]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  246.846379] RIP: 0033:0x7f1006b1b9b9
[  246.846395] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a7 54 0c 00 f7 d8 64 89 01 48
[  246.846397] RSP: 002b:00007ffe5105cbf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[  246.846399] RAX: ffffffffffffffda RBX: 000055d52bbdeee0 RCX: 00007f1006b1b9b9
[  246.846400] RDX: 0000000000000000 RSI: 00007f1006ca6e2d RDI: 0000000000000005
[  246.846401] RBP: 0000000000020000 R08: 0000000000000000 R09: 000055d52bbdeee0
[  246.846402] R10: 0000000000000005 R11: 0000000000000246 R12: 00007f1006ca6e2d
[  246.846403] R13: 0000000000000000 R14: 000055d52bbf8540 R15: 000055d52bbdeee0
[  246.846405]  </TASK>

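If I leave the machine running, it usually manages to "recover" after a long time. This time it took about 4 minutes. This is probably due to a kernel timeout.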
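To put a number on the recovery time, the timestamps in the log can be compared directly. The following is a minimal sketch, assuming the default dmesg timestamp prefix (`[seconds.microseconds]`). The function names `parse`, `largest_gap`, and `soft_lockups` are made up for illustration, and `SAMPLE` is just a few lines copied from the log above. It only summarizes what the log already says and does not diagnose the cause.

```python
import re

# Default dmesg prefix: "[  246.845664] message"
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")
# Watchdog message: "soft lockup - CPU#0 stuck for 27s!"
LOCKUP_RE = re.compile(r"soft lockup - CPU#(\d+) stuck for (\d+)s")

# A few lines copied verbatim from the log above.
SAMPLE = """\
[   27.246141] rcu: INFO: rcu_preempt self-detected stall on CPU
[   32.217835] watchdog: BUG: soft lockup - CPU#0 stuck for 27s! [systemd-udevd:134]
[   32.221866] watchdog: BUG: soft lockup - CPU#1 stuck for 27s! [systemd-udevd:145]
[   32.223178] Modules linked in: scsi_common(E) i2c_smbus(E) virtio(E+)
[  246.844676]  virtio(E+)
[  246.845689] INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 219592.992 msecs
"""


def parse(lines):
    """Return a list of (timestamp_seconds, message) for timestamped lines."""
    entries = []
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            entries.append((float(m.group(1)), m.group(2)))
    return entries


def largest_gap(entries):
    """Return (gap, before, after) for the biggest forward jump in timestamps.

    Backward jumps (interleaved output from several CPUs) are ignored.
    """
    best = (0.0, None, None)
    for (t0, _), (t1, _) in zip(entries, entries[1:]):
        if t1 - t0 > best[0]:
            best = (t1 - t0, t0, t1)
    return best


def soft_lockups(entries):
    """Return (cpu, seconds_stuck) for every soft lockup report."""
    found = []
    for _, msg in entries:
        m = LOCKUP_RE.search(msg)
        if m:
            found.append((int(m.group(1)), int(m.group(2))))
    return found


entries = parse(SAMPLE.splitlines())
gap, before, after = largest_gap(entries)
print(f"largest gap: {gap:.1f}s (between {before} and {after})")
for cpu, secs in soft_lockups(entries):
    print(f"CPU#{cpu} reported stuck for {secs}s")
```

On the log above, the largest jump is from 32.223178 to 246.844676, roughly 215 seconds with no kernel messages. That is in line with the 219592.992 msecs reported for the NMI handler and with the "about 4 minutes" I observed.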

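What could be causing this kind of problem? My machine is a KVM VM running inside a VMware Workstation VM (two levels of virtualization). It runs Debian bullseye. The VM that produced this log uses a custom 5.16 kernel, but the same error also occurs with the stock kernel.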
