Ubuntu 22.04.1 crashes with the message "rcu_preempt detected expedited stalls on CPUs" on the terminal


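I recently installed Ubuntu 22.04.1 on an old computer, but it crashes several times a day. At first I suspected a motherboard incompatibility, so I swapped the motherboard from a GIGABYTE B450M Aorus Elite to an MSI B550M PRO-VDH WIFI and reinstalled Linux. That did help: with the old motherboard I could not even install Ubuntu and had to switch to Debian, and now the system crashes less often each day.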

By the way, this computer runs fine under Windows 11. I have run MEMTEST86 four times and it passed every time.

The full dmesg output is here: https://www.mediafire.com/file/e4yw64m63o16urk/dmesg.txt/file

Hardware probe: https://linux-hardware.org/?probe=9b7f62afeb

Here is part of the dmesg output:

[  460.620427] watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [argocd-applicat:10862]
[  460.621697] Modules linked in: ipt_REJECT nf_reject_ipv4 xt_multiport xt_mark xt_comment rfcomm veth xt_nat xt_tcpudp xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo xt_addrtype nft_compat nf_tables libcrc32c nfnetlink br_netfilter bridge stp llc cmac algif_hash algif_skcipher af_alg bnep overlay binfmt_misc nvidia_uvm(POE) intel_rapl_msr intel_rapl_common edac_mce_amd nvidia_drm(POE) nvidia_modeset(POE) kvm irqbypass crct10dif_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 aesni_intel snd_hda_codec_realtek crypto_simd snd_hda_codec_generic cryptd iwlmvm snd_hda_codec_hdmi ledtrig_audio nvidia(POE) snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi mac80211 snd_hda_codec snd_hda_core snd_hwdep libarc4 snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi btusb iwlwifi btrtl btbcm snd_seq btintel btmtk snd_seq_device nls_iso8859_1 rapl wmi_bmof snd_timer bluetooth input_leds k10temp cfg80211 ccp snd
[  460.621798]  joydev drm_kms_helper ecdh_generic soundcore ecc video mac_hid sch_fq_codel msr parport_pc ppdev lp parport drm efi_pstore ip_tables x_tables autofs4 hid_logitech_hidpp hid_logitech_dj hid_generic usbhid hid ucsi_ccg typec_ucsi typec nvme crc32_pclmul r8169 nvme_core i2c_piix4 i2c_nvidia_gpu ahci i2c_ccgx_ucsi xhci_pci realtek libahci xhci_pci_renesas nvme_common wmi gpio_amdpt
[  460.621847] CPU: 1 PID: 10862 Comm: argocd-applicat Tainted: P           OE      6.5.0-27-generic #28~22.04.1-Ubuntu
[  460.621850] Hardware name: Micro-Star International Co., Ltd. MS-7C95/B550M PRO-VDH WIFI (MS-7C95), BIOS 2.D0 12/12/2022
[  460.621852] RIP: 0010:native_queued_spin_lock_slowpath+0x26e/0x300
[  460.621859] Code: 81 c5 00 3f 03 00 49 81 ff ff 1f 00 00 0f 87 91 00 00 00 4e 03 2c fd 40 5b c9 a6 4d 89 65 00 41 8b 44 24 08 85 c0 75 0b f3 90 <41> 8b 44 24 08 85 c0 74 f5 49 8b 14 24 48 85 d2 74 05 0f 0d 0a eb
[  460.621862] RSP: 0018:ffffb7695489bcd8 EFLAGS: 00000246
[  460.621864] RAX: 0000000000000000 RBX: ffffa0dd64062308 RCX: 0000000000000000
[  460.621867] RDX: 00000000000005ff RSI: 0000000017fdefd9 RDI: ffffa0dd64062308
[  460.621868] RBP: ffffb7695489bd00 R08: 0000000000000000 R09: 0000000000000000
[  460.621870] R10: 0000000000000000 R11: 0000000000000000 R12: ffffa0e34ea73f00
[  460.621872] R13: ffffffffa787cf20 R14: 0000000000080000 R15: 00000000000005fe
[  460.621874] FS:  000000c000100090(0000) GS:ffffa0e34ea40000(0000) knlGS:0000000000000000
[  460.621876] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  460.621878] CR2: 000000c00035f6e0 CR3: 00000001f210c000 CR4: 0000000000350ee0
[  460.621880] Call Trace:
[  460.621882]  <IRQ>
[  460.621886]  ? show_regs+0x6d/0x80
[  460.621891]  ? watchdog_timer_fn+0x1d8/0x240
[  460.621895]  ? __pfx_watchdog_timer_fn+0x10/0x10
[  460.621898]  ? __hrtimer_run_queues+0x112/0x2a0
[  460.621902]  ? srso_return_thunk+0x5/0x10
[  460.621906]  ? hrtimer_interrupt+0xf6/0x250
[  460.621911]  ? __sysvec_apic_timer_interrupt+0x62/0x140
[  460.621916]  ? sysvec_apic_timer_interrupt+0x8d/0xd0
[  460.621919]  </IRQ>
[  460.621921]  <TASK>
[  460.621923]  ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
[  460.621931]  ? native_queued_spin_lock_slowpath+0x26e/0x300
[  460.621936]  _raw_spin_lock+0x3f/0x60
[  460.621940]  __mutex_lock.constprop.0+0x261/0x7a0
[  460.621942]  ? srso_return_thunk+0x5/0x10
[  460.621945]  ? _raw_spin_unlock_bh+0x1d/0x30
[  460.621950]  __mutex_lock_slowpath+0x13/0x20
[  460.621953]  mutex_lock+0x3c/0x50
[  460.621956]  do_epoll_ctl+0x30d/0x860
[  460.621961]  ? inet_dgram_connect+0x42/0xe0
[  460.621967]  __x64_sys_epoll_ctl+0x6e/0xb0
[  460.621971]  do_syscall_64+0x5b/0x90
[  460.621977]  ? srso_return_thunk+0x5/0x10
[  460.621979]  ? exit_to_user_mode_prepare+0x30/0xb0
[  460.621982]  ? srso_return_thunk+0x5/0x10
[  460.621985]  ? syscall_exit_to_user_mode+0x37/0x60
[  460.621987]  ? srso_return_thunk+0x5/0x10
[  460.621989]  ? do_syscall_64+0x67/0x90
[  460.621992]  ? do_syscall_64+0x67/0x90
[  460.621995]  ? sysvec_apic_timer_interrupt+0x4b/0xd0
[  460.621999]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[  460.622002] RIP: 0033:0x40720e
[  460.622023] Code: 48 83 ec 38 e8 13 00 00 00 48 83 c4 38 5d c3 cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 f2 48 89 fa 48 89 ce 48 89 df 0f 05 <48> 3d 01 f0 ff ff 76 15 48 f7 d8 48 89 c1 48 c7 c0 ff ff ff ff 48
[  460.622026] RSP: 002b:000000c0034cba90 EFLAGS: 00000246 ORIG_RAX: 00000000000000e9
[  460.622029] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 000000000040720e
[  460.622030] RDX: 000000000000000b RSI: 0000000000000001 RDI: 0000000000000003
[  460.622032] RBP: 000000c0034cbae8 R08: 0000000000000000 R09: 0000000000000000
[  460.622034] R10: 000000c0034cbad4 R11: 0000000000000246 R12: 0000000000000000
[  460.622035] R13: 000000c000100000 R14: 000000c00277e9c0 R15: 000000000000006c

Thanks in advance.

Answer 1

I strongly suspect a hardware problem: the stack trace printed after each crash is different, so it is probably not a software issue.
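To check whether the traces really differ from crash to crash, the lockup reports can be pulled out of a saved log and compared. The following is only a minimal sketch: the function name `summarize_lockups` is hypothetical, and it assumes each report follows the layout shown in the question (a "soft lockup" headline followed by a kernel-mode `RIP: 0010:` line).

```python
import re
import sys
from collections import Counter

# Headline printed by the watchdog, e.g.
# "watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [argocd-applicat:10862]"
LOCKUP_RE = re.compile(
    r"soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^\]]*):(?P<pid>\d+)\]"
)
# Kernel-mode instruction pointer, e.g.
# "RIP: 0010:native_queued_spin_lock_slowpath+0x26e/0x300"
# (the user-mode "RIP: 0033:0x..." line is deliberately not matched).
RIP_RE = re.compile(
    r"RIP: 0010:(?P<symbol>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def summarize_lockups(lines):
    """Return one dict per soft-lockup report found in the given log lines."""
    reports = []
    current = None
    for line in lines:
        headline = LOCKUP_RE.search(line)
        if headline:
            current = {
                "cpu": int(headline["cpu"]),
                "secs": int(headline["secs"]),
                "comm": headline["comm"],
                "symbol": None,
            }
            reports.append(current)
            continue
        rip = RIP_RE.search(line)
        # Keep only the first kernel RIP that follows each headline.
        if rip and current is not None and current["symbol"] is None:
            current["symbol"] = rip["symbol"]
    return reports


if __name__ == "__main__":
    # Usage: python3 lockups.py dmesg.txt   (or pipe a log on stdin)
    source = open(sys.argv[1], errors="replace") if len(sys.argv) > 1 else sys.stdin
    with source:
        found = summarize_lockups(source)
    for r in found:
        print(f"CPU#{r['cpu']} {r['comm']} stuck {r['secs']}s in {r['symbol']}")
    print(Counter(r["symbol"] for r in found).most_common())
```

If the same kernel symbol dominates every crash, a software bug becomes more plausible; if the CPUs, tasks, and symbols are scattered, that is consistent with (though not proof of) flaky hardware.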

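After some adjustment and experimentation with the hardware settings in the BIOS, the culprit turned out to be the CPU, a Ryzen 3600X. By default it runs at a base frequency of 3.8 GHz, but Linux seems to be very unstable at 3.8 GHz, even after manually raising the fan speed and turning off the CPU boost setting.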

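Finally, after setting the base frequency to 3.6 GHz (the same as its sibling, the Ryzen 3600), it runs smoothly and the system has not crashed since. Incidentally, Precision Boost Overdrive did not need to be turned off, and the fan settings were left at their defaults.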
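After changing a frequency setting in the BIOS, it can be useful to see what Linux itself reports. The sketch below reads the standard cpufreq files under /sys/devices/system/cpu; the function name `read_cpufreq` is hypothetical, which files exist depends on the cpufreq driver in use, and the values are the driver's view of each core, which may not map one-to-one onto the BIOS base-clock setting.

```python
from pathlib import Path

# Standard cpufreq sysfs attributes; frequencies are reported in kHz.
CPUFREQ_FILES = (
    "cpuinfo_min_freq",
    "cpuinfo_max_freq",
    "scaling_cur_freq",
    "scaling_governor",
)


def read_cpufreq(sysfs_root="/sys/devices/system/cpu"):
    """Return {cpu_name: {attribute: value}} with frequencies converted to GHz."""
    result = {}
    for policy_dir in Path(sysfs_root).glob("cpu[0-9]*/cpufreq"):
        values = {}
        for name in CPUFREQ_FILES:
            try:
                text = (policy_dir / name).read_text().strip()
            except OSError:
                continue  # attribute missing or unreadable with this driver
            # Numeric attributes are kHz; keep strings (the governor) as-is.
            values[name] = int(text) / 1e6 if text.isdigit() else text
        result[policy_dir.parent.name] = values
    return result


if __name__ == "__main__":
    info = read_cpufreq()
    # Sort numerically so cpu10 comes after cpu9.
    for cpu in sorted(info, key=lambda n: int(n[3:])):
        print(cpu, info[cpu])
```

Running it a few times under load gives a rough picture of the clocks the cores actually reach.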
