Linux system freezes completely: "CPU stuck for 22s" / hardware error?

My Ryzen 5 PC freezes frequently. The problem occurs across different distributions and kernels.

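The freezes mostly happen when I try to suspend the system to RAM, but changes to the display settings (for example changing the screen resolution or disabling a monitor) also cause freezes. A stuck CPU is also reported when I shut the PC down.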

When a freeze is triggered by a display change, the computer usually hangs for several seconds and then starts working again after a while.

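When the freeze happens on suspend, the computer reboots completely, and during startup I get error messages like the ones in the picture [image: suspend freeze].

Like this: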

[Hardware Error]: CPU 5: Machine Check: 0 Bank 5: bea0000000

Strangely, this seems to happen at random. Right after the system has booted, I do not run into these problems.

This is the kernel log I captured when suspend-to-RAM froze and the machine rebooted completely:

-- Journal begins at Wed 2020-12-30 15:47:51 CET, ends at Wed 2021-03-10 12:16:36 CET. --
Mär 09 21:51:56 pwrpc kernel: audit: type=1130 audit(1615323116.616:528): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Mär 09 21:51:56 pwrpc kernel: wlan0: deauthenticating from 7c:ff:4d:55:f7:49 by local choice (Reason: 3=DEAUTH_LEAVING)
Mär 09 21:51:56 pwrpc kernel: rtw_8822be 0000:04:00.0: sta 7c:ff:4d:55:f7:49 with macid 0 left
Mär 09 21:51:56 pwrpc kernel: rtw_8822be 0000:04:00.0: stop vif 80:91:33:7e:93:e5 on port 0
Mär 09 21:51:57 pwrpc kernel: rtw_8822be 0000:04:00.0: start vif 7a:ea:57:19:ee:68 on port 0
Mär 09 21:51:57 pwrpc kernel: rtw_8822be 0000:04:00.0: stop vif 7a:ea:57:19:ee:68 on port 0
Mär 09 21:51:57 pwrpc kernel: rtw_8822be 0000:04:00.0: start vif 80:91:33:7e:93:e5 on port 0
Mär 09 21:51:57 pwrpc kernel: PM: suspend entry (deep)

This is what happens on a display change (kernel log):

-- Journal begins at Wed 2020-12-30 15:47:51 CET, ends at Tue 2021-03-09 19:35:28 CET. --
Mär 09 19:32:01 pwrpc kernel: amdgpu: 
                                last message was failed ret is 0
Mär 09 19:32:04 pwrpc kernel: amdgpu: 
                                failed to send message 145 ret is 0 
Mär 09 19:32:09 pwrpc kernel: amdgpu: 
                                last message was failed ret is 0
Mär 09 19:32:12 pwrpc kernel: amdgpu: 
                                failed to send message 146 ret is 0 
Mär 09 19:32:15 pwrpc kernel: amdgpu: 
                                last message was failed ret is 0
Mär 09 19:32:17 pwrpc kernel: amdgpu: 
                                failed to send message 148 ret is 0 
Mär 09 19:32:23 pwrpc kernel: amdgpu: 
                                last message was failed ret is 0
Mär 09 19:32:25 pwrpc kernel: watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [Xorg:3367]
Mär 09 19:32:25 pwrpc kernel: Modules linked in: rndis_host cdc_ether usbnet mii tun ccm mousedev joydev btusb btrtl btbcm btintel bluetooth uas ecdh_generic usb_storage usbhid ecc dm_mod snd_hda_codec_realtek ext4 snd_hda_codec_generic rtw88_8822be rtw88_8822b ledtrig_audio rtw88_pci snd_hda_codec_hdmi rtw88_core nls_iso8859_1 nls_cp437 vfat snd_hda_intel edac_mce_amd fat snd_intel_dspcfg crc16 mbcache soundwire_intel eeepc_wmi jbd2 soundwire_generic_allocation asus_wmi kvm_amd soundwire_cadence sparse_keymap amdgpu mac80211 video wmi_bmof snd_hda_codec kvm snd_hda_core snd_hwdep soundwire_bus snd_soc_core gpu_sched irqbypass ttm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel cfg80211 drm_kms_helper snd_compress crypto_simd cryptd ac97_bus glue_helper k10temp snd_pcm_dmaengine rapl snd_pcm igb cec snd_timer snd syscopyarea pcspkr sp5100_tco sysfillrect ccp sysimgblt fb_sys_fops i2c_piix4 soundcore i2c_algo_bit rfkill dca rng_core libarc4 wmi gpio_amdpt pinctrl_amd gpio_generic mac_hid
Mär 09 19:32:25 pwrpc kernel:  acpi_cpufreq vboxnetflt(OE) vboxnetadp(OE) vboxdrv(OE) drm pkcs8_key_parser fuse sg agpgart bpf_preload ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq crc32c_intel xhci_pci xhci_pci_renesas
Mär 09 19:32:25 pwrpc kernel: CPU: 7 PID: 3367 Comm: Xorg Tainted: G        W  OEL    5.10.20-1-lts #1
Mär 09 19:32:25 pwrpc kernel: Hardware name: System manufacturer System Product Name/ROG STRIX B450-I GAMING, BIOS 3103 06/17/2020
Mär 09 19:32:25 pwrpc kernel: RIP: 0010:delay_halt_mwaitx+0x1d/0x40
Mär 09 19:32:25 pwrpc kernel: Code: 0f ae f1 c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 48 c7 c0 00 60 00 00 65 48 03 05 13 3d cf 5f 31 d2 48 89 d1 0f 01 fa <bb> ff ff ff ff b8 f0 00 00 00 b9 02 00 00 00 48 39 de 48 0f 46 de
Mär 09 19:32:25 pwrpc kernel: RSP: 0018:ffffb0ba0084f4f0 EFLAGS: 00000246
Mär 09 19:32:25 pwrpc kernel: RAX: ffff93e15ebc6000 RBX: 00004f8187de28e8 RCX: 0000000000000000
Mär 09 19:32:25 pwrpc kernel: RDX: 0000000000000000 RSI: 0000000000000e10 RDI: 00004f8187de28e8
Mär 09 19:32:25 pwrpc kernel: RBP: 0000000000000e10 R08: 0000000000000000 R09: ffffb0ba0084f370
Mär 09 19:32:25 pwrpc kernel: R10: ffffb0ba0084f368 R11: ffff93e17f32b268 R12: 0000000000000095
Mär 09 19:32:25 pwrpc kernel: R13: 0000000000000000 R14: 000000000000ffff R15: ffff93da51bce88c
Mär 09 19:32:25 pwrpc kernel: FS:  00007f2ac269f940(0000) GS:ffff93e15ebc0000(0000) knlGS:0000000000000000
Mär 09 19:32:25 pwrpc kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mär 09 19:32:25 pwrpc kernel: CR2: 000038cf6a17a000 CR3: 0000000129c9e000 CR4: 0000000000350ee0
Mär 09 19:32:25 pwrpc kernel: Call Trace:
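
To make the pattern in the log above easier to compare across boots, here is a minimal Python sketch that pulls the soft lockup reports and the failed amdgpu messages out of saved kernel log text. The function name `summarize` and both regular expressions are my own assumptions, written only against the lines quoted in this post; they are not part of any existing tool, and other kernels may word these messages differently.

```python
import re
from collections import Counter

# Patterns written against the kernel messages quoted in this post.
# They are assumptions and may not match other kernel versions.
SOFT_LOCKUP = re.compile(
    r"soft lockup - CPU#(\d+) stuck for (\d+)s! \[([^:\]]+):(\d+)\]"
)
SMU_FAIL = re.compile(r"failed to send message (\S+) ret is (\S+)")


def summarize(log_text):
    """Collect soft lockup reports and failed amdgpu messages from log text."""
    lockups = []
    for match in SOFT_LOCKUP.finditer(log_text):
        lockups.append(
            {
                "cpu": int(match.group(1)),
                "seconds": int(match.group(2)),
                "comm": match.group(3),
                "pid": int(match.group(4)),
            }
        )
    # Message IDs are kept as strings, since the log does not say
    # which number base they are printed in.
    failed = Counter(match.group(1) for match in SMU_FAIL.finditer(log_text))
    return {"lockups": lockups, "failed_messages": dict(failed)}
```

The log text could come from `journalctl -k` for the current boot or `journalctl -k -b -1` for the previous one, saved to a file and passed to `summarize` as a string. The amdgpu messages are split over two lines in the journal, which is why the sketch searches the whole text instead of going line by line.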

Summary: in short, the freezes are triggered by different events, and the log usually mentions a stuck CPU core.

Any idea what this could be? Am I missing some kernel modules here? Or could this be a hardware-related defect?

Best regards
