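I bought a new mini PC to use as my self-hosted server, and every so often it hangs without fully crashing.
That means I can still ssh in, but containers/docker etc. are unresponsive.
sudo reboot mostly hangs as well, because it waits forever for something to terminate, so all I can do is physically power-cycle the machine.
Here is what journalctl showed the last time this happened: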
kernel: general protection fault, probably for non-canonical address 0xffdfd5e688d2e150: 0000 [#1] SMP NOPTI
kernel: CPU: 3 PID: 651408 Comm: 5 Not tainted 5.15.0-46-generic #49-Ubuntu
kernel: Hardware name: AZW Gemini M/Gemini M, BIOS 5.13 11/13/2020
kernel: RIP: 0010:rmqueue+0x44a/0xbb0
kernel: Code: 04 49 8b 46 10 49 01 db 49 39 c3 0f 84 33 01 00 00 48 be 00 01 00 00 00 00 ad de 49 8b 46 10 48 8b 08 48 8b 50 08 4c 8d 40 f8 <48> 89 51 08 48 89 0a 48 89 30 48 83 c6 22 48 89 70 08 41 83 fd 1f
kernel: RSP: 0000:ffffabf78101bb60 EFLAGS: 00010097
kernel: RAX: ffffd5e688c3b2c8 RBX: ffff9b85f7db6580 RCX: ffdfd5e688d2e148
kernel: RDX: ffff9b85f7db65a0 RSI: dead000000000100 RDI: ffff9b85fffd6b80
kernel: RBP: ffffabf78101bc30 R08: ffffd5e688c3b2c0 R09: 0000000000000001
kernel: R10: 0000000000000293 R11: ffff9b85f7db65a0 R12: ffff9b85fffd6b80
kernel: R13: 0000000000000000 R14: ffff9b85f7db6590 R15: 000000000002d588
kernel: FS: 00007f1eacd0c740(0000) GS:ffff9b85f7d80000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 000055d38ad66000 CR3: 000000010eb0e000 CR4: 0000000000350ee0
kernel: Call Trace:
kernel: <TASK>
kernel: ? __mod_memcg_lruvec_state+0x63/0xe0
kernel: ? xas_load+0x17/0xd0
kernel: get_page_from_freelist+0xd1/0x520
kernel: __alloc_pages+0x17e/0x330
kernel: alloc_pages_vma+0x9d/0x390
kernel: do_fault+0x69/0x2e0
kernel: handle_pte_fault+0x1cd/0x240
kernel: __handle_mm_fault+0x3c7/0x700
kernel: handle_mm_fault+0xd8/0x2c0
kernel: do_user_addr_fault+0x1c9/0x670
kernel: exc_page_fault+0x77/0x170
kernel: asm_exc_page_fault+0x26/0x30
kernel: RIP: 0033:0x7f1eacf75340
kernel: Code: 00 00 49 8b 14 24 41 8b 44 24 08 4c 01 d2 48 83 f8 26 74 0a 48 83 f8 08 0f 85 ca 1a 00 00 49 8b 44 24 10 49 83 c4 18 4c 01 d0 <48> 89 02 4c 39 e3 77 d0 49 8b 83 d0 01 00 00 48 89 85 60 ff ff ff
kernel: RSP: 002b:00007ffed18130a0 EFLAGS: 00010206
kernel: RAX: 000055d38a8d89c0 RBX: 000055d38a67df60 RCX: 000055d38a43b910
kernel: RDX: 000055d38ad66000 RSI: 0000000000000000 RDI: 000055d38a67e008
kernel: RBP: 00007ffed18131a0 R08: 000055d38a67ec08 R09: 0000000000000000
kernel: R10: 000055d38a43b000 R11: 00007f1eacf9f2e0 R12: 000055d38a53e7f0
kernel: R13: 0000000000000000 R14: 00007f1eacf9f2e0 R15: 000055d38a43b000
kernel: </TASK>
kernel: Modules linked in: tls veth xt_nat xt_tcpudp xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo nft_counter xt_addrtype nft_compat nf_tables nfnetlink br_netfilter bridge stp llc cmac nls_utf8 cifs cifs_arc4 cifs_md4 fscache netfs overlay intel_rapl_msr snd_sof_pci_intel_apl snd_sof_intel_hda_common mei_hdcp >
kernel: iwlmvm kvm snd_hda_codec btusb mac80211 btrtl rapl btbcm snd_hda_core btintel intel_cstate libarc4 bluetooth snd_hwdep rtsx_usb_ms ecdh_generic iwlwifi ecc memstick snd_pcm mei_me snd_timer serio_raw snd cfg80211 soundcore mei mac_hid sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_devintf ipmi_msghandler msr ramoops reed_solomon pstore_blk pstore_zone mtd efi_pstore>
kernel: ---[ end trace 3b7862dbac1d2138 ]---
kernel: RIP: 0010:rmqueue+0x44a/0xbb0
kernel: Code: 04 49 8b 46 10 49 01 db 49 39 c3 0f 84 33 01 00 00 48 be 00 01 00 00 00 00 ad de 49 8b 46 10 48 8b 08 48 8b 50 08 4c 8d 40 f8 <48> 89 51 08 48 89 0a 48 89 30 48 83 c6 22 48 89 70 08 41 83 fd 1f
kernel: RSP: 0000:ffffabf78101bb60 EFLAGS: 00010097
kernel: RAX: ffffd5e688c3b2c8 RBX: ffff9b85f7db6580 RCX: ffdfd5e688d2e148
kernel: RDX: ffff9b85f7db65a0 RSI: dead000000000100 RDI: ffff9b85fffd6b80
kernel: RBP: ffffabf78101bc30 R08: ffffd5e688c3b2c0 R09: 0000000000000001
kernel: R10: 0000000000000293 R11: ffff9b85f7db65a0 R12: ffff9b85fffd6b80
kernel: R13: 0000000000000000 R14: ffff9b85f7db6590 R15: 000000000002d588
kernel: FS: 00007f1eacd0c740(0000) GS:ffff9b85f7d80000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 000055d38ad66000 CR3: 000000010eb0e000 CR4: 0000000000350ee0
Any ideas?
Thanks!
Answer 1
Yep, as @ArturMeinild pointed out, a few memtest86 runs turned up some errors.
RIP.
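
For what it's worth, the trace itself is consistent with bad RAM. The faulting address 0xffdfd5e688d2e150 and RCX (ffdfd5e688d2e148) both start with ffdf, while the neighbouring pointers in RAX and R08 start with ffff. Here is a rough Python sketch that checks this. It assumes 48-bit virtual addresses (4-level paging), and the helper names `is_canonical` and `flipped_bits` are made up for illustration. A single flipped bit is only a hint, not proof; memtest86 is what actually confirms it.

```python
# Assumption: 48-bit virtual addresses (4-level paging). With 5-level
# paging the width would be 57 instead.
VA_BITS = 48
MASK64 = (1 << 64) - 1


def is_canonical(addr: int, va_bits: int = VA_BITS) -> bool:
    """True if bits 63..(va_bits-1) are all identical (sign-extended)."""
    top = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


def flipped_bits(addr: int, va_bits: int = VA_BITS) -> list[int]:
    """Bit positions where addr differs from the canonical address
    obtained by sign-extending its low va_bits bits."""
    low_mask = (1 << va_bits) - 1
    low = addr & low_mask
    sign = (addr >> (va_bits - 1)) & 1
    if sign:
        canonical = low | (MASK64 ^ low_mask)
    else:
        canonical = low
    diff = addr ^ canonical
    return [i for i in range(64) if (diff >> i) & 1]


if __name__ == "__main__":
    fault = 0xFFDFD5E688D2E150  # address from the "general protection fault" line
    rax = 0xFFFFD5E688C3B2C8    # a healthy-looking neighbour pointer (RAX)
    print(is_canonical(rax))    # True
    print(is_canonical(fault))  # False
    print(flipped_bits(fault))  # [53]
```

Under that assumption the bad pointer is exactly one bit (bit 53) away from a valid kernel address, which is the kind of corruption a flaky memory cell tends to produce.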