Ubuntu Server 20.04 freezes/becomes unresponsive, kernel BUG: Bad page state in process

Server info:
Distributor ID: Ubuntu
Description: Ubuntu 20.04.1 LTS
Release: 20.04
Codename: focal

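Today I found that my new server had crashed/frozen. It no longer accepted any ssh connections, and when I plugged in a keyboard and monitor, nothing happened.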

So I pressed the reset button and waited for it to come back online, which worked. I went straight to journalctl to see what had happened:

Oct 22 06:25:01 j4105 CRON[1200366]: pam_unix(cron:session): session closed for user root
Oct 22 06:33:41 j4105 systemd[1]: Starting Daily apt upgrade and clean activities...
Oct 22 06:33:49 j4105 dbus-daemon[616]: [system] Activating via systemd: service name='org.freedesktop.PackageKit' unit='packagekit.service' requested by ':1.1159' (uid=0 pid=1201070 comm="/usr/bin/gdbus call --system --dest org.freedeskto" label="unconfined")
Oct 22 06:33:49 j4105 systemd[1]: Starting PackageKit Daemon...
Oct 22 06:33:49 j4105 PackageKit[1201073]: daemon start
Oct 22 06:33:49 j4105 dbus-daemon[616]: [system] Successfully activated service 'org.freedesktop.PackageKit'
Oct 22 06:33:49 j4105 systemd[1]: Started PackageKit Daemon.
Oct 22 06:33:57 j4105 kernel: BUG: Bad page state in process jbd2/sda2-8  pfn:14030e
Oct 22 06:33:57 j4105 kernel: page:fffff270c500c380 refcount:0 mapcount:0 mapping:0000000000000000 index:0x2
Oct 22 06:33:57 j4105 kernel: flags: 0x17ffffc0000000()
Oct 22 06:33:57 j4105 kernel: raw: 0017ffffc0000000 dead000000000100 dead000000000122 0000000000000000
Oct 22 06:33:57 j4105 kernel: raw: 0000000000000002 0000000000000000 00000000ffffffff 0000000000000080
Oct 22 06:33:57 j4105 kernel: page dumped because: page still charged to cgroup
Oct 22 06:33:57 j4105 kernel: page->mem_cgroup:0000000000000080
Oct 22 06:33:57 j4105 kernel: Modules linked in: ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs cpuid xt_nat xt_tcpudp veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bpfilter br_netfilter bridge stp llc aufs overlay nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua mei_hdcp intel_telemetry_pltdrv intel_punit_ipc intel_rapl_msr intel_telemetry_core intel_pmc_ipc x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm rapl intel_cstate mei_me processor_thermal_device intel_rapl_common mei intel_soc_dts_iosf int3400_thermal mac_hid acpi_thermal_rel int3403_thermal int3406_thermal dptf_power int340x_thermal_zone sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear i915 i2c_algo_bit drm_kms_helper crct10dif_pclmul crc32_pclmul
Oct 22 06:33:57 j4105 kernel:  ghash_clmulni_intel syscopyarea sysfillrect aesni_intel sysimgblt fb_sys_fops crypto_simd i2c_i801 ahci cryptd r8169 drm glue_helper realtek libahci video pinctrl_geminilake pinctrl_intel
Oct 22 06:33:57 j4105 kernel: CPU: 0 PID: 312 Comm: jbd2/sda2-8 Not tainted 5.4.0-48-generic #52-Ubuntu
Oct 22 06:33:57 j4105 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./J4105-ITX, BIOS P1.40 08/06/2018
Oct 22 06:33:57 j4105 kernel: Call Trace:
Oct 22 06:33:57 j4105 kernel:  dump_stack+0x6d/0x9a
Oct 22 06:33:57 j4105 kernel:  bad_page.cold+0x80/0xb1
Oct 22 06:33:57 j4105 kernel:  check_new_page_bad+0x67/0x80
Oct 22 06:33:57 j4105 kernel:  rmqueue+0xa9e/0xf00
Oct 22 06:33:57 j4105 kernel:  get_page_from_freelist+0x1bb/0x390
Oct 22 06:33:57 j4105 kernel:  __alloc_pages_nodemask+0x173/0x320
Oct 22 06:33:57 j4105 kernel:  alloc_pages_current+0x87/0xe0
Oct 22 06:33:57 j4105 kernel:  alloc_slab_page+0x17b/0x300
Oct 22 06:33:57 j4105 kernel:  allocate_slab+0x7d/0x4b0
Oct 22 06:33:57 j4105 kernel:  ? percpu_counter_add_batch+0x50/0x70
Oct 22 06:33:57 j4105 kernel:  new_slab+0x4a/0x70
Oct 22 06:33:57 j4105 kernel:  ___slab_alloc+0x32c/0x590
Oct 22 06:33:57 j4105 kernel:  ? alloc_buffer_head+0x1f/0x60
Oct 22 06:33:57 j4105 kernel:  ? _ext4_get_block+0xe4/0x110
Oct 22 06:33:57 j4105 kernel:  __slab_alloc+0x20/0x40
Oct 22 06:33:57 j4105 kernel:  kmem_cache_alloc+0x20d/0x230
Oct 22 06:33:57 j4105 kernel:  ? alloc_buffer_head+0x1f/0x60
Oct 22 06:33:57 j4105 kernel:  alloc_buffer_head+0x1f/0x60
Oct 22 06:33:57 j4105 kernel:  jbd2_journal_write_metadata_buffer+0x47/0x3b0
Oct 22 06:33:57 j4105 kernel:  ? bmap+0x1f/0x30
Oct 22 06:33:57 j4105 kernel:  ? jbd2_journal_bmap+0x28/0x50
Oct 22 06:33:57 j4105 kernel:  jbd2_journal_commit_transaction+0x6bd/0x17e8
Oct 22 06:33:57 j4105 kernel:  ? __switch_to_asm+0x40/0x70
Oct 22 06:33:57 j4105 kernel:  kjournald2+0xb6/0x280
Oct 22 06:33:57 j4105 kernel:  ? wait_woken+0x80/0x80
Oct 22 06:33:57 j4105 kernel:  kthread+0x104/0x140
Oct 22 06:33:57 j4105 kernel:  ? commit_timeout+0x20/0x20
Oct 22 06:33:57 j4105 kernel:  ? kthread_park+0x90/0x90
Oct 22 06:33:57 j4105 kernel:  ret_from_fork+0x1f/0x40
Oct 22 06:33:57 j4105 kernel: Disabling lock debugging due to kernel taint
Oct 22 06:33:57 j4105 kernel: BUG: Bad page state in process jbd2/sda2-8  pfn:14031a
Oct 22 06:33:57 j4105 kernel: page:fffff270c500c680 refcount:0 mapcount:0 mapping:0000000000000000 index:0x2
Oct 22 06:33:57 j4105 kernel: flags: 0x17ffffc0000004(uptodate)
Oct 22 06:33:57 j4105 kernel: raw: 0017ffffc0000004 dead000000000100 dead000000000122 0000000000000000
Oct 22 06:33:57 j4105 kernel: raw: 0000000000000002 0000000000000000 00000000ffffffff 0000000000000000
Oct 22 06:33:57 j4105 kernel: page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag set
Oct 22 06:33:57 j4105 kernel: bad because of flags: 0x4(uptodate)
Oct 22 06:33:57 j4105 kernel: Modules linked in: ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs cpuid xt_nat xt_tcpudp veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bpfilter br_netfilter bridge stp llc aufs overlay nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua mei_hdcp intel_telemetry_pltdrv intel_punit_ipc intel_rapl_msr intel_telemetry_core intel_pmc_ipc x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm rapl intel_cstate mei_me processor_thermal_device intel_rapl_common mei intel_soc_dts_iosf int3400_thermal mac_hid acpi_thermal_rel int3403_thermal int3406_thermal dptf_power int340x_thermal_zone sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear i915 i2c_algo_bit drm_kms_helper crct10dif_pclmul crc32_pclmul
Oct 22 06:33:57 j4105 kernel:  ghash_clmulni_intel syscopyarea sysfillrect aesni_intel sysimgblt fb_sys_fops crypto_simd i2c_i801 ahci cryptd r8169 drm glue_helper realtek libahci video pinctrl_geminilake pinctrl_intel
Oct 22 06:33:57 j4105 kernel: CPU: 0 PID: 312 Comm: jbd2/sda2-8 Tainted: G    B             5.4.0-48-generic #52-Ubuntu
Oct 22 06:33:57 j4105 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./J4105-ITX, BIOS P1.40 08/06/2018
Oct 22 06:33:57 j4105 kernel: Call Trace:
Oct 22 06:33:57 j4105 kernel:  dump_stack+0x6d/0x9a
Oct 22 06:33:57 j4105 kernel:  bad_page.cold+0x80/0xb1
Oct 22 06:33:57 j4105 kernel:  check_new_page_bad+0x67/0x80
Oct 22 06:33:57 j4105 kernel:  rmqueue+0xa9e/0xf00
Oct 22 06:33:57 j4105 kernel:  get_page_from_freelist+0x1bb/0x390
Oct 22 06:33:57 j4105 kernel:  __alloc_pages_nodemask+0x173/0x320
Oct 22 06:33:57 j4105 kernel:  alloc_pages_current+0x87/0xe0
Oct 22 06:33:57 j4105 kernel:  alloc_slab_page+0x17b/0x300
Oct 22 06:33:57 j4105 kernel:  allocate_slab+0x7d/0x4b0
Oct 22 06:33:57 j4105 kernel:  ? percpu_counter_add_batch+0x50/0x70
Oct 22 06:33:57 j4105 kernel:  new_slab+0x4a/0x70
Oct 22 06:33:57 j4105 kernel:  ___slab_alloc+0x32c/0x590
Oct 22 06:33:57 j4105 kernel:  ? alloc_buffer_head+0x1f/0x60
Oct 22 06:33:57 j4105 kernel:  ? _ext4_get_block+0xe4/0x110
Oct 22 06:33:57 j4105 kernel:  __slab_alloc+0x20/0x40
Oct 22 06:33:57 j4105 kernel:  kmem_cache_alloc+0x20d/0x230
Oct 22 06:33:57 j4105 kernel:  ? alloc_buffer_head+0x1f/0x60
Oct 22 06:33:57 j4105 kernel:  alloc_buffer_head+0x1f/0x60
Oct 22 06:33:57 j4105 kernel:  jbd2_journal_write_metadata_buffer+0x47/0x3b0
Oct 22 06:33:57 j4105 kernel:  ? bmap+0x1f/0x30
Oct 22 06:33:57 j4105 kernel:  ? jbd2_journal_bmap+0x28/0x50
Oct 22 06:33:57 j4105 kernel:  jbd2_journal_commit_transaction+0x6bd/0x17e8
Oct 22 06:33:57 j4105 kernel:  ? __switch_to_asm+0x40/0x70
Oct 22 06:33:57 j4105 kernel:  kjournald2+0xb6/0x280
Oct 22 06:33:57 j4105 kernel:  ? wait_woken+0x80/0x80
Oct 22 06:33:57 j4105 kernel:  kthread+0x104/0x140
Oct 22 06:33:57 j4105 kernel:  ? commit_timeout+0x20/0x20
Oct 22 06:33:57 j4105 kernel:  ? kthread_park+0x90/0x90
Oct 22 06:33:57 j4105 kernel:  ret_from_fork+0x1f/0x40
Oct 22 06:33:57 j4105 kernel: BUG: Bad page state in process jbd2/sda2-8  pfn:14031c
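For anyone who wants to pull just the bad-page reports out of a saved excerpt like the one above, here is a minimal Python sketch. It assumes the kernel log has been saved as plain text in the same format as shown here. The function name `summarize_bad_pages` and the sample list are illustrative only and are not part of my setup.

```python
import re

# Header line of each report, e.g.
# "kernel: BUG: Bad page state in process jbd2/sda2-8  pfn:14030e"
BUG_RE = re.compile(r"BUG: Bad page state in process (\S+)\s+pfn:([0-9a-f]+)")
# Line that states why the kernel dumped the page.
REASON_RE = re.compile(r"page dumped because: (.+)$")


def summarize_bad_pages(lines):
    """Return one dict (process, pfn, reason) per bad-page report.

    Hypothetical helper: it only looks at the two line types above and
    ignores the module list and call trace.
    """
    reports = []
    for line in lines:
        line = line.rstrip("\n")
        bug = BUG_RE.search(line)
        if bug:
            reports.append(
                {"process": bug.group(1), "pfn": bug.group(2), "reason": None}
            )
            continue
        reason = REASON_RE.search(line)
        # Attach the reason to the most recent report that lacks one.
        if reason and reports and reports[-1]["reason"] is None:
            reports[-1]["reason"] = reason.group(1)
    return reports


if __name__ == "__main__":
    # Sample lines copied from the log above.
    sample = [
        "Oct 22 06:33:57 j4105 kernel: BUG: Bad page state in process jbd2/sda2-8  pfn:14030e",
        "Oct 22 06:33:57 j4105 kernel: page dumped because: page still charged to cgroup",
        "Oct 22 06:33:57 j4105 kernel: BUG: Bad page state in process jbd2/sda2-8  pfn:14031a",
        "Oct 22 06:33:57 j4105 kernel: page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag set",
    ]
    for report in summarize_bad_pages(sample):
        print(report["process"], report["pfn"], report["reason"])
```

A report whose reason line is missing from the excerpt, like the last one above (pfn:14031c), is kept with a reason of `None`.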

It seems to have crashed right after the daily "apt upgrade" cronjob started (which I have now disabled), and then it hung.

Some answers on the internet suggested testing the RAM, so I did. Four hours of memtest86 showed 0 errors.

Does anyone know what happened there and how I can fix it?
