Several crashes have been recorded in /var/crash; this is the most recent one:-
ProblemType: KernelOops
Annotation: Your system might become unstable now and might need to be restarted.
Date: Fri Jul 23 18:10:54 2021
Failure: oops
OopsText:
BUG: Bad rss-counter state mm:00000000c098a229 idx:2 val:-1
usblp0: removed
usblp 1-5:1.0: usblp0: USB Bidirectional printer dev 3 if 0 alt 0 proto 2 vid 0x04F9 pid 0x02EC
<44>[ 18.329026] systemd-journald[358]: File /var/log/journal/b022dca21fd4480baeeb84f47ab439d3/user-1000.journal corrupted or uncleanly shut down, renaming and replacing.
vboxdrv: loading out-of-tree module taints kernel.
vboxdrv: module verification failed: signature and/or required key missing - tainting kernel
vboxdrv: Found 8 processor cores
vboxdrv: TSC mode is Invariant, tentative frequency 2303999142 Hz
vboxdrv: Successfully loaded version 6.1.24 r145767 (interface 0x00300000)
VBoxNetFlt: Successfully started.
VBoxNetAdp: Successfully started.
Bluetooth: RFCOMM TTY layer initialized
Bluetooth: RFCOMM socket layer initialized
Bluetooth: RFCOMM ver 1.11
rfkill: input handler disabled
[UFW BLOCK] IN=enp3s0f1 OUT= MAC=01:00:5e:00:00:01:80:20:da:95:bc:56:08:00 SRC=192.168.1.254 DST=224.0.0.1 LEN=36 TOS=0x00 PREC=0x00 TTL=1 ID=0 DF PROTO=2
[UFW BLOCK] IN=wlp2s0 OUT= MAC=01:00:5e:00:00:01:80:20:da:95:bc:56:08:00 SRC=192.168.1.254 DST=224.0.0.1 LEN=36 TOS=0x00 PREC=0x00 TTL=1 ID=0 DF PROTO=2
[UFW BLOCK] IN=enp3s0f1 OUT= MAC=01:00:5e:00:00:01:80:20:da:95:bc:56:08:00 SRC=192.168.1.254 DST=224.0.0.1 LEN=36 TOS=0x00 PREC=0x00 TTL=1 ID=0 DF PROTO=2
[UFW BLOCK] IN=wlp2s0 OUT= MAC=01:00:5e:00:00:01:80:20:da:95:bc:56:08:00 SRC=192.168.1.254 DST=224.0.0.1 LEN=36 TOS=0x00 PREC=0x00 TTL=1 ID=0 DF PROTO=2
[UFW BLOCK] IN=enp3s0f1 OUT= MAC=01:00:5e:00:00:01:80:20:da:95:bc:56:08:00 SRC=192.168.1.254 DST=224.0.0.1 LEN=36 TOS=0x00 PREC=0x00 TTL=1 ID=0 DF PROTO=2
[UFW BLOCK] IN=wlp2s0 OUT= MAC=01:00:5e:00:00:01:80:20:da:95:bc:56:08:00 SRC=192.168.1.254 DST=224.0.0.1 LEN=36 TOS=0x00 PREC=0x00 TTL=1 ID=0 DF PROTO=2
Package: linux-image-4.15.0-151-generic 4.15.0-151.157
SourcePackage: linux
Tags: kernel-oops
Uname: Linux 4.15.0-151-generic x86_64
---------------------------------------------------------------------------------------
The system is a laptop from Entroware based on Clevo and has 8 logical CPUs:-
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 158
Model name: Intel(R) Core(TM) i5-8300H CPU @ 2.30GHz
Stepping: 10
CPU MHz: 2000.295
CPU max MHz: 4000.0000
CPU min MHz: 800.0000
BogoMIPS: 4599.93
Virtualisation: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 8192K
NUMA node0 CPU(s): 0-7
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d
USB Config:-
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 004: ID 5986:2110 Acer, Inc
Bus 001 Device 003: ID 04f9:02ec Brother Industries, Ltd MFC-J870DW
Bus 001 Device 005: ID 8087:07dc Intel Corp. Bluetooth wireless interface
Bus 001 Device 002: ID 0d8c:0104 C-Media Electronics, Inc. CM103+ Audio Controller
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
PCI Config:-
00:00.0 Host bridge: Intel Corporation Device 3e10 (rev 07)
00:02.0 VGA compatible controller: Intel Corporation Device 3e9b
00:08.0 System peripheral: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th Gen Core Processor Gaussian Mixture Model
00:12.0 Signal processing controller: Intel Corporation Cannon Lake PCH Thermal Controller (rev 10)
00:14.0 USB controller: Intel Corporation Cannon Lake PCH USB 3.1 xHCI Host Controller (rev 10)
00:14.2 RAM memory: Intel Corporation Cannon Lake PCH Shared SRAM (rev 10)
00:16.0 Communication controller: Intel Corporation Cannon Lake PCH HECI Controller (rev 10)
00:17.0 SATA controller: Intel Corporation Device a353 (rev 10)
00:1d.0 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port 9 (rev f0)
00:1d.5 PCI bridge: Intel Corporation Device a335 (rev f0)
00:1d.6 PCI bridge: Intel Corporation Device a336 (rev f0)
00:1f.0 ISA bridge: Intel Corporation Device a30d (rev 10)
00:1f.3 Audio device: Intel Corporation Cannon Lake PCH cAVS (rev 10)
00:1f.4 SMBus: Intel Corporation Cannon Lake PCH SMBus Controller (rev 10)
00:1f.5 Serial bus controller [0c80]: Intel Corporation Cannon Lake PCH SPI Controller (rev 10)
01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981
02:00.0 Network controller: Intel Corporation Wireless 3160 (rev 93)
03:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTL8411B PCI Express Card Reader (rev 01)
03:00.1 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 12)
This only started happening after moving to 4.15.0-151. Reverting to 4.15.0-147 makes the system stable again.
Answer 1
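I can confirm this as well; I am seeing the same thing. This post made me happy, because it means I am not alone. I was almost certain my motherboard was faulty, since SMART and memtest reported no errors.
In addition to the behaviour described above, suspend/resume and even reboot frequently hang.
Since this is a security update (https://ubuntu.com/security/notices/USN-5018-1), it seems speed mattered more than stability testing...
I downgraded the kernel by following the instructions here: How to downgrade the kernel after a bad update (16.04)
(Remember to re-enable kernel updates once a fixed kernel is available; that matters for security.)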
Answer 2
The issue is being tracked here: https://bugs.launchpad.net/bugs/1938013.
Answer 3
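I have exactly the same symptoms. Since Friday, after the update, I have had occasional freezes that only a full power-off and power-on would recover from. My swap partition was corrupted and I had to reformat it, and I had to run repairs on an NVMe root partition and on two external drives. Then I reverted the kernel to 4.15.0-147 and, miraculously, everything works perfectly again. I wasted a weekend trying to understand what had gone wrong before I found out it was the kernel. I also found a large number of kernel crash reports in /var/crash.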
miles@unicron:/var/crash$ ls -latr linux*
-rw-r--r-- 1 kernoops whoopsie 763 Jul 21 13:57 linux-image-4.15.0-151-generic.32331.crash
-rw-r--r-- 1 kernoops whoopsie 763 Jul 21 14:33 linux-image-4.15.0-151-generic.32439.crash
-rw-r--r-- 1 kernoops whoopsie 988 Jul 21 14:34 linux-image-4.15.0-151-generic.53592.crash
-rw-r--r-- 1 kernoops whoopsie 3464 Jul 21 14:52 linux-image-4.15.0-151-generic.271760.crash
-rw-r--r-- 1 kernoops whoopsie 3677 Jul 22 03:52 linux-image-4.15.0-151-generic.258222.crash
-rw-r--r-- 1 kernoops whoopsie 736 Jul 22 19:17 linux-image-4.15.0-151-generic.32747.crash
-rw-r--r-- 1 kernoops whoopsie 742 Jul 22 19:17 linux-image-4.15.0-151-generic.32859.crash
-rw-r--r-- 1 kernoops whoopsie 455 Jul 23 03:04 linux-image-4.15.0-151-generic.13044.crash
-rw-r--r-- 1 kernoops whoopsie 530 Jul 23 13:06 linux-image-4.15.0-151-generic.20048.crash
-rw-r--r-- 1 kernoops whoopsie 673 Jul 23 13:14 linux-image-4.15.0-151-generic.30505.crash
-rw-r--r-- 1 kernoops whoopsie 1893 Jul 23 22:57 linux-image-4.15.0-151-generic.123785.crash
-rw-r--r-- 1 kernoops whoopsie 4163 Jul 23 23:28 linux-image-4.15.0-151-generic.305134.crash
-rw-r--r-- 1 kernoops whoopsie 1013 Jul 24 06:57 linux-image-4.15.0-151-generic.48875.crash
-rw-r--r-- 1 kernoops whoopsie 1209 Jul 24 07:01 linux-image-4.15.0-151-generic.65884.crash
-rw-r--r-- 1 kernoops whoopsie 2516 Jul 24 07:02 linux-image-4.15.0-151-generic.165751.crash
-rw-r--r-- 1 kernoops whoopsie 2678 Jul 24 07:07 linux-image-4.15.0-151-generic.178891.crash
-rw-r--r-- 1 kernoops whoopsie 3500 Jul 25 11:32 linux-image-4.15.0-151-generic.253271.crash
Sample from linux-image-4.15.0-151-generic.253271.crash:
ProblemType: KernelOops
Annotation: Your system might become unstable now and might need to be restarted.
Date: Sun Jul 25 11:32:27 2021
Failure: oops
OopsText:
general protection fault: 0000 [#1] SMP PTI
Modules linked in: xfs libcrc32c uas usb_storage rfcomm ccm ip6table_filter ip6_tables iptable_filter v4l2loopback(OE) snd_hrtimer cmac bnep binfmt_misc nls_iso8859_1 snd_hda_codec_hdmi nvidia_drm(POE) intel_rapl x86_pkg_temp_thermal nvidia_modeset(POE) intel_powerclamp coretemp arc4 kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul snd_hda_codec_realtek ghash_clmulni_intel snd_hda_codec_generic nvidia(POE) pcbc iwlmvm mac80211 snd_hda_intel aesni_intel snd_hda_codec aes_x86_64 crypto_simd glue_helper asus_nb_wmi cryptd asus_wmi snd_hda_core intel_cstate snd_hwdep intel_rapl_perf serio_raw sparse_keymap intel_wmi_thunderbolt iwlwifi snd_pcm snd_seq_midi snd_seq_midi_event cfg80211 uvcvideo btusb btrtl videobuf2_vmalloc btbcm snd_rawmidi videobuf2_memops btintel videobuf2_v4l2 drm_kms_helper
bluetooth snd_seq xpad videobuf2_core ff_memless ecdh_generic drm videodev snd_seq_device snd_timer media fb_sys_fops snd syscopyarea sysfillrect sysimgblt mei_me idma64 soundcore virt_dma input_leds joydev mei processor_thermal_device intel_lpss_pci int340x_thermal_zone shpchp intel_pch_thermal intel_lpss intel_soc_dts_iosf elan_i2c mac_hid asus_wireless int3400_thermal acpi_pad acpi_thermal_rel sch_fq_codel ppa parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_asus hid_generic usbhid nvme r8169 ahci nvme_core mii libahci wmi i2c_hid hid video pinctrl_sunrisepoint
CPU: 4 PID: 81 Comm: kswapd0 Tainted: P OE 4.15.0-151-generic #157-Ubuntu
Hardware name: ASUSTeK COMPUTER INC. G752VT/G752VT, BIOS G752VT.213 01/06/2016
RIP: 0010:find_get_entries+0x68/0x200
RSP: 0018:ffffb54cc384f9d0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 000000000000000e RCX: 0000000000000006
RDX: 1800000000000000 RSI: 0000000000001000 RDI: ffff9730446816d0
RBP: ffffb54cc384fa30 R08: 0000000000000800 R09: 0000000000000006
R10: ffff9730446817f8 R11: 0000000000000000 R12: ffffb54cc384faf8
R13: ffffb54cc384fa78 R14: 000000000000000c R15: ffff9730446817f8
FS: 0000000000000000(0000) GS:ffff973606500000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000a520680c000 CR3: 00000005c260a005 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
pagevec_lookup_entries+0x1e/0x30
truncate_inode_pages_range+0x127/0x960
? xfs_mount_validate_sb+0x440/0x500 [xfs]
? __inode_wait_for_writeback+0x7e/0xf0
? bit_waitqueue+0x40/0x40
truncate_inode_pages_final+0x4c/0x60
evict+0x188/0x1a0
dispose_list+0x39/0x50
prune_icache_sb+0x5a/0x80
super_cache_scan+0x137/0x1b0
shrink_slab.part.49+0x1e7/0x440
shrink_node+0x2e1/0x2f0
kswapd+0x2b1/0x710
kthread+0x121/0x140
? mem_cgroup_shrink_node+0x190/0x190
? kthread_create_worker_on_cpu+0x70/0x70
ret_from_fork+0x35/0x40
Code: c7 45 a8 00 00 00 00 48 89 75 b0 45 31 ff 4d 85 ff 0f 84 01 01 00 00 49 8b 17 48 85 d2 74 69 48 89 d0 83 e0 03 0f 85 5f 01 00 00 <48> 8b 42 20 48 8d 78 ff a8 01 48 0f 44 fa 8b 47 1c 85 c0 74 d7
RIP: find_get_entries+0x68/0x200 RSP: ffffb54cc384f9d0
---[ end trace aafa3a2a2c51a63e ]---
Package: linux-image-4.15.0-151-generic 4.15.0-151.157
SourcePackage: linux
Tags: kernel-oops
Uname: Linux 4.15.0-151-generic x86_64
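With this many reports in /var/crash, it can help to pull out just the date and the first line of each oops. The following is a minimal Python sketch, not an official tool. It assumes the on-disk files use the apport layout of `Key: value` lines, with multi-line values such as `OopsText` continued on lines indented by whitespace (the indentation is lost in the excerpts pasted here). The function names `parse_crash_report` and `summarize` are made up for this illustration.

```python
from typing import Dict


def parse_crash_report(text: str) -> Dict[str, str]:
    """Parse 'Key: value' fields from an apport-style crash report.

    Assumption: a line starting with whitespace continues the previous
    field; any other line containing a colon starts a new field.
    """
    fields: Dict[str, str] = {}
    key = None
    for line in text.splitlines():
        if line[:1] in (" ", "\t") and key is not None:
            # Continuation line: append it to the current field.
            fields[key] = (fields[key] + "\n" + line[1:]).lstrip("\n")
        elif ": " in line or line.endswith(":"):
            # New field: split on the first colon only.
            key, _, value = line.partition(":")
            fields[key] = value.strip()
        # Anything else (e.g. blank lines) is ignored.
    return fields


def summarize(fields: Dict[str, str]) -> str:
    """One-line summary: date, package, first line of the oops text."""
    oops = fields.get("OopsText", "")
    first = oops.splitlines()[0] if oops else ""
    return f"{fields.get('Date', '?')} | {fields.get('Package', '?')} | {first}"


if __name__ == "__main__":
    # Shortened sample built from the report quoted above.
    sample = (
        "ProblemType: KernelOops\n"
        "Date: Sun Jul 25 11:32:27 2021\n"
        "OopsText:\n"
        " general protection fault: 0000 [#1] SMP PTI\n"
        " RIP: 0010:find_get_entries+0x68/0x200\n"
        "Package: linux-image-4.15.0-151-generic 4.15.0-151.157\n"
    )
    print(summarize(parse_crash_report(sample)))
```

To run it over real files, read each `/var/crash/linux-image-*.crash` file as text and pass the contents to `parse_crash_report`.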
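Kubuntu has run stably on this system for years; memory tests pass, the drives show no SMART errors, and so on. The only thing I can narrow it down to is the new kernel update.
For now, the only workaround I have come up with is to set GRUB to boot the old kernel by default, while keeping the new kernel installed in case I want to experiment with it further. To do this I used the solution shown here: https://unix.stackexchange.com/questions/198003/set-default-kernel-in-grub
Answer 4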
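As a rough illustration of that GRUB approach, the sketch below builds the `GRUB_DEFAULT` line for /etc/default/grub, using GRUB's `submenu>entry` syntax for entries inside a submenu. The menu titles ("Advanced options for Ubuntu" and "Ubuntu, with Linux ...") are assumed Ubuntu defaults and may differ on a given system, so check them against /boot/grub/grub.cfg before relying on the result. The helper names are hypothetical.

```python
def grub_default_for(release: str,
                     submenu: str = "Advanced options for Ubuntu",
                     distro: str = "Ubuntu") -> str:
    """Build a 'submenu>entry' title for a kernel release.

    Assumption: the titles follow the stock Ubuntu naming; verify the
    exact strings in /boot/grub/grub.cfg.
    """
    return f"{submenu}>{distro}, with Linux {release}"


def grub_default_line(release: str) -> str:
    """Return the full line to place in /etc/default/grub."""
    return f'GRUB_DEFAULT="{grub_default_for(release)}"'


if __name__ == "__main__":
    # 4.15.0-147 is the kernel reported as stable in this thread.
    print(grub_default_line("4.15.0-147-generic"))
```

After putting the resulting line into /etc/default/grub, the GRUB configuration has to be regenerated (on Ubuntu, with `sudo update-grub`) before the change takes effect.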
Courtesy of Stefan Bader, and confirmed by me (the original reporter):-
This bug was fixed in the package linux - 4.15.0-153.160
linux (4.15.0-153.160) bionic; urgency=medium
bionic/linux: 4.15.0-153.160 -proposed tracker (LP: #1938319)
4.15.0-151 is freezing various CPUs (LP: #1938013)
- mac80211: fix memory corruption in EAPOL handling
-- Stefan Bader Thu, 29 Jul 2021 08:26:59 +0200
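To check where a given machine stands relative to the versions named in this thread, a small Python sketch along these lines could be used. It only encodes what is reported here: 4.15.0-151 is affected and 4.15.0-153 carries the fix. Treating every later 4.15.0 build as fixed is an assumption, and the function names and status labels are made up for this example.

```python
import platform
import re

AFFECTED_ABI = 151  # 4.15.0-151, reported as unstable in this thread
FIXED_ABI = 153     # 4.15.0-153.160, per the changelog above


def parse_release(release: str):
    """Split e.g. '4.15.0-151-generic' into (4, 15, 0, 151)."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def status(release: str) -> str:
    """Classify a kernel release against the versions in this thread."""
    major, minor, patch, abi = parse_release(release)
    if (major, minor, patch) != (4, 15, 0):
        return "other-series"   # this thread says nothing about it
    if abi >= FIXED_ABI:
        return "fixed"          # assumption: later builds keep the fix
    if abi == AFFECTED_ABI:
        return "affected"
    return "unlisted"           # not reported as affected here


if __name__ == "__main__":
    running = platform.release()  # same string as `uname -r`
    print(running, "->", status(running))
```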