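Over the past few weeks I have had a lot of trouble with my homelab server. I originally installed Ubuntu 20.04 LTS and then foolishly decided to dist-upgrade to 22.04 LTS, which is where some of my problems started.
I had gotten pretty good at fixing all the resulting issues, but then the motherboard died. The machine did nothing at all. I swapped the board for a working one, which did get the machine booting again, but:
Since then, the server has been freezing randomly and often.
It is not a temperature problem: I kept printing the CPU temperature to a terminal until the machine froze, and the last reading was 37°C. So the hardware is most likely not the culprit.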
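Readings printed to a terminal can be lost when the machine freezes, so appending them to a file and flushing to disk makes the last value easier to recover. A minimal sketch of that kind of logging, assuming the CPU sensor is exposed at /sys/class/thermal/thermal_zone0/temp (the zone number differs between machines) and using a hypothetical log path:

```shell
#!/bin/sh
# Convert a sysfs reading in millidegrees Celsius (e.g. 37000)
# to whole degrees Celsius (integer division, so it truncates).
millideg_to_c() {
    echo "$(( $1 / 1000 ))"
}

# Append one timestamped reading every 5 seconds and sync, so the last
# value written before a freeze is still on disk after the reboot.
# Both defaults are assumptions: adjust the zone file and log path.
log_temperature() {
    zone_file=${1:-/sys/class/thermal/thermal_zone0/temp}
    log_file=${2:-/var/tmp/cpu-temp.log}
    while :; do
        reading=$(cat "$zone_file")
        printf '%s %s C\n' "$(date '+%Y-%m-%d %H:%M:%S')" \
            "$(millideg_to_c "$reading")" >> "$log_file"
        sync
        sleep 5
    done
}
```

Calling `log_temperature` with no arguments runs until interrupted; after a freeze, the last line of the log file shows the final temperature that made it to disk.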
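A Pop!_OS live USB ran continuously for two days until I shut it down, so again: probably not a hardware problem, but an Ubuntu problem. I therefore installed linux-crashdump and waited for the next freeze.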
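A crash dump can only be captured if the running kernel was booted with a crashkernel= memory reservation. A small sketch of how that could be checked before waiting for the next freeze; the function name is hypothetical:

```shell
#!/bin/sh
# Succeed if the given kernel command line contains a crashkernel=
# parameter, which kdump needs in order to capture a dump.
has_crashkernel() {
    case " $1 " in
        *" crashkernel="*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report on the currently running kernel.
check_running_kernel() {
    if has_crashkernel "$(cat /proc/cmdline)"; then
        echo "crashkernel reservation present"
    else
        echo "no crashkernel= parameter on the kernel command line"
    fi
}
```

Running `check_running_kernel` after the reboot that follows installing linux-crashdump shows whether the reservation took effect; `kdump-config show` from kdump-tools reports the overall kdump state.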
Reading out the log with:
sudo cat /var/crash/202401142245/dmesg.202401142245
gave me a huge dump, which I saved on Pastebin.com.
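Because the dump is so large, it may help to first pull out only the lines that usually accompany a crash or hang. A rough sketch, assuming a hand-picked pattern list that is not an exhaustive set of kernel error markers:

```shell
#!/bin/sh
# Print, with line numbers, only the lines of a dmesg dump that commonly
# accompany a kernel crash or hang. Reads the named files, or stdin if
# none are given. The patterns are an assumption and can be extended.
filter_crash_lines() {
    grep -n -E 'BUG:|Oops|Kernel panic|Call Trace|RIP:|Hardware Error|soft lockup|blocked for more than' "$@"
}
```

For example, `sudo cat /var/crash/202401142245/dmesg.202401142245 | filter_crash_lines` lists the matching lines. If nothing matches, the log probably ended before the kernel could report anything, which is itself a hint about the kind of freeze.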
Unfortunately, what follows is the unformatted dump. It leaves me with two questions:
- Is my kernel broken beyond repair?
- If so, how do I fix my kernel?
timon@nur1kleinerserver:/var/crash/202401142245$ sudo cat dmesg.202401142245
[sudo] password for timon:
[ 0.000000] microcode: microcode updated early to revision 0xf4, date = 2023-02-22
[ 0.000000] Linux version 5.15.0-91-generic (buildd@lcy02-amd64-045) (gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #101-Ubuntu SMP Tue Nov 14 13:30:08 UTC 2023 (Ubuntu 5.15.0-91.101-generic 5.15.131)
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.15.0-91-generic root=UUID=e3292f62-8085-4a4e-afad-e81cb574b283 ro quiet splash crashkernel=512M-:192M vt.handoff=7
[ 0.000000] KERNEL supported cpus:
[ 0.000000] Intel GenuineIntel
[ 0.000000] AMD AuthenticAMD
[ 0.000000] Hygon HygonGenuine
[ 0.000000] Centaur CentaurHauls
[ 0.000000] zhaoxin Shanghai
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x0000000000057fff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000000058000-0x0000000000058fff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000059000-0x000000000009dfff] usable
[ 0.000000] BIOS-e820: [mem 0x000000000009e000-0x00000000000fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000003fffffff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000040000000-0x00000000403fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000040400000-0x0000000070b3afff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000070b3b000-0x0000000070b3bfff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x0000000070b3c000-0x0000000070b3cfff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000070b3d000-0x000000007a0cbfff] usable
[ 0.000000] BIOS-e820: [mem 0x000000007a0cc000-0x000000007a560fff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000007a561000-0x000000007a5a6fff] ACPI data
[ 0.000000] BIOS-e820: [mem 0x000000007a5a7000-0x000000007a9c0fff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x000000007a9c1000-0x000000007affdfff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000007affe000-0x000000007affefff] usable
[ 0.000000] BIOS-e820: [mem 0x000000007afff000-0x000000007fffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000e0000000-0x00000000efffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fe000000-0x00000000fe010fff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fed00000-0x00000000fed00fff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000047effffff] usable
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] e820: update [mem 0x6a8f8018-0x6a908e57] usable ==> usable
[ 0.000000] e820: update [mem 0x6a8f8018-0x6a908e57] usable ==> usable
[ 0.000000] extended physical RAM map:
[ 0.000000] reserve setup_data: [mem 0x0000000000000000-0x0000000000057fff] usable
[ 0.000000] reserve setup_data: [mem 0x0000000000058000-0x0000000000058fff] reserved
[ 0.000000] reserve setup_data: [mem 0x0000000000059000-0x000000000009dfff] usable
[ 0.000000] reserve setup_data: [mem 0x000000000009e000-0x00000000000fffff] reserved
[ 0.000000] reserve setup_data: [mem 0x0000000000100000-0x000000003fffffff] usable
[ 0.000000] reserve setup_data: [mem 0x0000000040000000-0x00000000403fffff] reserved
[ 0.000000] reserve setup_data: [mem 0x0000000040400000-0x000000006a8f8017] usable
[ 0.000000] reserve setup_data: [mem 0x000000006a8f8018-0x000000006a908e57] usable
[ 0.000000] reserve setup_data: [mem 0x000000006a908e58-0x0000000070b3afff] usable
[ 0.000000] reserve setup_data: [mem 0x0000000070b3b000-0x0000000070b3bfff] ACPI NVS
[ 0.000000] reserve setup_data: [mem 0x0000000070b3c000-0x0000000070b3cfff] reserved
[ 0.000000] reserve setup_data: [mem 0x0000000070b3d000-0x000000007a0cbfff] usable
[ 0.000000] reserve setup_data: [mem 0x000000007a0cc000-0x000000007a560fff] reserved
[ 0.000000] reserve setup_data: [mem 0x000000007a561000-0x000000007a5a6fff] ACPI data
[ 0.000000] reserve setup_data: [mem 0x000000007a5a7000-0x000000007a9c0fff] ACPI NVS
[ 0.000000] reserve setup_data: [mem 0x000000007a9c1000-0x000000007affdfff] reserved
[ 0.000000] reserve setup_data: [mem 0x000000007affe000-0x000000007affefff] usable
[ 0.000000] reserve setup_data: [mem 0x000000007afff000-0x000000007fffffff] reserved
[ 0.000000] reserve setup_data: [mem 0x00000000e0000000-0x00000000efffffff] reserved
[ 0.000000] reserve setup_data: [mem 0x00000000fe000000-0x00000000fe010fff] reserved
[ 0.000000] reserve setup_data: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
[ 0.000000] reserve setup_data: [mem 0x00000000fed00000-0x00000000fed00fff] reserved
[ 0.000000] reserve setup_data: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
[ 0.000000] reserve setup_data: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
[ 0.000000] reserve setup_data: [mem 0x0000000100000000-0x000000047effffff] usable
[ 0.000000] efi: EFI v2.70 by American Megatrends
[ 0.000000] efi: TPMFinalLog=0x7a990000 ACPI 2.0=0x7a56f000 ACPI=0x7a56f000 SMBIOS=0x7ae08000 SMBIOS 3.0=0x7ae07000 MEMATTR=0x78515418 ESRT=0x7ae04e98 MOKvar=0x7ae23000 RNG=0x7a56e018 TPMEventLog=0x70c3c018
[ 0.000000] random: crng init done
[ 0.000000] secureboot: Secure boot disabled
[ 0.000000] SMBIOS 3.1.1 present.
[ 0.000000] DMI: Intel 0/NUC7i5BNB, BIOS BNKBL357.86A.0088.2022.0125.1102 01/25/2022
[ 0.000000] tsc: Detected 2200.000 MHz processor
[ 0.000000] tsc: Detected 2199.996 MHz TSC
[ 0.000981] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[ 0.000985] e820: remove [mem 0x000a0000-0x000fffff] usable
[ 0.001000] last_pfn = 0x47f000 max_arch_pfn = 0x400000000
[ 0.001282] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT
[ 0.002573] last_pfn = 0x7afff max_arch_pfn = 0x400000000
[ 0.014062] esrt: Reserving ESRT space from 0x000000007ae04e98 to 0x000000007ae04ed0.
[ 0.014077] Using GB pages for direct mapping
[ 0.014848] secureboot: Secure boot disabled
[ 0.014849] RAMDISK: [mem 0x6a909000-0x6e7adfff]
[ 0.014858] ACPI: Early table checksum verification disabled
[ 0.014862] ACPI: RSDP 0x000000007A56F000 000024 (v02 INTEL )
The dump has been shortened for posting here. The full dump is on pastebin.com.
As suggested by @zwets, I used glxinfo to gather information about the board's display unit. It says:
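Vendor: Intel (0x8086)
Device: Mesa Intel(R) Iris(R) Plus Graphics 640 (Kaby Lake GT3e) (KBL GT3) (0x5926)
Version: 23.0.4
Accelerated: yes
Video memory: 15681MB
Unified memory: yes
Preferred profile: core (0x1)
Max core profile version: 4.6
Max compat profile version: 4.6
Max GLES1 profile version: 1.1
Max GLES[23] profile version: 3.2
OpenGL vendor string: Intel
OpenGL renderer string: Mesa Intel(R) Iris(R) Plus Graphics 640 (Kaby Lake GT3e) (KBL GT3)
OpenGL core profile version string: 4.6 (Core Profile) Mesa 23.0.4-0ubuntu1~22.04.1
OpenGL core profile shading language version string: 4.60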
So it may be a slightly different motherboard, but shouldn't the kernel be able to handle that?