RHEL unexpected reboot + messages file

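Our RHEL 7.5 machine reboots unexpectedly.

In the /var/log/messages file, we can see the following lines just before the reboot.

Any idea how these lines could indicate a machine reboot?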

May  8 21:46:01 server_mng kernel: system 00:00: [io  0x1000-0x103f] could not be reserved
May  8 21:46:01 server_mng kernel: system 00:00: [io  0x1040-0x104f] has been reserved
May  8 21:46:01 server_mng kernel: system 00:00: [io  0x0cf0-0x0cf1] has been reserved
May  8 21:46:01 server_mng kernel: system 00:04: [mem 0xfed00000-0xfed003ff] has been reserved
May  8 21:46:01 server_mng kernel: system 00:05: [io  0xfce0-0xfcff] has been reserved
May  8 21:46:01 server_mng kernel: system 00:05: [mem 0xf0000000-0xf7ffffff] has been reserved
May  8 21:46:01 server_mng kernel: system 00:05: [mem 0xfe800000-0xfe9fffff] has been reserved
May  8 21:46:01 server_mng kernel: pnp: PnP ACPI: found 6 devices
May  8 21:46:01 server_mng kernel: ACPI: bus type PNP unregistered
May  8 21:46:01 server_mng kernel: pci 0000:00:15.0: BAR 15: assigned [mem 0xc0000000-0xc01fffff 64bit pref]
May  8 21:46:01 server_mng kernel: pci 0000:00:16.0: BAR 15: assigned [mem 0xc0200000-0xc03fffff 64bit pref]
May  8 21:46:01 server_mng kernel: pci 0000:00:0f.0: BAR 6: assigned [mem 0xc0400000-0xc0407fff pref]
May  8 21:46:01 server_mng kernel: pci 0000:00:15.3: BAR 13: no space for [io  size 0x1000]
May  8 21:46:01 server_mng kernel: pci 0000:00:15.3: BAR 13: failed to assign [io  size 0x1000]
May  8 21:46:01 server_mng kernel: pci 0000:00:15.4: BAR 13: no space for [io  size 0x1000]
May  8 21:46:01 server_mng kernel: pci 0000:00:15.4: BAR 13: failed to assign [io  size 0x1000]
May  8 21:46:01 server_mng kernel: pci 0000:00:01.0: PCI bridge to [bus 01]
May  8 21:46:01 server_mng kernel: pci 0000:02:01.0: BAR 6: assigned [mem 0xfd500000-0xfd50ffff pref]
May  8 21:46:01 server_mng kernel: pci 0000:00:11.0: PCI bridge to [bus 02]
May  8 21:46:01 server_mng kernel: pci 0000:00:11.0:   bridge window [io  0x2000-0x3fff]
May  8 21:46:01 server_mng kernel: pci 0000:00:11.0:   bridge window [mem 0xfd500000-0xfdffffff]
May  8 21:46:01 server_mng kernel: pci 0000:00:11.0:   bridge window [mem 0xe7b00000-0xe7ffffff 64bit pref]
May  8 21:46:01 server_mng kernel: pci 0000:03:00.0: BAR 6: assigned [mem 0xfd400000-0xfd40ffff pref]
May  8 21:46:01 server_mng kernel: pci 0000:00:15.0: PCI bridge to [bus 03]
May  8 21:46:01 server_mng kernel: pci 0000:00:15.0:   bridge window [io  0x4000-0x4fff]
May  8 21:46:01 server_mng kernel: pci 0000:00:15.0:   bridge window [mem 0xfd400000-0xfd4fffff]
May  8 21:46:01 server_mng kernel: pci 0000:00:15.0:   bridge window [mem 0xc0000000-0xc01fffff 64bit pref]
May  8 21:46:01 server_mng kernel: pci 0000:00:15.1: PCI bridge to [bus 04]
May  8 21:46:01 server_mng kernel: pci 0000:00:15.1:   bridge window [io  0x8000-0x8fff]
May  8 21:46:01 server_mng kernel: pci 0000:00:15.1:   bridge window [mem 0xfd000000-0xfd0fffff]
May  8 21:46:01 server_mng kernel: pci 0000:00:15.1:   bridge window [mem 0xe7800000-0xe78fffff 64bit pref]
May  8 21:46:01 server_mng kernel: pci 0000:00:15.2: PCI bridge to [bus 05]

Answer 1

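These messages are the result of the system scanning the hardware configuration and assigning system resources to the various devices. Normally you see them early in the boot sequence, essentially right after the bootloader has loaded the kernel and started it.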

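If the system assigns resources incorrectly, that can cause an immediate system crash. In that case, the last messages logged or displayed before the crash may help kernel developers identify which resource was assigned incorrectly and what kind of misassignment it was (overlapping assignments? an attempt to assign a configuration that makes no sense? something else?). If you have selected a more verbose boot process (in RHEL, typically by removing the boot options rhgb and quiet), all of these messages are displayed as boot messages.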

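If the system has hot-pluggable PCI/PCI-X/PCIe/Thunderbolt devices, you might see a small group of similar messages at hot-plug time. However, the presence of both PnP ACPI resource assignments and PCI resource assignments, together with the fact that there are messages for so many different PCI devices, supports the conclusion that these messages come from the boot process. A PCI hot-plug event would typically produce a group of messages with a much more limited set of PCI device IDs.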

This output looks like a scan of essentially all the (virtual) devices the (virtual) machine has, and that normally happens only at boot time.
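To make that reasoning a bit more concrete, here is a minimal Python sketch that collects the distinct PCI addresses mentioned in a log excerpt. The function name distinct_pci_devices and the regular expression are my own illustrative assumptions, not part of any RHEL tool. A long list of addresses next to PnP ACPI lines hints at a boot-time scan; a hot-plug event would be expected to mention only a few.

```python
import re

# Assumed line shape, taken from the excerpt in the question:
# "... kernel: pci 0000:00:15.3: BAR 13: no space for [io  size 0x1000]"
# The captured group is the domain:bus:device.function address.
PCI_ADDR = re.compile(
    r"kernel: pci ([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]):"
)


def distinct_pci_devices(lines):
    """Return the sorted list of distinct PCI addresses found in the lines."""
    found = set()
    for line in lines:
        match = PCI_ADDR.search(line)
        if match:
            found.add(match.group(1))
    return sorted(found)


if __name__ == "__main__":
    sample = [
        "May  8 21:46:01 server_mng kernel: pnp: PnP ACPI: found 6 devices",
        "May  8 21:46:01 server_mng kernel: pci 0000:00:15.3: BAR 13: no space for [io  size 0x1000]",
        "May  8 21:46:01 server_mng kernel: pci 0000:02:01.0: BAR 6: assigned [mem 0xfd500000-0xfd50ffff pref]",
    ]
    print(distinct_pci_devices(sample))
```

The count is only a heuristic: it tells you how broad the scan was, not what caused the reboot.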

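When troubleshooting an unexpected system crash, the messages logged just before the reboot (if there are any) are usually the most useful for finding the cause of the crash.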
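One way to pull out those pre-reboot messages is sketched below in Python. It assumes that each boot shows up in the log as a line containing "kernel: Linux version"; that marker, the function name lines_before_boots, and the default context size are assumptions for illustration, so adjust them to whatever your log actually contains. A few early boot lines may be logged before the marker and will then appear at the end of the returned context.

```python
import sys
from collections import deque

# Assumption: the kernel banner line marks the start of a boot in the log.
BOOT_MARKER = "kernel: Linux version"


def lines_before_boots(lines, context=20, marker=BOOT_MARKER):
    """For each boot marker, return the up-to-`context` lines preceding it."""
    recent = deque(maxlen=context)  # sliding window of the latest lines
    result = []
    for line in lines:
        if marker in line:
            result.append(list(recent))
            recent.clear()
        recent.append(line.rstrip("\n"))
    return result


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python script.py /var/log/messages
    with open(sys.argv[1], errors="replace") as log:
        for number, block in enumerate(lines_before_boots(log), start=1):
            print(f"--- before boot #{number} ---")
            print("\n".join(block))
```

If the block printed before a boot holds nothing but routine messages, that absence is itself a clue about where to look next.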

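If no unusual messages were logged before the reboot, it may mean that the problem was detected at the virtualization host level and the host killed and restarted the VM, a VM-level equivalent of a kill -9 of sorts. Or it may mean that the problem affected the storage drivers, so the kernel was unable to write its error messages to the log.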
