Server shut down unexpectedly (BSOD) with the message "WHEA_UNCORRECTABLE_ERROR"


When we checked the System event log, we found that the following warning had been logged repeatedly.

Event 17
A corrected hardware error has occurred.
Component: PCI Express Root Port
Error Source: Advanced Error Reporting (PCI Express)
Bus:Device:Function: 0x0:0x2:0x0
Vendor ID:Device ID: 0x8086:0x6F04
Class Code: 0x30400

When the system shut down unexpectedly (BSOD), the following error was logged.

Event 16
A fatal hardware error has occurred.
Component: PCI Express Root Port
Error Source: Advanced Error Reporting (PCI Express)
Bus:Device:Function: 0x0:0x2:0x0
Vendor ID:Device ID: 0x8086:0x6F04
Class Code: 0x30400
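
As a quick cross-check that the warning (Event 17) and the fatal error (Event 16) point at the same PCI Express root port, the two event messages can be parsed and compared. This is only a minimal sketch: the helper names `parse_whea_event` and `device_identity` are hypothetical, and it assumes the event text has been copied out of Event Viewer in the "Key: value" layout shown above.

```python
def parse_whea_event(text):
    """Parse 'Key: value' lines from a WHEA event message into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon followed by a space, so keys such as
        # "Bus:Device:Function" (colons without spaces) stay intact.
        key, sep, value = line.partition(": ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def device_identity(fields):
    """Return (bus, device, function, vendor_id, device_id) as integers."""
    bus, dev, fn = (int(p, 16) for p in fields["Bus:Device:Function"].split(":"))
    vendor, device = (int(p, 16) for p in fields["Vendor ID:Device ID"].split(":"))
    return (bus, dev, fn, vendor, device)


# Event text copied from the log excerpts in this post.
EVENT_17 = """A corrected hardware error has occurred.
Component: PCI Express Root Port
Error Source: Advanced Error Reporting (PCI Express)
Bus:Device:Function: 0x0:0x2:0x0
Vendor ID:Device ID: 0x8086:0x6F04
Class Code: 0x30400"""

EVENT_16 = EVENT_17.replace("A corrected", "A fatal")

if __name__ == "__main__":
    corrected = device_identity(parse_whea_event(EVENT_17))
    fatal = device_identity(parse_whea_event(EVENT_16))
    print("same device:", corrected == fatal)
```

Both events resolve to bus 0, device 2, function 0 with vendor ID 0x8086, so the daily corrected errors and the single fatal error come from the same root port.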

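Although the warning (Event 17) has been logged every day since the server was built (March 27, 2021), the system has shut down unexpectedly with the above error (Event 16) only once (20-7-21).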

Crash dump analysis of the BSOD:

Crash dump file: D:\MEMORY.DMP
This was probably caused by the following module: pci.sys (pci+0x1364B)
Bug check code: 0x124 (0x4, 0xFFFFE000C7D1E038, 0x0, 0x0)
Error: WHEA_UNCORRECTABLE_ERROR
File path: C:\Windows\system32\drivers\pci.sys
Product: Microsoft® Windows® Operating System
Company: Microsoft Corporation
Description: NT Plug and Play PCI Enumerator
Bug check description: This bug check indicates that a fatal hardware error has occurred. This bug check uses the error data that is provided by the Windows Hardware Error Architecture (WHEA).
This is likely to be caused by a hardware problem.
The crash took place in a Microsoft module. Your system configuration may be incorrect. Possibly this problem is caused by another driver on your system that cannot be identified at this time.
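
The bug check parameters in the dump summary carry more detail than the module name. For bug check 0x124, Microsoft's bug check reference describes parameter 1 as the type of error source and, for value 0x4, parameter 2 as the address of the WHEA_ERROR_RECORD structure. The sketch below decodes the "Bug check code" line on that basis. The function name `decode_bugcheck` is hypothetical, and the lookup table is a partial list that should be verified against the current Microsoft documentation.

```python
import re

# Partial table of parameter 1 values for bug check 0x124
# (assumption: check against Microsoft's bug check reference).
WHEA_ERROR_SOURCES = {
    0x0: "Machine check exception",
    0x1: "Corrected machine check exception",
    0x2: "Corrected platform error",
    0x3: "Nonmaskable interrupt (NMI) error",
    0x4: "Uncorrectable PCI Express error",
    0x5: "Generic hardware error",
}

BUGCHECK_RE = re.compile(r"Bug check code:\s*(0x[0-9A-Fa-f]+)\s*\(([^)]*)\)")


def decode_bugcheck(line):
    """Decode a 'Bug check code: 0x... (p1, p2, p3, p4)' summary line."""
    match = BUGCHECK_RE.search(line)
    if match is None:
        raise ValueError("not a bug check summary line: %r" % line)
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    result = {"code": code, "params": params}
    if code == 0x124 and params:
        result["error_source"] = WHEA_ERROR_SOURCES.get(params[0], "unknown")
        if len(params) > 1:
            # Address of the WHEA_ERROR_RECORD structure in the dump.
            result["error_record_address"] = params[1]
    return result


if __name__ == "__main__":
    info = decode_bugcheck(
        "Bug check code: 0x124 (0x4, 0xFFFFE000C7D1E038, 0x0, 0x0)"
    )
    print(info["error_source"])
    print(hex(info["error_record_address"]))
```

Parameter 1 is 0x4 here, which is consistent with the fatal PCI Express error recorded as Event 16, so pci.sys is most likely only where the hardware error was reported rather than its cause.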

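What we have tried:

We have updated to the latest Windows Server 2012 R2 (v6.3.9600 Build 9600).

All relevant drivers have been updated to the latest versions.

PCI.sys has been updated to the latest version (v6.3.9600.18939).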

Server details:

Motherboard: ASRock Rack Server Board EP2C612D16NM-2T8R
RAID: Dell (LSI OEM) 9341-8i MegaRAID (latest firmware)
Processor: Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10 GHz, 2100 MHz
OS: Microsoft Windows Server 2012 R2 Standard
OS Version: 6.3.9600 Build 9600

Answer 1

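If you have already updated the operating system and drivers to the latest versions, you should consider updating the server firmware to the latest version as well. The error message you received also points to faulty hardware, since the error text names a PCI-related component. Another possible cause is that your server is overheating.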

There are several other ways you can try to troubleshoot this problem, as well as documentation on it.

I hope this helps.
