Dell ESXi host crashes - how to identify the faulty hardware

I have an ESXi host that has crashed several times because of a hardware problem. Each time, I see this in the logs:

A bus fatal error was detected on a component at bus 64 device 2 function 0.
A bus fatal error was detected on a component at slot 4.

On the console I see: [screenshot missing]

64 in decimal is 40 in hexadecimal. If I run:

[root@localhost:~] lspci | grep 0000:40:02.0
0000:40:02.0 Bridge: Intel Corporation Xeon E7 v2/Xeon E5 v2/Core i7 PCI Express Root Port 2a [PCIe RP[0000:40:02.0]]
[root@localhost:~] 
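
The conversion from the decimal numbers in the log message to the hexadecimal address that lspci prints can be sketched as a small shell helper. This is only an illustration: `to_pci_addr` is a hypothetical name, and it assumes PCI segment 0, which matches the `0000:` prefix in the lspci output above.

```shell
# Hypothetical helper: format decimal bus/device/function numbers as the
# segment:bus:device.function address used by lspci.
to_pci_addr() {
    # Assumption: the device is on PCI segment 0 (the "0000:" prefix).
    printf '%04x:%02x:%02x.%x\n' 0 "$1" "$2" "$3"
}

# Values from the log line "bus 64 device 2 function 0".
to_pci_addr 64 2 0
```

The result can then be passed to grep, for example `lspci | grep "$(to_pci_addr 64 2 0)"`.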

When I run:

esxcfg-info

and search for SLOT 4, I get:

        \==+PCI Device : 
           |----Segment.........................................0x0000 
           |----Bus.............................................0x40 
           |----Slot............................................0x02 
           |----Function........................................0x00 
           |----Runtime Owner...................................vmkernel
           |----Has Configured Owner............................false
           |----Configured Owner................................
           |----Vendor Id.......................................0x8086 
           |----Device Id.......................................0x0e04 
           |----Sub-Vendor Id...................................0x0000 
           |----Sub-Device Id...................................0x0000 
           |----Vendor Name.....................................Intel Corporation
           |----Device Name.....................................Xeon E7 v2/Xeon E5 v2/Core i7 PCI Express Root Port 2a
           |----Device Class....................................1540 
           |----Device Class Name...............................PCI bridge
           |----PIC Line........................................15 
           |----Old IRQ.........................................255 
           |----Vector..........................................0 
           |----PCI Pin.........................................0 
           |----Spawned Bus.....................................66 
           |----Flags...........................................12803 
           \==+BAR Info : 
              \==+BAR0 : 
                 |----Type......................................0 
                 |----Address...................................0 
                 |----Size......................................0 
                 |----Flags.....................................0 
              \==+BAR1 : 
                 |----Type......................................0 
                 |----Address...................................0 
                 |----Size......................................0 
                 |----Flags.....................................0 
           |----Module Id.......................................0 
           |----Chassis.........................................0 
           |----Physical Slot...................................4294967295 
           |----VmKernel Device Name............................PCIe RP[0000:40:02.0]
           |----Slot Description................................SLOT 4
           |----Passthru Capable................................false
           |----Parent Device...................................
           |----Dependent Device................................
           |----Reset Method....................................5
           |----FPT Shareable...................................true
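
Rather than scrolling through the whole esxcfg-info dump, the lookup could be done with a filter along these lines. It is a rough sketch, not a tested ESXi tool: `find_slot` is a hypothetical name, and it assumes that each device record lists Bus, Slot and Function before its Slot Description, as in the output above, and that an awk with user-defined functions is available in the shell.

```shell
# Hypothetical helper: read esxcfg-info output on stdin and print the
# bus/device/function of each PCI device whose Slot Description equals $1.
find_slot() {
    awk -v want="$1" '
        # Strip the "|----Name....." prefix and any trailing blanks.
        function value(line) {
            sub(/^.*\.\.\.+/, "", line)
            sub(/[ \t]+$/, "", line)
            return line
        }
        # Remember the most recent address fields seen.
        /----Bus\./      { bus = value($0) }
        /----Slot\./     { dev = value($0) }
        /----Function\./ { fn  = value($0) }
        # Assumption: Slot Description follows those fields in each record.
        /----Slot Description\./ {
            if (value($0) == want)
                print "bus=" bus, "device=" dev, "function=" fn
        }
    '
}

# Intended use on the host:
#   esxcfg-info | find_slot "SLOT 4"
```

The patterns are anchored on `----` so that the `Spawned Bus` and `Physical Slot` lines are not mistaken for the `Bus` and `Slot` fields.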

Does this mean the CPU is on its way out?

Answer 1

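Doesn't iDRAC show any hardware problems? Maybe you should run the full diagnostics from the boot screen.

If I remember correctly:

Press F10 during boot. In the left pane of Lifecycle Controller, click "Hardware Diagnostics". In the right pane, click "Run Hardware Diagnostics". The diagnostics utility starts.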
