dmesg
The following error shows up 10-20 times a day:
MCA: Bank 5, Status 0x8c00004000010092
MCA: Global Cap 0x0000000001000c10, Status 0x0000000000000000
MCA: Vendor "GenuineIntel", ID 0x206d7, APIC ID 0
MCA: CPU 0 COR (1) RD channel 2 memory error
MCA: Address 0xbb5561e80 (Mode: Physical Address, LSB: 6)
MCA: Misc 0x2140109086
The CPU is always 0 and the bank is always 5. "Misc" and "Address" vary between events, but are often the same.
The motherboard identifies as follows:
CPU: Intel(R) Xeon(R) CPU E5-1620 0 @ 3.60GHz (3591.44-MHz K8-class CPU)
Origin="GenuineIntel" Id=0x206d7 Family=0x6 Model=0x2d Stepping=7
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
Features2=0x1fbee3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX>
AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
AMD Features2=0x1<LAHF>
XSAVE Features=0x1<XSAVEOPT>
VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
TSC: P-state invariant, performance statistics
real memory = 137438953472 (131072 MB)
avail memory = 133741539328 (127545 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <LENOVO TC-A0 >
FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s) x 2 hardware threads
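Should I replace a DIMM (and how do I identify which one?), or is ECC doing its job and there is nothing to worry about? Or is it not needed yet?
Adding the output of mcelog: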
Hardware event. This is not a software error.
MCE 458
CPU 0 BANK 5 TSC 10283dbf8f01bc
MISC 21401e9e86 ADDR bb5561e80
TIME 1665418335 Mon Oct 10 12:12:15 2022
MCG status:
STATUS cc00010000010092 MCGSTATUS 0
MCGCAP 1000c10 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 45 Step 7
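The two status words above (0x8c00004000010092 from dmesg and cc00010000010092 from mcelog) can also be decoded by hand. The following is a minimal sketch, assuming the architectural IA32_MCi_STATUS bit layout from the Intel SDM and that bits 52:38 hold the corrected-error count (MCG_CAP 0x1000c10 has the CMCI bit set). The names `decode_status` and `MEM_OPS` are made up for illustration and are not part of any tool.

```python
# Sketch only: architectural IA32_MCi_STATUS fields as described in the
# Intel SDM. Model-specific bits (31:16, 37:32) are not interpreted here.

# MMM field of a memory-controller error code (0000 0000 1MMM CCCC).
MEM_OPS = {0: "generic", 1: "read", 2: "write", 3: "address/command", 4: "scrub"}


def decode_status(status):
    """Split an MCi_STATUS value into its architectural fields."""
    code = status & 0xFFFF
    fields = {
        "valid": bool((status >> 63) & 1),            # VAL
        "overflow": bool((status >> 62) & 1),         # OVER
        "uncorrected": bool((status >> 61) & 1),      # UC
        "misc_valid": bool((status >> 59) & 1),       # MISCV
        "addr_valid": bool((status >> 58) & 1),       # ADDRV
        "context_corrupt": bool((status >> 57) & 1),  # PCC
        # Assumption: CMCI is supported, so bits 52:38 are a count.
        "corrected_count": (status >> 38) & 0x7FFF,
        "mca_code": code,
    }
    # Memory-controller errors match the pattern 0000 0000 1MMM CCCC.
    if (code & 0xFF80) == 0x0080:
        fields["memory_op"] = MEM_OPS.get((code >> 4) & 0x7, "reserved")
        channel = code & 0xF
        # 0xF means "channel not specified".
        fields["channel"] = None if channel == 0xF else channel
    return fields
```

Both values decode to a corrected (UC clear) memory read error on channel 2. The dmesg value carries a corrected-error count of 1, which matches the "COR (1) RD channel 2" line; the mcelog value carries a count of 4 and has the overflow bit set.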
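Answer 1
Please follow the steps below.
- Check in mcelog whether this is a hardware or a software issue.
- Pull the DIMMs, clean the motherboard/DIMM slots, reseat the DIMMs, and then check the logs again.
- Check whether you can see ECC lines in dmesg.
- If possible, you can also try memtest.
- Try removing/swapping DIMMs and check whether the problem follows a DIMM or stays with the motherboard.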
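When comparing the logs before and after reseating or swapping DIMMs, it can help to tally the events by channel and address rather than reading them one by one. Below is a rough sketch that assumes the FreeBSD dmesg line format shown in the question; `summarize` and the two regular expressions are hypothetical helpers, not part of any existing tool.

```python
import re
from collections import Counter

# Assumed line formats, copied from the dmesg excerpt in the question:
#   MCA: CPU 0 COR (1) RD channel 2 memory error
#   MCA: Address 0xbb5561e80 (Mode: Physical Address, LSB: 6)
CHANNEL_RE = re.compile(
    r"MCA: CPU (\d+) COR \((\d+)\) \w+ channel (\d+) memory error"
)
ADDRESS_RE = re.compile(r"MCA: Address (0x[0-9a-fA-F]+)")


def summarize(log_text):
    """Tally corrected memory errors per (cpu, channel) and per address."""
    channels = Counter()
    addresses = Counter()
    for line in log_text.splitlines():
        match = CHANNEL_RE.search(line)
        if match:
            cpu, count, channel = (int(g) for g in match.groups())
            # Weight each record by the corrected-error count it reports.
            channels[(cpu, channel)] += count
            continue
        match = ADDRESS_RE.search(line)
        if match:
            addresses[int(match.group(1), 16)] += 1
    return channels, addresses
```

Feed it the saved text of dmesg (or the system log) from before and after the change. If the errors stay on the same channel after a DIMM has been moved to a different slot, the slot or board is the more likely suspect; if they move with the DIMM, the DIMM is. Which physical slot corresponds to channel 2 depends on the board, so that mapping has to come from the motherboard manual.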