The kernel log shows many instances of the following EDAC error:
EDAC MC0: 1 CE ie31200 CE on unknown memory (csrow:3 channel:1 page:0x0 offset:0x0 grain:1 syndrome:0x1c)
The problem is that there is no csrow #3 on my system
(output truncated for readability):
$ ls -l /sys/devices/system/edac/mc/mc0
drwxr-xr-x 3 root root 0 May 19 10:53 csrow0
drwxr-xr-x 3 root root 0 May 19 10:53 csrow1
How can this be? Is a memory device actually failing? If so, how can I identify which one?
Some more information that may help:
$ cat /sys/devices/system/edac/mc/mc0/ce_count
1069
$ cat /sys/devices/system/edac/mc/mc0/csrow?/ce_count
0
0
$ sudo edac-util -v
mc0: 0 Uncorrected Errors with no DIMM info
mc0: 0 Corrected Errors with no DIMM info
mc0: csrow0: 0 Uncorrected Errors
mc0: csrow0: mc#0csrow#0channel#0: 0 Corrected Errors
mc0: csrow0: mc#0csrow#0channel#1: 0 Corrected Errors
mc0: csrow1: 0 Uncorrected Errors
mc0: csrow1: mc#0csrow#1channel#0: 0 Corrected Errors
mc0: csrow1: mc#0csrow#1channel#1: 0 Corrected Errors
edac-util: No errors to report.
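The mismatch in the counters above (a non-zero controller total, but zero on every csrow) can be checked in one go. This is only a minimal sketch, assuming the standard EDAC sysfs layout shown in the listings above; the function name `edac_ce_gap` is hypothetical and not part of edac-utils.

```shell
#!/bin/sh
# Compare a memory controller's total corrected-error (CE) count with the
# sum of its per-csrow CE counts. edac_ce_gap is a hypothetical helper name.
edac_ce_gap() {
    # Directory of the memory controller; defaults to mc0 as in the post.
    mc_dir=${1:-/sys/devices/system/edac/mc/mc0}
    total=$(cat "$mc_dir/ce_count")
    sum=0
    for f in "$mc_dir"/csrow*/ce_count; do
        [ -r "$f" ] || continue   # skip the unexpanded glob if no csrow exists
        sum=$((sum + $(cat "$f")))
    done
    # "unattributed" = errors counted by the controller but by no csrow.
    echo "total=$total csrow_sum=$sum unattributed=$((total - sum))"
}
```

Calling `edac_ce_gap /sys/devices/system/edac/mc/mc0` with the values shown above (1069 in total, 0 on both csrows) would print `total=1069 csrow_sum=0 unattributed=1069`.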
- OS: ArchLinux / 5.17.8-arch1-1 #1 SMP PREEMPT
- Processor: Xeon E-2124
- Motherboard: SuperMicro X11SCH-LN4F
Thanks