I have a WD Red 4 TB disk (WD40EFRX-68WT0N0, firmware 82.00A82) that occasionally logs uncorrectable read errors in its SMART error log, for example:
Error 43 [18] occurred at disk power-on lifetime: 13157 hours (548 days + 5 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT  LBA_48  LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 00 00 02 e9 e0 40 00  Error: UNC at LBA = 0x0002e9e0 = 190944
Commands leading to the command that caused the error were:
CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
-- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
60 01 00 00 08 00 00 00 02 ea 48 40 00 12d+15:42:14.157  READ FPDMA QUEUED
60 00 e0 00 00 00 00 00 02 e9 68 40 00 12d+15:42:14.157  READ FPDMA QUEUED
60 00 e0 00 08 00 00 00 02 e8 88 40 00 12d+15:42:10.216  READ FPDMA QUEUED
60 01 00 00 00 00 00 00 02 e7 88 40 00 12d+15:42:10.215  READ FPDMA QUEUED
60 01 00 00 08 00 00 00 02 e6 88 40 00 12d+15:42:07.629  READ FPDMA QUEUED
(The full smartctl report is here.)
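As a cross-check of the log excerpt above, the register bytes can be decoded by hand. The following is a minimal Python sketch, not part of smartctl. It assumes the 13-byte row layout implied by the column headers (the six LBA bytes sit at positions 5 to 10) and the usual ATA convention for READ FPDMA QUEUED, where the FEATR field carries the sector count and the NCQ tag occupies bits 7:3 of the low COUNT byte. The names `lba_of` and `ncq_read_range` are made up for illustration.

```python
# Register rows copied from the smartctl error log above.
ERROR_ROW = "40 -- 51 00 00 00 00 00 02 e9 e0 40 00"
COMMAND_ROWS = [
    "60 01 00 00 08 00 00 00 02 ea 48 40 00",
    "60 00 e0 00 00 00 00 00 02 e9 68 40 00",
    "60 00 e0 00 08 00 00 00 02 e8 88 40 00",
    "60 01 00 00 00 00 00 00 02 e7 88 40 00",
    "60 01 00 00 08 00 00 00 02 e6 88 40 00",
]


def lba_of(row):
    """Join the six LBA bytes (LBA_48, then LH LM LL) into one integer."""
    return int("".join(row.split()[5:11]), 16)


def ncq_read_range(row):
    """Return (first_lba, end_lba_exclusive, tag) for a READ FPDMA QUEUED row.

    Assumption: FEATR holds the sector count (0 meaning 65536) and the NCQ
    tag is in bits 7:3 of the low COUNT byte.
    """
    b = row.split()
    sectors = int(b[1] + b[2], 16) or 65536
    tag = int(b[4], 16) >> 3
    start = lba_of(row)
    return start, start + sectors, tag


error_lba = lba_of(ERROR_ROW)
# Keep only the queued reads whose LBA range covers the failing sector.
hits = [r for r in map(ncq_read_range, COMMAND_ROWS) if r[0] <= error_lba < r[1]]

print(f"error LBA: {error_lba} (0x{error_lba:08x})")
for start, end, tag in hits:
    print(f"inside read of LBAs {start}..{end - 1} (tag {tag}), "
          f"{error_lba - start} sectors in")
```

Under those assumptions, the failing LBA 190944 falls 120 sectors into the second listed read (LBAs 190824 to 191047), and the five reads cover one contiguous range, which suggests a sequential read was in progress when the error occurred.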
When the most recent error occurred, zpool status reported the following:
$ zpool status cloudpool
  pool: cloudpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: scrub repaired 0B in 3h57m with 0 errors on Wed Oct 17 03:53:57 2018
config:

        NAME                                          STATE   READ WRITE CKSUM
        cloudpool                                     ONLINE     0     0     0
          mirror-0                                    ONLINE     0     0     0
            ata-ST8000VN0022-2EL112_ZA17FZXF          ONLINE     0     0     0
            ata-ST8000VN0022-2EL112_ZA17H5D3          ONLINE     0     0     0
          mirror-1                                    ONLINE     0     0     0
            ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E5NFLRU3  ONLINE     1     0     0
            ata-ST4000VN000-2AH166_WDH0KMHT           ONLINE     0     0     0
          mirror-2                                    ONLINE     0     0     0
            ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N3EHHA2E  ONLINE     0     0     0
            ata-ST3000DM001-1CH166_Z1F1HL4V           ONLINE     0     0     0

errors: No known data errors
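(Previously, some zpool scrub runs reported that they had repaired some data, but this is the first time I have seen this new status.)
However, running the short, conveyance, and extended SMART self-tests did not reveal any problems.
I also think the load/unload cycle count is suspiciously high, but this is a Red drive, not a Green one, and WD's official tool (wd5741.exe) reports that no action is needed.
So is my drive about to fail and in need of replacement, or is this just normal, occasional sector reallocation?
Edit: Although I am using ECC RAM, another one of my drives has now developed a problem: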
  pool: cloudpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: scrub repaired 768K in 2h56m with 0 errors on Sun Jan 13 03:20:40 2019
config:

        NAME                                          STATE   READ WRITE CKSUM
        cloudpool                                     ONLINE     0     0     0
          mirror-0                                    ONLINE     0     0     0
            ata-ST8000VN0022-2EL112_ZA17FZXF          ONLINE     0     0     0
            ata-ST8000VN0022-2EL112_ZA17H5D3          ONLINE     0     0     0
          mirror-1                                    ONLINE     0     0     0
            ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E5NFLRU3  ONLINE     0     0     0
            ata-ST4000VN000-2AH166_WDH0KMHT           ONLINE     0     0     0
          mirror-2                                    ONLINE     0     0     0
            ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N3EHHA2E  ONLINE     0     0     6
            ata-ST3000DM001-1CH166_Z1F1HL4V           ONLINE     0     0     0

errors: No known data errors
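To compare the two reports without reading every row, the counters can be pulled out with a few lines of Python. This is only a sketch: `devices_with_errors` is a hypothetical helper, the two snapshots are abbreviated to the mirror-1 and mirror-2 rows quoted in this post, and it assumes the counters are printed as plain integers (larger counts that zpool abbreviates with a suffix would be skipped).

```python
# Rows copied from the two zpool status reports in this post (abbreviated).
OCT_2018 = """
mirror-1 ONLINE 0 0 0
ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E5NFLRU3 ONLINE 1 0 0
ata-ST4000VN000-2AH166_WDH0KMHT ONLINE 0 0 0
mirror-2 ONLINE 0 0 0
ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N3EHHA2E ONLINE 0 0 0
ata-ST3000DM001-1CH166_Z1F1HL4V ONLINE 0 0 0
"""

JAN_2019 = """
mirror-1 ONLINE 0 0 0
ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E5NFLRU3 ONLINE 0 0 0
ata-ST4000VN000-2AH166_WDH0KMHT ONLINE 0 0 0
mirror-2 ONLINE 0 0 0
ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N3EHHA2E ONLINE 0 0 6
ata-ST3000DM001-1CH166_Z1F1HL4V ONLINE 0 0 0
"""


def devices_with_errors(status_text):
    """Return {name: (read, write, cksum)} for rows with a nonzero counter."""
    flagged = {}
    for line in status_text.splitlines():
        parts = line.split()
        # Config rows have five fields: NAME STATE READ WRITE CKSUM.
        # Skip anything else, including rows with non-integer counters.
        if len(parts) != 5 or not all(p.isdigit() for p in parts[2:]):
            continue
        counts = tuple(int(p) for p in parts[2:])
        if any(counts):
            flagged[parts[0]] = counts
    return flagged


for label, text in (("Oct 2018", OCT_2018), ("Jan 2019", JAN_2019)):
    for name, (read, write, cksum) in devices_with_errors(text).items():
        print(f"{label}: {name} READ={read} WRITE={write} CKSUM={cksum}")
```

On these two snapshots it flags the WD40EFRX with one read error in October 2018 and the WD30EFRX with six checksum errors in January 2019, so the two reports point at different drives and different error types.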