Replacement disk is faulting

One of the disks in my pool faulted (too many errors).

The number of I/O errors associated with a ZFS device exceeded
acceptable levels. ZFS has marked the device as faulted.

 impact: Fault tolerance of the pool may be compromised.
    eid: 52
  class: statechange
  state: FAULTED
  host: databank-a
  time: 2021-12-11 16:36:33-0500
  vpath: /dev/disk02_old
  vphys: pci-0000:00:1f.2-ata-4
  vguid: 0x73F7B0B1D1B45864
  devid: /dev/disk02_old
  pool: 0x47B3E7C1336F1F4F

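So I replaced it with a brand-new disk (zpool replace pool /dev/foo /dev/bar), but then the new disk faulted as well (my server kept going to sleep because I had foolishly left X Windows enabled). I cleared the errors (zpool clear pool /dev/bar), but then it happened again.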

  pool: DATA01
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Wed Dec 15 11:23:57 2021
        6.83T scanned at 256M/s, 5.80T issued at 217M/s, 9.08T total
        232G resilvered, 63.85% done, 0 days 04:24:05 to go
config:

        NAME                        STATE     READ WRITE CKSUM
        DATA01                      DEGRADED     0     0     0
          raidz1-0                  DEGRADED     0     0     0
            /dev/disk01             ONLINE       0     0     0
            replacing-1             UNAVAIL      0     0     0  insufficient replicas
              8356341911383201892   UNAVAIL      0     0     0  was /dev/disk02_old
              /dev/disk02_new       FAULTED      0    81     0  too many errors  (resilvering)
            /dev/disk03             ONLINE       0     0     0
            /dev/disk04             ONLINE       0     0     0


errors: No known data errors

What are the odds that the drive is not actually faulty?

Answer 1

What are the odds that the drive is not actually faulty?

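The drive is probably faulty. If the error counters are accurate, dozens of errors within the first few TB of use is worse than you should expect. And you already cleared the errors once and they came back, so this is not a one-time transient event.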

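Backblaze's drive failure data, although it may not cover the exact kind of drive you have, suggests that early failures still happen. Even if the early failure rate is low, you may be the unlucky one in a few thousand who received an imperfect unit.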

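Start testing restores of your important data from backups on separate media, in case the worst happens. Make sure you have more spare disks on hand. Once the resilver completes, check the disks again, and keep replacing them as needed.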
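
To make "check the disks again" a little less manual, here is a minimal Python sketch that scans the text of a `zpool status` report and lists every row that is not ONLINE or that has a non-zero READ, WRITE or CKSUM counter. The function name `suspect_devices` and the `SAMPLE` text are made up for illustration, and the parser assumes the column layout shown in the question. It is only a rough aid, not a replacement for reading the full report.

```python
import re

# Assumption: config rows look like "NAME STATE READ WRITE CKSUM [notes]",
# as in the `zpool status` output pasted in the question.
ROW = re.compile(
    r"^\s*(?P<name>\S+)\s+(?P<state>[A-Z]+)\s+"
    r"(?P<read>\d+)\s+(?P<write>\d+)\s+(?P<cksum>\d+)"
)

def suspect_devices(status_text):
    """Return (name, state, counters) for rows that are not ONLINE
    or that have any non-zero error counter."""
    suspects = []
    for line in status_text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # header, blank line, or anything that is not a config row
        counters = {k: int(m.group(k)) for k in ("read", "write", "cksum")}
        if m.group("state") != "ONLINE" or any(counters.values()):
            suspects.append((m.group("name"), m.group("state"), counters))
    return suspects

# Hypothetical sample: a few rows copied from the report in the question.
SAMPLE = """\
        NAME                        STATE     READ WRITE CKSUM
        DATA01                      DEGRADED     0     0     0
            /dev/disk01             ONLINE       0     0     0
            /dev/disk02_new         FAULTED      0    81     0  too many errors  (resilvering)
"""

for name, state, counters in suspect_devices(SAMPLE):
    print(name, state, counters)
```

In practice you would pass in the real output, for example captured with `subprocess.run(["zpool", "status", "-p"], capture_output=True, text=True).stdout`. The `-p` flag asks for exact numbers. Without it, large counters may be abbreviated (such as `1.2K`), and this simple pattern would skip those rows.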
