A hard drive in my zpool showed errors, but later seemed fine. How do I tell whether something is actually wrong?

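My work computer runs Ubuntu with four hard drives in a zpool. I was trained as a programmer, not in IT, but I am partly responsible for administering my machine. After a reboot the other day I noticed that the pool was not mounted, and this was the output of zpool status: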

pool: zhoupool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
    invalid.  Sufficient replicas exist for the pool to continue
    functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: scrub repaired 0 in 1h48m with 0 errors on Sun Mar 12 03:12:25 2017
config:

    NAME                                 STATE     READ WRITE CKSUM
    zhoupool                             DEGRADED     0     0     0
      mirror-0                           ONLINE       0     0     0
        ata-ST3000DM001-1ER166_Z500GM2P  ONLINE       0     0     0
        ata-ST3000DM001-1ER166_Z500GMZ3  ONLINE       0     0     0
      mirror-1                           DEGRADED     0     0     0
        11645674422250617741             UNAVAIL      0     0     0  was /dev/disk/by-id/ata-ST3000DM001-1ER166_Z500GP0C-part1
        ata-ST3000DM001-1ER166_Z500GVM5  ONLINE       0     0     0

errors: No known data errors

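I was going to replace the drive, but then found that the pool was mounted again (the machine had been rebooted at least once since the initial error). The zpool status output was now: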

 pool: zhoupool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
    attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
    using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: scrub repaired 31.0G in 2h10m with 0 errors on Sun May 14 02:34:46 2017

config:

    NAME                                 STATE     READ WRITE CKSUM
    zhoupool                             ONLINE       0     0     0
      mirror-0                           ONLINE       0     0     0
        ata-ST3000DM001-1ER166_Z500GM2P  ONLINE       0     0     0
        ata-ST3000DM001-1ER166_Z500GMZ3  ONLINE       0     0     0
      mirror-1                           ONLINE       0     0     0
        ata-ST3000DM001-1ER166_Z500GP0C  ONLINE       0     0  258K
        ata-ST3000DM001-1ER166_Z500GVM5  ONLINE       0     0     0

errors: No known data errors

This still indicated errors, so I kept working on ordering a new drive to replace it. But now I notice that zpool status no longer reports any errors:

  pool: zhoupool
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
    still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
    the pool may no longer be accessible by software that does not support
    the features. See zpool-features(5) for details.
  scan: scrub repaired 0 in 2h11m with 0 errors on Sun Jul  9 02:35:48 2017
config:

    NAME                                 STATE     READ WRITE CKSUM
    zhoupool                             ONLINE       0     0     0
      mirror-0                           ONLINE       0     0     0
        ata-ST3000DM001-1ER166_Z500GM2P  ONLINE       0     0     0
        ata-ST3000DM001-1ER166_Z500GMZ3  ONLINE       0     0     0
      mirror-1                           ONLINE       0     0     0
        ata-ST3000DM001-1ER166_Z500GP0C  ONLINE       0     0     0
        ata-ST3000DM001-1ER166_Z500GVM5  ONLINE       0     0     0

errors: No known data errors

So should I still be worried? Was this really a failing drive, or did a software glitch cause the errors? How can I diagnose this?

Answer 1

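Your data should be safe. It looks like the scrub on 5/14 repaired the damage (31.0G repaired, 0 errors), and the later scrub on 7/9 ran clean with nothing to repair. Check dmesg to see whether that device logged any timeouts or errors.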
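If you want to watch for this between scrubs, one option is to scan the device table of `zpool status` for non-zero READ/WRITE/CKSUM counters. The following is only a rough sketch: the function name `devices_with_errors` is made up for illustration, the sample text is trimmed from the output in the question, and it assumes the table layout shown there (which may differ between ZFS versions). The counters are kept as the raw strings that zpool prints (for example `258K`), not converted to numbers.

```python
def devices_with_errors(status_text):
    """Return {name: (read, write, cksum)} for rows whose counters are not all "0".

    Assumes the "NAME STATE READ WRITE CKSUM" table layout shown in the question.
    """
    flagged = {}
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        # The table starts right after its header row.
        if fields[:5] == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]:
            in_config = True
            continue
        if not in_config:
            continue
        # The "errors:" summary line follows the table.
        if line.startswith("errors:"):
            break
        if len(fields) < 5:
            continue  # blank line or something that is not a device row
        name, _state, read, write, cksum = fields[:5]
        if (read, write, cksum) != ("0", "0", "0"):
            flagged[name] = (read, write, cksum)
    return flagged


# Sample trimmed from the second `zpool status` output in the question.
SAMPLE = """\
    NAME                                 STATE     READ WRITE CKSUM
    zhoupool                             ONLINE       0     0     0
      mirror-1                           ONLINE       0     0     0
        ata-ST3000DM001-1ER166_Z500GP0C  ONLINE       0     0  258K
        ata-ST3000DM001-1ER166_Z500GVM5  ONLINE       0     0     0

errors: No known data errors
"""

print(devices_with_errors(SAMPLE))
```

On a live system you could feed it the text captured from running `zpool status zhoupool`. Bear in mind that the counters reset after `zpool clear` or a reboot, so an empty result only says that nothing has been counted since then.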

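You should use smartmontools to collect SMART data from the drives, check their status, and run an occasional online test. (There is a decent write-up here: https://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu) Chances are this will not be the last time this kind of failure shows up.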
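As a rough illustration of how the SMART check could be scripted, the sketch below runs `smartctl -H -A` on a drive and decodes its exit status, which the smartctl(8) man page documents as a bitmask. The helper names `decode_exit_status` and `check_drive` are hypothetical, and the device path you pass in is up to you; running smartctl normally requires root and the smartmontools package.

```python
import subprocess

# Meaning of each bit (0..7) of smartctl's exit status, per smartctl(8).
EXIT_BITS = [
    "command line did not parse",
    "device open failed",
    "a SMART command failed or returned a checksum error",
    "SMART status check returned DISK FAILING",
    "prefail attributes at or below threshold",
    "attributes were at or below threshold at some time in the past",
    "device error log contains records of errors",
    "self-test log contains records of errors",
]


def decode_exit_status(code):
    """Translate a smartctl exit status into a list of human-readable findings."""
    return [msg for bit, msg in enumerate(EXIT_BITS) if code & (1 << bit)]


def check_drive(device):
    """Run a health (-H) and attribute (-A) query and decode the result.

    `device` is a path such as one of the /dev/disk/by-id/ names from
    zpool status. Raises FileNotFoundError if smartctl is not installed.
    """
    result = subprocess.run(
        ["smartctl", "-H", "-A", device],
        capture_output=True,
        text=True,
    )
    return decode_exit_status(result.returncode), result.stdout


# Example of decoding a status without touching any hardware:
for finding in decode_exit_status(0b01001000):
    print(finding)
```

An empty list means smartctl found nothing to report at that moment. It does not prove the drive is healthy, which is why the occasional self-test (`smartctl -t short` or `-t long`) is still worth running.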
