Degraded disk comes back online after reboot


A ZFS volume on my FreeNAS server [version 9.10.2-U1 (86c7ef5)] had a degraded disk. Before attempting to replace it, I rebooted the server.

What does the following mean? Is there a problem with my disk?

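  • On boot, even though the volume status shows all disks back online, I still get the following message: (screenshot: alert)

  • During the scrub, a new alert shows the disk as degraded with a checksum error count of 670 (not sure what that means): (screenshot: new disk-degraded alert)

  • Scrub results: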
    The scrub operation is now finished. Here are the final results:
    
         state: DEGRADED
        status: One or more devices has experienced an unrecoverable error.  An
                attempt was made to correct the error.  Applications are unaffected.
    
        action: Determine if the device needs to be replaced, and clear the errors
                using 'zpool clear' or replace the device with 'zpool replace'.
    
           see: http://illumos.org/msg/ZFS-8000-9P
    
          scan: scrub repaired 66.7M in 16h55m with 0 errors on Sat Jan  2 13:32:13 2021
    
        config:
          NAME                                            STATE     READ WRITE CKSUM
          storage                                         DEGRADED     0     0     0
            raidz1-0                                      DEGRADED     0     0     0
              gptid/e0ef3f08-70b6-11e6-b8eb-1c98ec0f2cd4  ONLINE       0     0     0
              gptid/e1b21671-70b6-11e6-b8eb-1c98ec0f2cd4  DEGRADED     0     0 1.29K  too many errors
              gptid/e2841c02-70b6-11e6-b8eb-1c98ec0f2cd4  ONLINE       0     0     0
              gptid/e3717f0c-70b6-11e6-b8eb-1c98ec0f2cd4  ONLINE       0     0     0
    
        errors: No known data errors
    

  • smartctl -a
    SMART Error Log Version: 1
    No Errors Logged
    
    SMART Self-test log structure revision number 1
    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Extended offline    Completed: read failure       90%     39365         172825824
    # 2  Extended offline    Completed: read failure       90%     39365         172825825
    # 3  Short offline       Completed without error       00%     39364         -
    
    SMART Selective self-test log data structure revision number 1
     SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
        1        0        0  Not_testing
        2        0        0  Not_testing
        3        0        0  Not_testing
        4        0        0  Not_testing
        5        0        0  Not_testing
    
    Selective self-test flags (0x0):
      After scanning selected spans, do NOT read-scan remainder of disk.
    
    If Selective self-test is pending on power-up, resume after 0 minute delay.
    

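Answer 1

As the smartctl -a output shows, the drive itself reports read failures in its onboard self-tests. That rules out the RAID controller or software as the cause.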

    SMART Self-test log structure revision number 1
    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Extended offline    Completed: read failure       90%     39365         172825824
    # 2  Extended offline    Completed: read failure       90%     39365         172825825

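This is bad news. Get a new drive and replace it as soon as possible. The errors may look intermittent because they seem to occur near the same physical location on the disk - FreeNAS/ZFS probably won't touch that exact spot again until you issue a clear and tell it to check the entire volume (a scrub), which is why the drive came back online at the next boot.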
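If you want to check for this kind of failure without reading the log by eye, here is a minimal Python sketch that picks the failed entries out of the self-test table printed by smartctl -a. The function name `failed_self_tests`, the regular expression, and the rule "a status containing the word failure counts as failed" are my own assumptions for illustration. They are not part of smartmontools or FreeNAS, and the column layout is assumed to match the output shown in the question.

```python
import re

# Assumed layout of one row in the "SMART Self-test log" table, e.g.
# "# 1  Extended offline    Completed: read failure       90%     39365         172825824"
# Columns are taken to be separated by two or more spaces.
ROW = re.compile(
    r"^#\s*(?P<num>\d+)\s+"
    r"(?P<test>.+?)\s{2,}"
    r"(?P<status>.+?)\s{2,}"
    r"(?P<remaining>\d+)%\s+"
    r"(?P<hours>\d+)\s+"
    r"(?P<lba>\d+|-)\s*$"
)


def failed_self_tests(smartctl_output):
    """Return (test number, description, status, first-error LBA or None)
    for every self-test row whose status mentions a failure."""
    failures = []
    for line in smartctl_output.splitlines():
        m = ROW.match(line.strip())
        if m is None:
            continue  # header lines, blank lines, other smartctl sections
        # Heuristic (assumption): any status containing "failure" is a failed test.
        if "failure" in m["status"].lower():
            lba = None if m["lba"] == "-" else int(m["lba"])
            failures.append((int(m["num"]), m["test"], m["status"], lba))
    return failures


# Sample input copied from the self-test log in the question.
SAMPLE = """\
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%     39365         172825824
# 2  Extended offline    Completed: read failure       90%     39365         172825825
# 3  Short offline       Completed without error       00%     39364         -
"""

if __name__ == "__main__":
    for num, test, status, lba in failed_self_tests(SAMPLE):
        print(f"test #{num} ({test}): {status}, first bad LBA {lba}")
```

On the sample above this reports the two extended tests and skips the short test that completed without error. The two reported LBAs are adjacent, which fits the point about the errors clustering in one spot on the disk.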
