Drive failed in RAID, other drive has SMART errors. Which one to trust?


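I have been running a RAID1 setup on my machine for several years, and recently the array became degraded. According to mdadm, one drive appears to have failed, but when I look at the SMART information, it is the other drive that shows errors. I'm not sure which one to believe.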

Below is the output of sudo mdadm --detail /dev/md0. If I'm reading it correctly, /dev/sda1 has failed, while /dev/sdb1 is still in the array and can be trusted.

/dev/md0:
        Version : 1.2
  Creation Time : Sat Jan  5 01:18:40 2013
     Raid Level : raid1
     Array Size : 2930133824 (2794.39 GiB 3000.46 GB)
  Used Dev Size : 2930133824 (2794.39 GiB 3000.46 GB)
   Raid Devices : 2
  Total Devices : 1
    Persistence : Superblock is persistent

    Update Time : Thu Aug  6 20:33:11 2015
          State : clean, degraded 
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           Name : storm:0  (local to host storm)
           UUID : 98b434f9:54d5c413:1acc4033:8ad34365
         Events : 8388

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8       17        1      active sync   /dev/sdb1

However, after running a short SMART self-test on both drives, /dev/sda showed no problems, but /dev/sdb showed the following:

=== START OF INFORMATION SECTION ===
Device Model:     ST3000DM001-1CH166
...
Local Time is:    Thu Aug  6 20:45:02 2015 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

...

SMART Error Log Version: 1
ATA Error Count: 12 (device log contains only the most recent five errors)

...

Error 12 occurred at disk power-on lifetime: 21016 hours (875 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 00 ff ff ff 4f 00   8d+20:05:45.525  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00   8d+20:05:45.525  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00   8d+20:05:45.525  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00   8d+20:05:45.524  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00   8d+20:05:45.524  SET FEATURES [Set transfer mode]

...


SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     21129         -
# 2  Short offline       Completed without error       00%     18418         -
# 3  Extended offline    Completed without error       00%      1860         -
# 4  Short offline       Completed without error       00%      1855         -

...

The full output can be found here: http://pastebin.com/jDN0muXk

Should I trust mdadm, which says /dev/sda is bad and /dev/sdb can be trusted, or should I trust SMART, which says /dev/sdb has errors while /dev/sda is still healthy?

Answer 1

Try both! Only the drive that actually holds your data, and that you can read it from, is trustworthy.
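One rough way to test readability is to read each device from start to finish, without writing anything, and see whether the kernel reports an I/O error. This is only a sketch: the function name read_check and the 1 MiB chunk size are my own choices for illustration, not part of mdadm or smartctl. A clean pass only shows that the sectors could be read, not that the data on them is correct.

```python
def read_check(path, chunk_size=1024 * 1024):
    """Read `path` sequentially and return (bytes_read, error).

    `error` is None if the whole file or device was read, otherwise the
    OSError raised when opening or reading it. Nothing is written.
    """
    total = 0
    try:
        # buffering=0 gives raw, unbuffered reads.
        with open(path, "rb", buffering=0) as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:  # empty read means end of device/file
                    break
                total += len(chunk)
    except OSError as exc:
        # An unreadable sector typically surfaces here as an I/O error.
        return total, exc
    return total, None


# Hypothetical usage (needs root for a block device):
#   bytes_read, err = read_check("/dev/sdb1")
#   print(bytes_read, err)
```

If the read stops early, bytes_read gives a rough idea of how far into the device the problem lies.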

Honestly, I don't think SMART errors hurt a drive's credibility unless they are very serious. I would use /dev/sdb, but replace both drives as soon as possible!
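To judge how serious the logged errors are, it can help to compare when they occurred with when the self-tests ran. The sketch below pulls those numbers out of smartctl's text output. The function name summarize_smart is hypothetical, and the regular expressions assume the layout shown in the question's excerpt, so other drives or smartctl versions may need adjustments.

```python
import re


def summarize_smart(text):
    """Summarize the ATA error log and self-test log from smartctl text."""
    count = re.search(r"ATA Error Count:\s*(\d+)", text)
    # Power-on hours at which each logged error occurred.
    error_hours = [
        int(h)
        for h in re.findall(
            r"Error \d+ occurred at disk power-on lifetime:\s*(\d+) hours", text
        )
    ]
    # Self-test rows look like:
    # "# 1  Short offline   Completed without error   00%   21129   -"
    test_re = re.compile(
        r"^#\s*(\d+)\s+(.+?)\s{2,}(.+?)\s{2,}(\d+)%\s+(\d+)\s+(\S+)\s*$",
        re.MULTILINE,
    )
    tests = [
        {"description": m.group(2), "status": m.group(3), "hours": int(m.group(5))}
        for m in test_re.finditer(text)
    ]
    return {
        "error_count": int(count.group(1)) if count else 0,
        "last_error_hours": max(error_hours) if error_hours else None,
        "tests": tests,
    }
```

For the excerpt in the question, this reports 12 errors, the most recent one logged at 21016 power-on hours, and a short self-test at 21129 hours that completed without error. A passing short self-test is not proof of a healthy drive, which is why replacing both drives soon is still the safer course.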
