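My RAID array has failed, and I'm not sure what the best steps are to recover it.
I have 4 drives in a RAID5 configuration. It looks like one of them (sde1) has failed, but md can't bring the array up because it says sdd1 is non-fresh.
What can I do to recover the array?
Below I've pasted some excerpts from /var/log/messages and from mdadm --examine: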
/var/log/messages
$ egrep -w sd[b,c,d,e]\|raid\|md /var/log/messages
nas kernel: [...] sd 5:0:0:0: [sde]
nas kernel: [...] sd 5:0:0:0: [sde] CDB:
nas kernel: [...] end_request: I/O error, dev sde, sector 937821218
nas kernel: [...] sd 5:0:0:0: [sde] killing request
nas kernel: [...] md/raid:md0: read error not correctable (sector 937821184 on sde1).
nas kernel: [...] md/raid:md0: Disk failure on sde1, disabling device.
nas kernel: [...] md/raid:md0: Operation continuing on 2 devices.
nas kernel: [...] md/raid:md0: read error not correctable (sector 937821256 on sde1).
nas kernel: [...] sd 5:0:0:0: [sde] Unhandled error code
nas kernel: [...] sd 5:0:0:0: [sde]
nas kernel: [...] sd 5:0:0:0: [sde] CDB:
nas kernel: [...] end_request: I/O error, dev sde, sector 937820194
nas kernel: [...] sd 5:0:0:0: [sde] Synchronizing SCSI cache
nas kernel: [...] sd 5:0:0:0: [sde]
nas kernel: [...] sd 5:0:0:0: [sde] Stopping disk
nas kernel: [...] sd 5:0:0:0: [sde] START_STOP FAILED
nas kernel: [...] sd 5:0:0:0: [sde]
nas kernel: [...] md: unbind<sde1>
nas kernel: [...] md: export_rdev(sde1)
nas kernel: [...] md: bind<sdd1>
nas kernel: [...] md: bind<sdc1>
nas kernel: [...] md: bind<sdb1>
nas kernel: [...] md: bind<sde1>
nas kernel: [...] md: kicking non-fresh sde1 from array!
nas kernel: [...] md: unbind<sde1>
nas kernel: [...] md: export_rdev(sde1)
nas kernel: [...] md: kicking non-fresh sdd1 from array!
nas kernel: [...] md: unbind<sdd1>
nas kernel: [...] md: export_rdev(sdd1)
nas kernel: [...] md: raid6 personality registered for level 6
nas kernel: [...] md: raid5 personality registered for level 5
nas kernel: [...] md: raid4 personality registered for level 4
nas kernel: [...] md/raid:md0: device sdb1 operational as raid disk 2
nas kernel: [...] md/raid:md0: device sdc1 operational as raid disk 0
nas kernel: [...] md/raid:md0: allocated 4338kB
nas kernel: [...] md/raid:md0: not enough operational devices (2/4 failed)
nas kernel: [...] md/raid:md0: failed to run raid set.
nas kernel: [...] md: pers->run() failed ...
mdadm --examine
$ mdadm --examine /dev/sd[bcdefghijklmn]1
/dev/sdb1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 4dc53f9d:f0c55279:a9cb9592:a59607c9
Name : NAS:0
Creation Time : Sun Sep 11 02:37:59 2011
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 3907027053 (1863.02 GiB 2000.40 GB)
Array Size : 5860538880 (5589.05 GiB 6001.19 GB)
Used Dev Size : 3907025920 (1863.02 GiB 2000.40 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : e8369dbc:bf591efa:f0ccc359:9d164ec8
Update Time : Tue May 27 18:54:37 2014
Checksum : a17a88c0 - correct
Events : 1026050
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 2
Array State : A.A. ('A' == active, '.' == missing)
/dev/sdc1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 4dc53f9d:f0c55279:a9cb9592:a59607c9
Name : NAS:0
Creation Time : Sun Sep 11 02:37:59 2011
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 3907027053 (1863.02 GiB 2000.40 GB)
Array Size : 5860538880 (5589.05 GiB 6001.19 GB)
Used Dev Size : 3907025920 (1863.02 GiB 2000.40 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 78221e11:02acc1c8:c4eb01bf:f0852cbe
Update Time : Tue May 27 18:54:37 2014
Checksum : 1fbb54b8 - correct
Events : 1026050
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 0
Array State : A.A. ('A' == active, '.' == missing)
/dev/sdd1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 4dc53f9d:f0c55279:a9cb9592:a59607c9
Name : NAS:0
Creation Time : Sun Sep 11 02:37:59 2011
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 3907027053 (1863.02 GiB 2000.40 GB)
Array Size : 5860538880 (5589.05 GiB 6001.19 GB)
Used Dev Size : 3907025920 (1863.02 GiB 2000.40 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : fd282483:d2647838:f6b9897e:c216616c
Update Time : Mon Oct 7 19:21:22 2013
Checksum : 6df566b8 - correct
Events : 32621
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 3
Array State : AAAA ('A' == active, '.' == missing)
/dev/sde1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 4dc53f9d:f0c55279:a9cb9592:a59607c9
Name : NAS:0
Creation Time : Sun Sep 11 02:37:59 2011
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 3907027053 (1863.02 GiB 2000.40 GB)
Array Size : 5860538880 (5589.05 GiB 6001.19 GB)
Used Dev Size : 3907025920 (1863.02 GiB 2000.40 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : e84657dd:0882a7c8:5918b191:2fc3da02
Update Time : Tue May 27 18:46:12 2014
Checksum : 33ab6fe - correct
Events : 1026039
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 1
Array State : AAA. ('A' == active, '.' == missing)
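Answer 1
You have a double drive failure, and one of the two drives has been dead for six months. With RAID5, that is not recoverable. Replace the failed hardware and restore from backup.
Going forward, consider RAID6 for drives this large, and make sure you have monitoring in place to catch device failures so you can respond as quickly as possible.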
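To illustrate the monitoring point, here is a minimal sketch of a degraded-array check that could run from cron. mdadm's own --monitor mode is the usual tool for this; the function name degraded_arrays is made up for this example, and it assumes the usual /proc/mdstat layout where each array's status line ends in a marker such as [UUU_], with "_" standing for a missing member.

```shell
#!/bin/sh
# Hypothetical helper (not part of mdadm): read /proc/mdstat-style text on
# stdin and print every array whose member map contains a missing slot.
degraded_arrays() {
    awk '
        # Array header line, e.g. "md0 : active raid5 sdb1[2] sdc1[0] sde1[1]"
        /^md/ { name = $1 }
        # Status line carries the member map, e.g. "[4/3] [UUU_]"
        match($0, /\[[U_]+\]/) {
            status = substr($0, RSTART, RLENGTH)
            if (index(status, "_") > 0) print name, status
        }
    '
}

# Typical use (prints nothing when every array is healthy):
#   degraded_arrays < /proc/mdstat
```

Hooking the output up to mail or any other alerting is left out here. The point is only that a member dropping out of the array should be noticed within hours rather than months.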
Answer 2
Well, if your backups aren't up to date, you can try a forced reassembly in degraded mode using three of the drives:
mdadm -v --assemble --force /dev/md0 /dev/sdb1 /dev/sdc1 /dev/sde1
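Since sde1 is only slightly out of sync in Update Time and event count, I suspect you'll be able to access most of your data. I've done this successfully many times in similar RAID5 failure scenarios.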
- sdb1 Update Time: Tue May 27 18:54:37 2014
- sdc1 Update Time: Tue May 27 18:54:37 2014
- sdd1 Update Time: Mon Oct 7 19:21:22 2013
- sde1 Update Time: Tue May 27 18:46:12 2014
- sdb1 Events: 1026050
- sdc1 Events: 1026050
- sdd1 Events: 32621
- sde1 Events: 1026039
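The comparison above can also be done mechanically. The following is a rough sketch that pulls the Events counter for each member out of mdadm --examine output and lists the members that are close to the newest one. The function names list_events and fresh_members and the max_lag threshold are invented for this example. mdadm has no such cutoff, so treat the result as a hint about which devices to pass to --assemble --force, not as a guarantee that the data is consistent.

```shell
#!/bin/sh
# Hypothetical helpers (not part of mdadm). Both read `mdadm --examine`
# output on stdin.

# Print "<device> <event count>" for every member in the input.
list_events() {
    awk '
        /^\/dev\// { dev = $1; sub(/:$/, "", dev) }   # e.g. "/dev/sdb1:"
        $1 == "Events" { print dev, $3 }              # e.g. "Events : 1026050"
    '
}

# Print the devices whose event count is within max_lag of the highest one.
# max_lag is an illustrative threshold chosen by the caller.
fresh_members() {
    max_lag=$1
    list_events | awk -v lag="$max_lag" '
        { dev[NR] = $1; ev[NR] = $2 + 0; if (ev[NR] > max) max = ev[NR] }
        END { for (i = 1; i <= NR; i++) if (max - ev[i] <= lag) print dev[i] }
    '
}

# Typical use, as root:
#   mdadm --examine /dev/sd[bcde]1 | fresh_members 100
```

With the numbers from the question, sdb1, sdc1 and sde1 are within 11 events of each other, while sdd1 is roughly a million events behind, which is why it is left out of the forced assembly.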