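I have a 5-drive software RAID6 set up with mdadm (2 parity drives), and one of the drives failed. I ordered a replacement, and when I powered the machine down to swap the failed drive for the new one, another drive failed at the same time (completely dead). So now there are 3 old drives holding data, 1 new drive being rebuilt, and 1 drive missing.
Then I noticed the rebuild was extremely slow, moving data at only about 100 KB/s. Previous rebuilds ran at around 100 MB/s! I decided to buy a Synology unit with new drives and copy off as much data as I could. It has been running for 2 months and I have managed to copy several TB, but several TB remain, and at this rate it will take another 6 months to finish.
The data arriving on the new NAS (Synology) is fine, with no data loss so far! I'm hoping there is something I can do to make it go faster. The error log shows it failing on one particular drive (sdd), but maybe there is a setting that would tell it to "fail faster" so that the copy speeds up, since the drive hasn't actually failed outright? The logs follow: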
cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 sdf1[5] sdb1[0] sdc1[1] sdd1[2]
17581168128 blocks super 1.2 level 6, 512k chunk, algorithm 2 [5/3] [UUU__]
[>....................] recovery = 0.4% (24696932/5860389376) finish=584364702.6min speed=0K/sec
unused devices: <none>
Tail of /var/log/messages
Dec 16 11:29:47 [localhost] kernel: ata4.00: status: { DRDY ERR }
Dec 16 11:29:47 [localhost] kernel: ata4.00: error: { UNC }
Dec 16 11:29:47 [localhost] kernel: ata4: hard resetting link
Dec 16 11:29:47 [localhost] kernel: ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Dec 16 11:29:47 [localhost] kernel: ata4.00: configured for UDMA/133
Dec 16 11:29:47 [localhost] kernel: sd 3:0:0:0: [sdd] tag#24 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Dec 16 11:29:47 [localhost] kernel: sd 3:0:0:0: [sdd] tag#24 Sense Key : Medium Error [current] [descriptor]
Dec 16 11:29:47 [localhost] kernel: sd 3:0:0:0: [sdd] tag#24 Add. Sense: Unrecovered read error - auto reallocate failed
Dec 16 11:29:47 [localhost] kernel: sd 3:0:0:0: [sdd] tag#24 CDB: Read(16) 88 00 00 00 00 00 02 87 64 50 00 00 00 40 00 00
Dec 16 11:29:47 [localhost] kernel: blk_update_request: I/O error, dev sdd, sector 42427472
Dec 16 11:29:47 [localhost] kernel: raid5_end_read_request: 5 callbacks suppressed
Dec 16 11:29:47 [localhost] kernel: md/raid:md0: read error not correctable (sector 42425424 on sdd1).
Dec 16 11:29:47 [localhost] kernel: md/raid:md0: read error not correctable (sector 42425432 on sdd1).
Dec 16 11:29:47 [localhost] kernel: md/raid:md0: read error not correctable (sector 42425440 on sdd1).
Dec 16 11:29:47 [localhost] kernel: md/raid:md0: read error not correctable (sector 42425448 on sdd1).
Dec 16 11:29:47 [localhost] kernel: md/raid:md0: read error not correctable (sector 42425456 on sdd1).
Dec 16 11:29:47 [localhost] kernel: md/raid:md0: read error not correctable (sector 42425464 on sdd1).
Dec 16 11:29:47 [localhost] kernel: md/raid:md0: read error not correctable (sector 42425472 on sdd1).
Dec 16 11:29:47 [localhost] kernel: md/raid:md0: read error not correctable (sector 42425480 on sdd1).
Dec 16 11:29:47 [localhost] kernel: ata4: EH complete
Dec 16 11:29:51 [localhost] kernel: ata4.00: exception Emask 0x0 SAct 0x10000000 SErr 0x0 action 0x0
Dec 16 11:29:51 [localhost] kernel: ata4.00: irq_stat 0x40000008
Dec 16 11:29:51 [localhost] kernel: ata4.00: failed command: READ FPDMA QUEUED
Dec 16 11:29:51 [localhost] kernel: ata4.00: cmd 60/38:e0:30:b8:f5/00:00:02:00:00/40 tag 28 ncq 28672 in#012 res 41/40:00:30:b8:f5/00:00:02:00:00/00 Emask 0x409 (media error) <F>
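Answer 1
There are several posts similar to this one on Super User, but none of them were answered. What you should do is use ddrescue to fix the volume first; after that, rsync should work fine.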
https://raid.wiki.kernel.org/index.php/Recovering_a_failed_software_RAID
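As a rough illustration of the ddrescue step, the sketch below clones the failing member to a fresh disk in two passes. It assumes GNU ddrescue is installed. `/dev/sdd` and `/dev/md0` come from the output in the question, while `/dev/sdX` and `/root/sdd.map` are hypothetical placeholders that must be replaced. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: clone the failing RAID member to a new disk with GNU ddrescue.
# DST and MAP are hypothetical placeholders, not values from the question.
SRC=${SRC-/dev/sdd}        # failing member, as named in the kernel log
DST=${DST-/dev/sdX}        # assumed new disk, at least as large as SRC
MAP=${MAP-/root/sdd.map}   # mapfile so an interrupted copy can be resumed
RUN=${RUN-echo}            # dry run by default; export RUN= to really execute

rescue_plan() {
    # Stop the array so md no longer hammers the failing disk.
    $RUN mdadm --stop /dev/md0

    # Pass 1: grab everything that reads cleanly, skipping the scraping phase.
    $RUN ddrescue -f -n "$SRC" "$DST" "$MAP"

    # Pass 2: revisit the bad areas with direct disc access and 3 retry passes.
    $RUN ddrescue -f -d -r3 "$SRC" "$DST" "$MAP"
}

rescue_plan
```

Once the clone finishes, the usual next step is to reassemble the array with the clone standing in for sdd; the linked wiki page describes that procedure. Check the device names carefully before running with `RUN=` set, because `ddrescue -f` overwrites the destination.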