I am trying to get a RAID6 array running, but it fails to start.
A brief history of the array: it was originally built from 6 disks (8 TB each):
mdadm --create --verbose /dev/md1 --level=6 --raid-devices=6 /dev/sdb1 /dev/sde1 /dev/sdg1 /dev/sdh1 /dev/sdi1 /dev/sdj1
Added 1 disk to grow the array:
mdadm -v --grow --raid-devices=7 /dev/md1
Then resized the partition in GParted.
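Roughly, the full sequence for each grow looked like the sketch below. The /dev/sdX1 name is a placeholder for the newly added partition, and the resize2fs step assumes an ext4 filesystem directly on /dev/md1 (I did the equivalent resize in GParted):
# add the new partition as a spare, then grow the array onto it
mdadm --add /dev/md1 /dev/sdX1
mdadm -v --grow --raid-devices=7 /dev/md1
# watch the reshape progress
cat /proc/mdstat
# once the reshape has finished, grow the filesystem to match
resize2fs /dev/md1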
Added another 2 disks to grow the array, but the partition has not been resized again yet. The array used to start automatically at boot, but now it fails to start:
mdadm: failed to start array /dev/md1: Input/output error
Here is some other output that may be relevant:
s:~$ mdadm --detail /dev/md1
/dev/md1:
Version : 1.2
Creation Time : Wed Aug 25 16:25:06 2021
Raid Level : raid6
Used Dev Size : 18446744073709551615
Raid Devices : 9
Total Devices : 8
Persistence : Superblock is persistent
Update Time : Wed Oct 6 16:45:06 2021
State : active, FAILED, Not Started
Active Devices : 8
Working Devices : 8
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Consistency Policy : unknown
Name : Octavius:1 (local to host Octavius)
UUID : 80bd1af7:20800c35:be64a577:8b62e937
Events : 198308
Number Major Minor RaidDevice State
- 0 0 0 removed
- 0 0 1 removed
- 0 0 2 removed
- 0 0 3 removed
- 0 0 4 removed
- 0 0 5 removed
- 0 0 6 removed
- 0 0 7 removed
- 0 0 8 removed
- 8 177 5 sync /dev/sdl1
- 8 161 4 sync /dev/sdk1
- 8 145 3 sync /dev/sdj1
- 8 129 2 sync /dev/sdi1
- 8 97 1 sync /dev/sdg1
- 8 49 7 sync /dev/sdd1
- 8 33 0 sync /dev/sdc1
- 8 17 6 sync /dev/sdb1
/dev/sda1 should be a member of this array but has gone missing. I don't know why all of those 'removed' entries are showing up.
s:~$ sudo mdadm --examine /dev/sda1
/dev/sda1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 80bd1af7:20800c35:be64a577:8b62e937
Name : Octavius:1 (local to host Octavius)
Creation Time : Wed Aug 25 16:25:06 2021
Raid Level : raid6
Raid Devices : 9
Avail Dev Size : 15627798528 (7451.92 GiB 8001.43 GB)
Array Size : 54697251840 (52163.36 GiB 56009.99 GB)
Used Dev Size : 15627786240 (7451.91 GiB 8001.43 GB)
Data Offset : 251904 sectors
Super Offset : 8 sectors
Unused Space : before=251824 sectors, after=12288 sectors
State : active
Device UUID : 9bddd5dd:790156b1:7b8e38d3:37558974
Internal Bitmap : 8 sectors from superblock
Update Time : Wed Oct 6 16:45:06 2021
Bad Block Log : 512 entries available at offset 40 sectors
Checksum : b23ecdf9 - correct
Events : 0
Layout : left-symmetric
Chunk Size : 512K
Device Role : spare
Array State : AAAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
s:~$ sudo mdadm --examine /dev/sdb1
/dev/sdb1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 80bd1af7:20800c35:be64a577:8b62e937
Name : Octavius:1 (local to host Octavius)
Creation Time : Wed Aug 25 16:25:06 2021
Raid Level : raid6
Raid Devices : 9
Avail Dev Size : 15627798528 (7451.92 GiB 8001.43 GB)
Array Size : 54697251840 (52163.36 GiB 56009.99 GB)
Used Dev Size : 15627786240 (7451.91 GiB 8001.43 GB)
Data Offset : 251904 sectors
Super Offset : 8 sectors
Unused Space : before=251824 sectors, after=12288 sectors
State : active
Device UUID : ffa868e4:ee48f113:bd015c5c:7f92f378
Internal Bitmap : 8 sectors from superblock
Update Time : Wed Oct 6 16:45:06 2021
Bad Block Log : 512 entries available at offset 40 sectors
Checksum : a8fd97fc - correct
Events : 198308
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 6
Array State : AAAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
The output for every other device in the array is the same as for /dev/sdb1.
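For reference, a quick way to compare the superblocks across all of the members is a loop like the one below; the device list is just the partitions that belong to the array on this machine:
for d in /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sdg1 /dev/sdi1 /dev/sdj1 /dev/sdk1 /dev/sdl1; do
    echo "== $d"
    sudo mdadm --examine "$d" | grep -E 'Events|Device Role|Array State'
done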
Any help or suggestions would be great, and I can provide any other output that might be useful.
Answer 1
So I think I figured out what the problem was, although I am not sure how it happened.
I had missed the fact that all of the drives were marked as spares:
:~$ sudo mdadm --stop /dev/md1
mdadm: stopped /dev/md1
:~$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : inactive sdi1[2](S) sda1[9](S) sdc1[0](S) sdl1[5](S) sdb1[6](S) sdk1[4](S) sdj1[3](S) sdg1[1](S) sdd1[8](S)
70325093376 blocks super 1.2
So I decided to bite the bullet (accepting a small risk of losing a small amount of data that is not backed up), stop the array, and force an assembly. The array is now rebuilding and the data appears to be fine.
:~$ sudo mdadm --assemble --force /dev/md1 /dev/sda1 /dev/sdi1 /dev/sdc1 /dev/sdl1 /dev/sdb1 /dev/sdk1 /dev/sdj1 /dev/sdg1 /dev/sdd1 --verbose
mdadm: looking for devices for /dev/md1
mdadm: /dev/sda1 is identified as a member of /dev/md1, slot -1.
mdadm: /dev/sdi1 is identified as a member of /dev/md1, slot 2.
mdadm: /dev/sdc1 is identified as a member of /dev/md1, slot 0.
mdadm: /dev/sdl1 is identified as a member of /dev/md1, slot 5.
mdadm: /dev/sdb1 is identified as a member of /dev/md1, slot 6.
mdadm: /dev/sdk1 is identified as a member of /dev/md1, slot 4.
mdadm: /dev/sdj1 is identified as a member of /dev/md1, slot 3.
mdadm: /dev/sdg1 is identified as a member of /dev/md1, slot 1.
mdadm: /dev/sdd1 is identified as a member of /dev/md1, slot 7.
mdadm: Marking array /dev/md1 as 'clean'
mdadm: added /dev/sdg1 to /dev/md1 as 1
mdadm: added /dev/sdi1 to /dev/md1 as 2
mdadm: added /dev/sdj1 to /dev/md1 as 3
mdadm: added /dev/sdk1 to /dev/md1 as 4
mdadm: added /dev/sdl1 to /dev/md1 as 5
mdadm: added /dev/sdb1 to /dev/md1 as 6
mdadm: added /dev/sdd1 to /dev/md1 as 7
mdadm: no uptodate device for slot 8 of /dev/md1
mdadm: added /dev/sda1 to /dev/md1 as -1
mdadm: added /dev/sdc1 to /dev/md1 as 0
mdadm: /dev/md1 has been started with 8 drives (out of 9) and 1 spare.
:~$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid6 sdc1[0] sda1[9] sdd1[8] sdb1[6] sdl1[5] sdk1[4] sdj1[3] sdi1[2] sdg1[1]
54697251840 blocks super 1.2 level 6, 512k chunk, algorithm 2 [9/8] [UUUUUUUU_]
[>....................] recovery = 0.0% (1019952/7813893120) finish=1276.6min speed=101997K/sec
bitmap: 10/59 pages [40KB], 65536KB chunk
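Since the array used to assemble automatically at boot, I also intend to refresh the mdadm configuration once the rebuild finishes. On Debian/Ubuntu this is typically something like the following (the config and initramfs paths assume the stock packaging):
# append the current array definition to the config and rebuild the initramfs
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
sudo update-initramfs -u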
I had seen this suggested as a fix for other, similar problems, but my array's state (active, FAILED, Not Started) differed from every example I could find, and I was initially uncomfortable with the --force option.
Hopefully this helps someone in the future...