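We have a RAID 10 in which two drives have failed, but one drive from each set is still operational.
When booting into a rescue system, the metadata looks fine and is consistent with the expected state.
The md metadata reported by mdadm --detail is as follows: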
Version : 1.1
Creation Time : Mon Mar 16 15:53:57 2015
Raid Level : raid10
Used Dev Size : 975581184 (930.39 GiB 999.00 GB)
Raid Devices : 4
Total Devices : 2
Persistence : Superblock is persistent
Update Time : Mon May 28 08:52:58 2018
State : active, FAILED, Not Started
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Layout : near=2
Chunk Size : 512K
Name : 2
UUID : 34f4a5fa:4b8e03fa:3119b353:f45188a0
Events : 8618632
    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync set-A   /dev/sda1
       1       8       17        1      active sync set-B   /dev/sdb1
       4       0        0        4      removed
       6       0        0        6      removed
The init system is unable to assemble the RAID; the kernel claims that there are not enough mirrors.
(...)
md/raid10:md2: not enough operational mirrors.
md: pers->run() failed ...
dracut: mdadm: failed to start array /dev/md2: Input/output error
(...)
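One way to read the kernel message is to redo the check by hand. The sketch below is only an illustration, not the kernel code: it assumes that with the near=2 layout the two copies of every chunk sit on adjacent RaidDevice slots (0/1 and 2/3 for four devices), which is how the md(4) man page describes the near layout. The function name enough_mirrors is made up for this post.

```python
def enough_mirrors(present, raid_devices=4, copies=2):
    """Return True if every group of `copies` adjacent slots has a survivor.

    `present` is the set of RaidDevice slot numbers that are still active.
    Assumption: near layout, copies of a chunk live on consecutive slots.
    """
    first = 0
    while True:
        survivors = 0
        # Walk one group of `copies` consecutive slots, wrapping around.
        for _ in range(copies):
            if first in present:
                survivors += 1
            first = (first + 1) % raid_devices
        if survivors == 0:
            return False  # a whole group of copies is gone
        if first == 0:
            return True   # wrapped around: every group had a survivor


# Slots taken from the --detail table: only 0 and 1 are active.
print(enough_mirrors({0, 1}))
# For comparison, one survivor in each adjacent pair.
print(enough_mirrors({0, 2}))
```

With only slots 0 and 1 present, as in the --detail table above, the check fails under that assumption, because slots 2 and 3 would hold the only copies of the other half of the chunks. As far as I understand, set-A and set-B each name one full copy of the data (slots 0,2 and 1,3) rather than a mirror pair, so one surviving drive per set is not the same as one surviving drive per pair.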
Trying to assemble the array manually (mdadm --assemble --readonly --force /dev/md2 /dev/sd[ab]1) yields the following:
/dev/md2:
Version : 1.1
Raid Level : raid0
Total Devices : 1
Persistence : Superblock is persistent
State : inactive
Name : 2
UUID : 34f4a5fa:4b8e03fa:3119b353:f45188a0
Events : 8618632
    Number   Major   Minor   RaidDevice
       -       8        1        -        /dev/sda1
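Examining the metadata of the participating drives with --examine gives output consistent with the expected state (both before and after the manual assembly):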
/dev/sda1:
Magic : a92b4efc
Version : 1.1
Feature Map : 0x1
Array UUID : 34f4a5fa:4b8e03fa:3119b353:f45188a0
Name : 2
Creation Time : Mon Mar 16 15:53:57 2015
Raid Level : raid10
Raid Devices : 4
Avail Dev Size : 1951162368 (930.39 GiB 999.00 GB)
Array Size : 1951162368 (1860.77 GiB 1997.99 GB)
Data Offset : 262144 sectors
Super Offset : 0 sectors
Unused Space : before=262064 sectors, after=0 sectors
State : clean
Device UUID : 89288c87:2cf8f6cd:483328b4:fffb3db6
Internal Bitmap : 8 sectors from superblock
Update Time : Mon May 28 08:52:58 2018
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : eaf59503 - correct
Events : 8618632
Layout : near=2
Chunk Size : 512K
Device Role : Active device 0
Array State : AAA. ('A' == active, '.' == missing, 'R' == replacing)
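To make the mismatch concrete, the Array State string from the superblock can be compared with the slots that --detail lists as active. This is just a small sketch; the helper name active_slots is hypothetical, and the two inputs are copied by hand from the output in this post.

```python
def active_slots(array_state):
    """Map an Array State string such as 'AAA.' to the set of active slots."""
    return {slot for slot, flag in enumerate(array_state) if flag == "A"}


# What the superblock on /dev/sda1 records ('A' == active, '.' == missing).
superblock_view = active_slots("AAA.")

# What the --detail table shows as "active sync" (RaidDevice 0 and 1).
assembled_view = {0, 1}

# Slots the superblock still counts as active but that are not present now.
print(sorted(superblock_view - assembled_view))
```

The difference is slot 2, i.e. the superblock still counts a third device as active although only two devices are available.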
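We are aware that the third active drive has been removed, but this should not be the root of the problem.
So our two main questions are:
- Why is the state of the array inconsistent with that of the individual drives?
- How can this be solved?
For the record: CentOS 6 with kernel version 2.6.32-696.30.1.el6.x86_64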