My server runs a 4-disk software RAID 10 managed by mdadm.
Earlier today a brief power outage shut the server down, and now it only boots into the initrd emergency shell (Ubuntu 16.04), complaining that two of the four disks are "possibly out of date". The event counts differ slightly between the disks: two are at 6531 events and two at 6527.
I tried to reassemble it with force:
mdadm --assemble /dev/md0 /dev/sdb /dev/sdc /dev/sdd /dev/sde --force -v
and without --force:
mdadm --assemble /dev/md0 /dev/sdb /dev/sdc /dev/sdd /dev/sde -v
Either way it keeps complaining that the drives are out of date.
How can I make mdadm ignore the small difference in event counts?
Here is the output:
root@ubuntu:/home/ubuntu# mdadm --examine /dev/sdc
/dev/sdc:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 8397999f:cd6b31f4:64d31961:759bded9
Name : debian:0
Creation Time : Thu May 4 19:39:57 2017
Raid Level : raid10
Raid Devices : 4
Avail Dev Size : 624880304 (297.97 GiB 319.94 GB)
Array Size : 585850880 (558.71 GiB 599.91 GB)
Used Dev Size : 585850880 (279.36 GiB 299.96 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262056 sectors, after=39029424 sectors
State : active
Device UUID : 0774b3a6:acc10734:1e7c6f76:98b36729
Internal Bitmap : 8 sectors from superblock
Update Time : Fri Aug 11 13:06:38 2017
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : 86b594a0 - correct
Events : 6531
Layout : offset=2
Chunk Size : 512K
Device Role : Active device 0
Array State : AA.A ('A' == active, '.' == missing, 'R' == replacing)
root@ubuntu:/home/ubuntu# mdadm --examine /dev/sdd
/dev/sdd:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 8397999f:cd6b31f4:64d31961:759bded9
Name : debian:0
Creation Time : Thu May 4 19:39:57 2017
Raid Level : raid10
Raid Devices : 4
Avail Dev Size : 976511024 (465.64 GiB 499.97 GB)
Array Size : 585850880 (558.71 GiB 599.91 GB)
Used Dev Size : 585850880 (279.36 GiB 299.96 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262056 sectors, after=390660144 sectors
State : active
Device UUID : 6e83f7bf:a19005dc:d714aa81:dc11bd5f
Internal Bitmap : 8 sectors from superblock
Update Time : Fri Aug 11 13:06:38 2017
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : 9e7cdf7b - correct
Events : 6531
Layout : offset=2
Chunk Size : 512K
Device Role : Active device 1
Array State : AA.A ('A' == active, '.' == missing, 'R' == replacing)
root@ubuntu:/home/ubuntu# mdadm --examine /dev/sde
/dev/sde:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 8397999f:cd6b31f4:64d31961:759bded9
Name : debian:0
Creation Time : Thu May 4 19:39:57 2017
Raid Level : raid10
Raid Devices : 4
Avail Dev Size : 624880304 (297.97 GiB 319.94 GB)
Array Size : 585850880 (558.71 GiB 599.91 GB)
Used Dev Size : 585850880 (279.36 GiB 299.96 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262056 sectors, after=39029424 sectors
State : active
Device UUID : f4231c7d:5f51fc96:648cb20e:a07f7845
Internal Bitmap : 8 sectors from superblock
Update Time : Fri Aug 11 13:04:25 2017
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : 7465b00b - correct
Events : 6527
Layout : offset=2
Chunk Size : 512K
Device Role : Active device 2
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
root@ubuntu:/home/ubuntu# mdadm --examine /dev/sdf
/dev/sdf:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 8397999f:cd6b31f4:64d31961:759bded9
Name : debian:0
Creation Time : Thu May 4 19:39:57 2017
Raid Level : raid10
Raid Devices : 4
Avail Dev Size : 585852560 (279.36 GiB 299.96 GB)
Array Size : 585850880 (558.71 GiB 599.91 GB)
Used Dev Size : 585850880 (279.36 GiB 299.96 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262056 sectors, after=1680 sectors
State : active
Device UUID : d2440649:3f578698:cf4d7abe:306e7fa3
Internal Bitmap : 8 sectors from superblock
Update Time : Fri Aug 11 13:04:25 2017
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : 4d5502a6 - correct
Events : 6527
Layout : offset=2
Chunk Size : 512K
Device Role : Active device 3
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
root@ubuntu:/home/ubuntu# mdadm --assemble /dev/md0 /dev/sdc /dev/sdd /dev/sde /dev/sdf -v --force
mdadm: looking for devices for /dev/md0
mdadm: /dev/sdc is identified as a member of /dev/md0, slot 0.
mdadm: /dev/sdd is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sde is identified as a member of /dev/md0, slot 2.
mdadm: /dev/sdf is identified as a member of /dev/md0, slot 3.
mdadm: added /dev/sdd to /dev/md0 as 1
mdadm: added /dev/sde to /dev/md0 as 2 (possibly out of date)
mdadm: added /dev/sdf to /dev/md0 as 3 (possibly out of date)
mdadm: added /dev/sdc to /dev/md0 as 0
mdadm: /dev/md0 assembled from 2 drives - not enough to start the array.
root@ubuntu:/home/ubuntu# mdadm --assemble /dev/md0 /dev/sdc /dev/sdd /dev/sde /dev/sdf -v --force --run
mdadm: looking for devices for /dev/md0
mdadm: /dev/sdc is identified as a member of /dev/md0, slot 0.
mdadm: /dev/sdd is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sde is identified as a member of /dev/md0, slot 2.
mdadm: /dev/sdf is identified as a member of /dev/md0, slot 3.
mdadm: added /dev/sdd to /dev/md0 as 1
mdadm: added /dev/sde to /dev/md0 as 2 (possibly out of date)
mdadm: added /dev/sdf to /dev/md0 as 3 (possibly out of date)
mdadm: added /dev/sdc to /dev/md0 as 0
mdadm: failed to RUN_ARRAY /dev/md0: Input/output error
mdadm: Not enough devices to start the array.
Answer 1
I ran into the same problem with Ubuntu 16.04 and RAID 5. Even though all the metadata looked fine and the event counts differed only slightly, mdadm --force would not start the RAID.
I downloaded the mdadm-4.0 source, ran make, and ran the same command as ./mdadm from that directory; the array started immediately.
It appears that the mdadm shipped with Ubuntu 16.04 effectively ignores --force and refuses to start the array over even the smallest discrepancy.
So with the stock mdadm on 16.04:
mdadm: /dev/sda3 is identified as a member of /dev/md1, slot 0.
mdadm: /dev/sdb3 is identified as a member of /dev/md1, slot 1.
mdadm: /dev/sdd3 is identified as a member of /dev/md1, slot 3.
mdadm: added /dev/sdb3 to /dev/md1 as 1
mdadm: no uptodate device for slot 4 of /dev/md1
mdadm: added /dev/sdd3 to /dev/md1 as 3 (possibly out of date)
mdadm: added /dev/sda3 to /dev/md1 as 0
mdadm: /dev/md1 assembled from 2 drives - not enough to start the array.
With mdadm-4.0 compiled from source:
mdadm: looking for devices for /dev/md1
mdadm: /dev/sda3 is identified as a member of /dev/md1, slot 0.
mdadm: /dev/sdb3 is identified as a member of /dev/md1, slot 1.
mdadm: /dev/sdd3 is identified as a member of /dev/md1, slot 3.
mdadm: forcing event count in /dev/sdd3(3) from 15676 upto 15681
mdadm: clearing FAULTY flag for device 2 in /dev/md1 for /dev/sdd3
mdadm: Marking array /dev/md1 as 'clean'
mdadm: added /dev/sdb3 to /dev/md1 as 1
mdadm: no uptodate device for slot 2 of /dev/md1
mdadm: added /dev/sdd3 to /dev/md1 as 3
mdadm: added /dev/sda3 to /dev/md1 as 0
mdadm: /dev/md1 has been started with 3 drives (out of 4).
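The build itself is only a few commands. A rough sketch of what I did (the download URL and tarball name are from the time of writing; check kernel.org for the current release, and substitute your own array and member devices):

```shell
# Fetch and build mdadm 4.0 from source; newer releases live at
# https://www.kernel.org/pub/linux/utils/raid/mdadm/
wget https://www.kernel.org/pub/linux/utils/raid/mdadm/mdadm-4.0.tar.xz
tar xf mdadm-4.0.tar.xz
cd mdadm-4.0
make

# Run the freshly built binary (./mdadm), not the distro's /sbin/mdadm:
./mdadm --assemble /dev/md1 /dev/sda3 /dev/sdb3 /dev/sdd3 -v --force
```

Note the leading `./` on the last command: without it the shell would pick up the distro binary from $PATH and you would be back to the original behavior.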
Answer 2
Note that --force is ignored when a RAID 4/5/6 volume is both dirty and degraded. In that case running with --force behaves exactly the same as running without it, with no additional warning or error. In my case, mdadm --assemble ... gave me:
failed to start array /dev/md/0: input/output error
with
md/raid:md0: Cannot start dirty degraded array
in the dmesg output.
I knew for a fact that the array was intact, since nothing had actually happened to the data; I had only messed up the descriptors. So what I needed was a way to truly force mdadm to start it, which it kept refusing to do.
The solution was to set the md_mod.start_dirty_degraded kernel parameter to 1:
https://man7.org/linux/man-pages/man4/md.4.html (search for md_mod.start_dirty_degraded near the end of the page).
FYI, here is a fairly comprehensive guide to setting kernel parameters on different systems: https://wiki.archlinux.org/title/kernel_parameters
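As a concrete sketch of the two usual ways to set it (the GRUB file path assumes a Debian/Ubuntu-style system; adjust for your bootloader):

```shell
# One-off, at runtime, without rebooting (requires root and assumes the
# md_mod module is already loaded; affects arrays started after this point):
echo 1 > /sys/module/md_mod/parameters/start_dirty_degraded

# Or persistently, by appending it to the kernel command line in
# /etc/default/grub:
#   GRUB_CMDLINE_LINUX_DEFAULT="... md_mod.start_dirty_degraded=1"
# then regenerate the GRUB config and reboot:
#   update-grub && reboot
```

With the parameter set, the kernel will start a dirty degraded array instead of failing with "Cannot start dirty degraded array"; be aware that a subsequent resync is still needed to restore redundancy.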