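Recently I ran into trouble with a RAID 6 array managed by mdadm. There is no LVM in between. Two drives failed and were pulled while the machine was powered off. Now the system boots, but I cannot mount the array.
mdstat reports the array as active: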
root@NFS:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md127 : active raid6 sdc[5] sdb[3] sde[1] sdd[0]
7813531648 blocks super 1.2 level 6, 512k chunk, algorithm 2 [6/4] [UU_U_U]
But when I try to mount it...
root@NFS:~# sudo mount -t ext4 /dev/md127 /media/NAS
mount: wrong fs type, bad option, bad superblock on /dev/md127,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so
fsck fails the same way:
root@NFS:~# fsck.ext4 /dev/md127
e2fsck 1.42 (29-Nov-2011)
fsck.ext4: Superblock invalid, trying backup blocks...
fsck.ext4: Bad magic number in super-block while trying to open /dev/md127
The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>
So I listed the backup superblocks:
root@NFS:~$ fdisk -l
Disk /dev/sda: 320.1 GB, 320072933376 bytes
255 heads, 63 sectors/track, 38913 cylinders, total 625142448 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000589e8
Device Boot Start End Blocks Id System
/dev/sda1 * 2048 608382975 304190464 83 Linux
/dev/sda2 608385022 625141759 8378369 5 Extended
/dev/sda5 608385024 625141759 8378368 82 Linux swap / Solaris
Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/sdb doesn't contain a valid partition table
Disk /dev/sdc: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/sdc doesn't contain a valid partition table
Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/sdd doesn't contain a valid partition table
Disk /dev/sde: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/sde doesn't contain a valid partition table
Disk /dev/md127: 8001.1 GB, 8001056407552 bytes
2 heads, 4 sectors/track, 1953382912 cylinders, total 15627063296 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 524288 bytes / 2097152 bytes
Disk identifier: 0x6f6d8d7f
Disk /dev/md127 doesn't contain a valid partition table
jippen@NFS:~$ sudo mke2fs -n /dev/md127
mke2fs 1.42 (29-Nov-2011)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=128 blocks, Stripe width=512 blocks
244174848 inodes, 1953382912 blocks
97669145 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
59613 block groups
32768 blocks per group, 32768 fragments per group
4096 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
102400000, 214990848, 512000000, 550731776, 644972544, 1934917632
And tried a few of them...
root@NFS:~# e2fsck -b 7962624 /dev/md127
e2fsck 1.42 (29-Nov-2011)
e2fsck: Bad magic number in super-block while trying to open /dev/md127
The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>
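For reference, the backup locations printed by mke2fs -n above can be reproduced by hand. The sketch below assumes the sparse_super layout (backups in block group 1 and in groups that are powers of 3, 5 and 7) and a first data block of 0, as shown in the mke2fs output. The function name is made up for illustration.

```python
def backup_superblock_blocks(block_groups, blocks_per_group):
    """List the block numbers that hold backup superblocks.

    Assumes sparse_super: backups live in group 1 and in groups whose
    number is a power of 3, 5 or 7. Group 0 holds the primary copy.
    """
    groups = set()
    if block_groups > 1:
        groups.add(1)
    for base in (3, 5, 7):
        power = base
        while power < block_groups:
            groups.add(power)
            power *= base
    # First data block is 0 here, so group g starts at g * blocks_per_group.
    return [g * blocks_per_group for g in sorted(groups)]


if __name__ == "__main__":
    # Values taken from the mke2fs -n output: 59613 groups of 32768 blocks.
    for block in backup_superblock_blocks(59613, 32768):
        # Byte offset on /dev/md127, given the 4096-byte block size.
        print(block, block * 4096)
```

Each printed block number is a valid argument for e2fsck -b; the second column is the byte offset where that copy would start on the array device.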
Is there any hope, or is the data gone for good?
Edit 2014/1/12
Output of fdisk -l:
root@NFS:~# fdisk -l /dev/md127
Disk /dev/md127: 8001.1 GB, 8001056407552 bytes
2 heads, 4 sectors/track, 1953382912 cylinders, total 15627063296 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 524288 bytes / 2097152 bytes
Disk identifier: 0x6f6d8d7f
Disk /dev/md127 doesn't contain a valid partition table
Edit: added a hex dump
root@NFS:~# hexdump -C -n 512 /dev/md127
00000000 45 eb 13 54 f6 24 03 0a 17 ad 30 a9 50 4f bd 48 |E..T.$....0.PO.H|
00000010 8c 7f 20 7f f6 5a 63 0e be bd 7f 4a c2 db ff 00 |.. ..Zc....J....|
00000020 5d c7 dc 8c f2 3f 11 fd 6a 32 00 27 03 3d 41 a7 |]....?..j2.'.=A.|
00000030 b0 c1 cf bf 3f 98 a6 b0 ce 73 df ff 00 ad 5a a5 |....?....s....Z.|
00000040 a0 92 b5 bd 0a e7 04 91 f5 c7 eb 50 ba 16 07 d0 |...........P....|
00000050 f1 fc 85 4c dc 92 07 4e 29 84 85 00 0e c7 9a b5 |...L...N).......|
00000060 b2 b7 64 53 db ee 29 95 c0 20 72 57 ad 64 5f c5 |..dS..).. rW.d_.|
00000070 87 0d ea 09 35 b7 26 77 67 eb fd 2a 9d cc 21 a3 |....5.&wg..*..!.|
00000080 3f 46 cd 6f 4e 5e f1 cd 3d 62 61 51 4e 71 b5 98 |?F.oN^..=baQNq..|
00000090 7a 1a 6d 76 2d 8e 20 a2 8a 28 00 a2 8a 28 00 a2 |z.mv-. ..(...(..|
000000a0 8a 28 00 a2 8a 28 00 a2 8a 28 02 48 f8 39 a7 cd |.(...(...(.H.9..|
000000b0 39 70 06 6a 0a 7c 6b b9 87 a6 69 03 2d 5a db 82 |9p.j.|k...i.-Z..|
000000c0 0b 9e c3 8a ba 67 54 8c 67 8c 1c 0f 5a a4 f7 3e |.....gT.g...Z..>|
000000d0 50 0a 3a 81 8a a8 ee 5c 92 7b d4 38 dd fc c6 4f |P.:....\.{.8...O|
000000e0 71 74 66 24 0e 9c ff 00 3a ad 4a 14 92 2a ec 36 |qtf$....:.J..*.6|
000000f0 d8 5d d8 eb ff 00 d6 aa b5 a2 fd 49 5d 08 22 b7 |.].........I].".|
00000100 32 30 07 8a b4 2d 82 67 d4 54 bb 42 64 83 ce 09 |20...-.g.T.Bd...|
00000110 fe 55 5a 4b 9d a7 03 f1 fa e0 d2 bf bd f2 2d f4 |.UZK..........-.|
00000120 3d 0c 60 b7 d7 a7 e9 4a 57 04 f7 a4 c6 40 f6 c7 |=.`....JW....@..|
00000130 f4 a7 85 d8 07 35 e4 4b 49 47 cd 33 d1 19 8c 90 |.....5.KIG.3....|
00000140 69 a5 70 4e 2a 43 fa d3 86 39 07 be 7f ad 5a 7a |i.pN*C...9....Zz|
00000150 3f eb b1 16 d1 15 66 b5 8a f2 36 46 fb de bf 4a |?.....f...6F...J|
00000160 e5 af f4 b3 04 8d 8e 47 51 f8 e6 ba d2 9f 78 0f |.......GQ.....x.|
00000170 4a ce b9 8f e6 04 8f af d3 9a d2 9b b5 fd 56 84 |J.............V.|
00000180 b8 9c 84 d1 15 04 1f f3 d6 a9 95 c1 c5 76 97 9a |.............v..|
00000190 72 5d c1 b9 38 65 52 7f 9d 72 13 c6 51 d8 1e a0 |r]..8eR..r..Q...|
000001a0 9f d2 bb e9 ce e9 fa 9c f5 23 aa 22 03 6d 38 70 |.........#.".m8p|
000001b0 0e 69 a7 b6 69 73 91 f8 7f 8d 6d 6f 79 7a 18 8b |.i..is....moyz..|
000001c0 d2 99 9e b4 e3 ce 73 4d 3d 6a 7a 07 51 41 e2 9d |......sM=jz.QA..|
000001d0 91 9e 7d 29 99 e2 94 f4 14 2d 9f a8 85 03 3f cb |..}).....-....?.|
000001e0 f9 52 13 82 3f 0f e9 4a 0f 14 df f0 a4 f7 19 20 |.R..?..J....... |
000001f0 e8 7e 94 11 c7 e2 0d 31 4e 0d 3f 39 07 9f f3 cd |.~.....1N.?9....|
00000200
Edit: more command output, as requested
mdadm --detail
root@NFS:~# mdadm --detail /dev/md127
/dev/md127:
Version : 1.2
Creation Time : Sat Jan 11 20:56:15 2014
Raid Level : raid6
Array Size : 7813531648 (7451.56 GiB 8001.06 GB)
Used Dev Size : 1953382912 (1862.89 GiB 2000.26 GB)
Raid Devices : 6
Total Devices : 4
Persistence : Superblock is persistent
Update Time : Sat Jan 11 22:01:47 2014
State : clean, degraded
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Name : NFS:127 (local to host NFS)
UUID : 2cde46cb:aa89730c:bb1f214a:cad5bc0a
Events : 2
Number Major Minor RaidDevice State
0 8 48 0 active sync /dev/sdd
1 8 64 1 active sync /dev/sde
2 0 0 2 removed
3 8 16 3 active sync /dev/sdb
4 0 0 4 removed
5 8 32 5 active sync /dev/sdc
And mdadm --examine:
root@NFS:~# mdadm --examine /dev/sd{b,c,d,e,f}
/dev/sdb:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 2cde46cb:aa89730c:bb1f214a:cad5bc0a
Name : NFS:127 (local to host NFS)
Creation Time : Sat Jan 11 20:56:15 2014
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 3906767024 (1862.89 GiB 2000.26 GB)
Array Size : 7813531648 (7451.56 GiB 8001.06 GB)
Used Dev Size : 3906765824 (1862.89 GiB 2000.26 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 565612d2:cad5edc9:0848ba81:e06f76a2
Update Time : Wed Jan 15 22:11:32 2014
Checksum : 47b3b275 - correct
Events : 4
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 3
Array State : AA.A.A ('A' == active, '.' == missing)
/dev/sdc:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 2cde46cb:aa89730c:bb1f214a:cad5bc0a
Name : NFS:127 (local to host NFS)
Creation Time : Sat Jan 11 20:56:15 2014
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 3906767024 (1862.89 GiB 2000.26 GB)
Array Size : 7813531648 (7451.56 GiB 8001.06 GB)
Used Dev Size : 3906765824 (1862.89 GiB 2000.26 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : a96bd195:c410fc2f:f0422486:6418d34c
Update Time : Wed Jan 15 22:11:32 2014
Checksum : 2047a62f - correct
Events : 4
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 5
Array State : AA.A.A ('A' == active, '.' == missing)
/dev/sdd:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 2cde46cb:aa89730c:bb1f214a:cad5bc0a
Name : NFS:127 (local to host NFS)
Creation Time : Sat Jan 11 20:56:15 2014
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 3906767024 (1862.89 GiB 2000.26 GB)
Array Size : 7813531648 (7451.56 GiB 8001.06 GB)
Used Dev Size : 3906765824 (1862.89 GiB 2000.26 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 27e6f0f2:96c41f41:e75d30cc:462cd9bf
Update Time : Wed Jan 15 22:11:32 2014
Checksum : 479d0354 - correct
Events : 4
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 0
Array State : AA.A.A ('A' == active, '.' == missing)
/dev/sde:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 2cde46cb:aa89730c:bb1f214a:cad5bc0a
Name : NFS:127 (local to host NFS)
Creation Time : Sat Jan 11 20:56:15 2014
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 3906767024 (1862.89 GiB 2000.26 GB)
Array Size : 7813531648 (7451.56 GiB 8001.06 GB)
Used Dev Size : 3906765824 (1862.89 GiB 2000.26 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : fc780ffd:9daf5c50:9a1e4af7:ff5061c7
Update Time : Wed Jan 15 22:11:32 2014
Checksum : 939a669d - correct
Events : 4
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 1
Array State : AA.A.A ('A' == active, '.' == missing)
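Answer 1
Regarding the hex dump: you show the first 512 bytes, but that is too little for an ext4 filesystem. If you look at the ext4 on-disk layout (https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout#Layout), you will see that the first 1024 bytes are left as padding so that a boot sector can be installed. The first real content starts at offset 1024, and it is the (first) superblock.
In my case, my ext4 filesystem starts with 1024 zero bytes, so I don't know why there would be binary data there. My guess is that it could be left over from a previous use of the disk (perhaps some initial benchmarking?).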
$ hexdump -C -n 2048 /dev/md3
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000400 00 00 40 00 ee fe ff 00 bf cc 0c 00 97 f4 51 00 |..@...........Q.|
00000410 df 29 3c 00 00 00 00 00 02 00 00 00 02 00 00 00 |.)<.............|
00000420 00 80 00 00 00 80 00 00 00 20 00 00 00 ec cf 52 |......... .....R|
00000430 00 ec cf 52 7c 00 ff ff 53 ef 01 00 01 00 00 00 |...R|...S.......|
00000440 3d d6 3d 50 00 00 00 00 00 00 00 00 01 00 00 00 |=.=P............|
00000450 00 00 00 00 0b 00 00 00 00 01 00 00 3c 00 00 00 |............<...|
00000460 46 02 00 00 7b 00 00 00 bd a1 2f a8 89 c2 46 16 |F...{...../...F.|
00000470 89 2b 9e 82 20 e4 ea 3d 70 6f 72 74 61 67 65 00 |.+.. ..=portage.|
00000480 00 00 00 00 00 00 00 00 2f 75 73 72 2f 70 6f 72 |......../usr/por|
00000490 74 61 67 65 2f 64 69 73 74 66 69 6c 65 73 00 00 |tage/distfiles..|
000004a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000004c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc 03 |................|
000004d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000004e0 08 00 00 00 00 00 00 00 00 00 00 00 0c 7d 05 95 |.............}..|
000004f0 55 9b 49 0c a4 8d 9f b6 9d 26 2f 2f 01 01 00 00 |U.I......&//....|
00000500 0c 00 00 00 00 00 00 00 3d d6 3d 50 0a f3 02 00 |........=.=P....|
00000510 04 00 00 00 00 00 00 00 00 00 00 00 ff 7f 00 00 |................|
00000520 00 80 78 00 ff 7f 00 00 01 00 00 00 ff ff 78 00 |..x...........x.|
00000530 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000540 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08 |................|
00000550 00 00 00 00 00 00 00 00 00 00 00 00 1c 00 1c 00 |................|
00000560 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000570 00 00 00 00 04 00 00 00 8a 21 c1 04 00 00 00 00 |.........!......|
00000580 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
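If you look at this dump, you can see that the first 1024 (0x400) bytes are zero, and that there is a superblock carrying the ext filesystem signature at offset 0x38 of the superblock (0x438 in the full output). The magic value is defined as 0xEF53, but it appears as the bytes 0x53 0xEF, which is the 16-bit value 0xEF53 in little-endian format. In my superblock you can also see the filesystem label "portage" and that it was last mounted on "/usr/portage/distfiles", and so on. You should look for something similar in your superblock copies.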
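The same check can be scripted instead of read off a hex dump. This is only a rough sketch: the function names are invented for illustration, and the label and last-mounted offsets (0x78 and 0x88 within the superblock) are taken from where those strings appear in my dump.

```python
import struct

SUPERBLOCK_OFFSET = 1024     # the first 1024 bytes are padding
MAGIC_OFFSET = 0x38          # 16-bit magic, relative to the superblock
LABEL_OFFSET = 0x78          # volume label, 16 bytes
LAST_MOUNTED_OFFSET = 0x88   # last mount point, 64 bytes
EXT_MAGIC = 0xEF53


def _cstring(raw):
    """Decode a NUL-padded byte field."""
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")


def inspect_superblock(buf, base=SUPERBLOCK_OFFSET):
    """Return label and last mount point if an ext superblock sits at base."""
    if len(buf) < base + LAST_MOUNTED_OFFSET + 64:
        return None
    # "<H" reads the two bytes 0x53 0xEF as the little-endian value 0xEF53.
    (magic,) = struct.unpack_from("<H", buf, base + MAGIC_OFFSET)
    if magic != EXT_MAGIC:
        return None
    label = buf[base + LABEL_OFFSET:base + LABEL_OFFSET + 16]
    last = buf[base + LAST_MOUNTED_OFFSET:base + LAST_MOUNTED_OFFSET + 64]
    return {"label": _cstring(label), "last_mounted": _cstring(last)}


def read_head(device_path, length=2048):
    """Read the first bytes of a device or image file (read-only)."""
    with open(device_path, "rb") as dev:
        return dev.read(length)
```

Calling inspect_superblock(read_head("/dev/md127")) as root would return None for the dump posted in the question, because the bytes at 0x438 are not 53 ef. To test a backup copy, read 1024 bytes from the copy's byte offset on the device and pass base=0.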
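It looks like you don't have any valid copy of the superblock, so the filesystem is probably destroyed, but why? Why did both disks fail? Is there anything interesting in dmesg? In smartctl -a /dev/sd_? In the kernel logs from when the first two disks failed? Was the machine working fine before? Did you shut it down properly, or did it die on its own?
RAID6 should survive the hardware failure of two disks, but depending on the cause of the failure you may still have lost data. As a last resort you can try testdisk, photorec, sleuthkit and the like to recover files, but restoring from a backup would be the better option.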
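To put numbers on that, here is a small sketch using the figures from the mdadm --detail output in the question. It assumes the usual RAID6 arithmetic (two devices' worth of parity per stripe); the function name is made up for illustration.

```python
def raid6_summary(raid_devices, active_devices, used_dev_size_kib):
    """Summarise capacity and redundancy of a RAID6 array.

    RAID6 spends two devices' worth of space on parity, so it keeps
    working with up to two members missing.
    """
    missing = raid_devices - active_devices
    return {
        "usable_kib": (raid_devices - 2) * used_dev_size_kib,
        "missing": missing,
        "within_tolerance": missing <= 2,
    }


if __name__ == "__main__":
    # Raid Devices : 6, Active Devices : 4, Used Dev Size : 1953382912
    print(raid6_summary(6, 4, 1953382912))
```

The usable size comes out as 7813531648 KiB, which matches the reported Array Size, and two missing members is exactly at the limit of what RAID6 tolerates. That the array assembles is therefore expected; it says nothing about whether the content is intact.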
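You should really investigate why your RAID6 array did not survive and learn from it, so you avoid ending up in the same situation (whatever it was).
If you can, please post it here so we can all learn from it.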
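Answer 2
After looking at the posted hex dump, my thought is that the RAID6 drive order somehow got scrambled. I would examine the first blocks of the other three drives and try to find the ext4 superblock on one of them. Before giving up, you might even guess at every permutation of the drives and get the data back by luck.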
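A rough sketch of what that search space looks like, assuming the values from the mdadm --examine output in the question (Data Offset of 262144 sectors, 6 slots, 4 surviving drives). It only computes offsets and lists orderings; it does not touch any disk, and the function names are invented for illustration.

```python
from itertools import permutations

SECTOR = 512  # bytes per sector, as reported by fdisk


def member_magic_offset(data_offset_sectors):
    """Byte offset on a raw member disk where the ext magic would sit,
    if that member holds the very first data chunk of the array."""
    # Array data starts at the data offset; the superblock is 1024 bytes
    # in, and the magic is 0x38 bytes into the superblock.
    return data_offset_sectors * SECTOR + 1024 + 0x38


def candidate_orders(drives, slots, placeholder="missing"):
    """Yield every way to place the drives into the array slots,
    with the remaining slots marked as missing."""
    for positions in permutations(range(slots), len(drives)):
        order = [placeholder] * slots
        for drive, pos in zip(drives, positions):
            order[pos] = drive
        yield tuple(order)


if __name__ == "__main__":
    print(member_magic_offset(262144))
    orders = list(candidate_orders(["sdb", "sdc", "sdd", "sde"], 6))
    print(len(orders), orders[0])
```

With four drives and six slots there are 360 orderings to consider. Checking each member for the bytes 53 ef at the computed offset first would show which drive holds the start of the filesystem and cut that number down.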
Disclaimer: I personally stopped using RAID5 and RAID6 long ago in favor of RAID10, so my knowledge of RAID levels 5 and 6 is a bit rusty. Why? Read this.