MDADM array with ext4 partition failing: "e2fsck: unable to set superblock flags on /dev/md0"

There was a power failure, and now my mdadm array is having problems.

sudo mdadm -D /dev/md0

[hodge@hodge-fs ~]$ sudo mdadm -D /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Sun Apr 25 01:39:25 2010
     Raid Level : raid5
     Array Size : 8790815232 (8383.57 GiB 9001.79 GB)
  Used Dev Size : 1465135872 (1397.26 GiB 1500.30 GB)
   Raid Devices : 7
  Total Devices : 7
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Sat Aug  7 19:10:28 2010
          State : clean, degraded, recovering
 Active Devices : 6
Working Devices : 7
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 128K

 Rebuild Status : 10% complete

           UUID : 44a8f730:b9bea6ea:3a28392c:12b22235 (local to host hodge-fs)
         Events : 0.1307608

    Number   Major   Minor   RaidDevice State
       0       8       81        0      active sync   /dev/sdf1
       1       8       97        1      active sync   /dev/sdg1
       2       8      113        2      active sync   /dev/sdh1
       3       8       65        3      active sync   /dev/sde1
       4       8       49        4      active sync   /dev/sdd1
       7       8       33        5      spare rebuilding   /dev/sdc1
       6       8       16        6      active sync   /dev/sdb

sudo mount -a

[hodge@hodge-fs ~]$ sudo mount -a
mount: wrong fs type, bad option, bad superblock on /dev/md0,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so

sudo fsck.ext4 /dev/md0

[hodge@hodge-fs ~]$ sudo fsck.ext4 /dev/md0
e2fsck 1.41.12 (17-May-2010)
fsck.ext4: Group descriptors look bad... trying backup blocks...
/dev/md0: recovering journal
fsck.ext4: unable to set superblock flags on /dev/md0

sudo dumpe2fs /dev/md0 | grep -i superblock

[hodge@hodge-fs ~]$ sudo dumpe2fs /dev/md0 | grep -i superblock
dumpe2fs 1.41.12 (17-May-2010)
  Primary superblock at 0, Group descriptors at 1-524
  Backup superblock at 32768, Group descriptors at 32769-33292
  Backup superblock at 98304, Group descriptors at 98305-98828
  Backup superblock at 163840, Group descriptors at 163841-164364
  Backup superblock at 229376, Group descriptors at 229377-229900
  Backup superblock at 294912, Group descriptors at 294913-295436
  Backup superblock at 819200, Group descriptors at 819201-819724
  Backup superblock at 884736, Group descriptors at 884737-885260
  Backup superblock at 1605632, Group descriptors at 1605633-1606156
  Backup superblock at 2654208, Group descriptors at 2654209-2654732
  Backup superblock at 4096000, Group descriptors at 4096001-4096524
  Backup superblock at 7962624, Group descriptors at 7962625-7963148
  Backup superblock at 11239424, Group descriptors at 11239425-11239948
  Backup superblock at 20480000, Group descriptors at 20480001-20480524
  Backup superblock at 23887872, Group descriptors at 23887873-23888396
  Backup superblock at 71663616, Group descriptors at 71663617-71664140
  Backup superblock at 78675968, Group descriptors at 78675969-78676492
  Backup superblock at 102400000, Group descriptors at 102400001-102400524
  Backup superblock at 214990848, Group descriptors at 214990849-214991372
  Backup superblock at 512000000, Group descriptors at 512000001-512000524
  Backup superblock at 550731776, Group descriptors at 550731777-550732300
  Backup superblock at 644972544, Group descriptors at 644972545-644973068
  Backup superblock at 1934917632, Group descriptors at 1934917633-1934918156

sudo e2fsck -b 32768 /dev/md0

[hodge@hodge-fs ~]$ sudo e2fsck -b 32768 /dev/md0
e2fsck 1.41.12 (17-May-2010)
/dev/md0: recovering journal
e2fsck: unable to set superblock flags on /dev/md0

sudo dmesg | tail

[hodge@hodge-fs ~]$ sudo dmesg | tail
EXT4-fs (md0): ext4_check_descriptors: Checksum for group 0 failed (59837!=29115)
EXT4-fs (md0): group descriptors corrupted!
EXT4-fs (md0): ext4_check_descriptors: Checksum for group 0 failed (59837!=29115)
EXT4-fs (md0): group descriptors corrupted!

Please help!!!

Answer 1

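From your description and the errors, it looks to me like you have some serious data corruption. Keep in mind that RAID protects against one very specific problem: a limited number of disk failures. It does not protect against power loss; that is why you use a UPS and keep backups in addition to using RAID.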

I find it odd that the RAID device list includes /dev/sdb rather than /dev/sdb1. Is that correct, or was the last character cut off?

I would try the remaining backup superblocks, just in case.
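A minimal sketch of how the remaining backups could be tried in turn. It assumes the block numbers come from the dumpe2fs output in the question and that the file system uses 4 KiB blocks (an assumption, inferred from the first backup sitting at block 32768). The function names are made up for illustration. The -n flag keeps e2fsck read-only, so nothing is written to the array.

```shell
#!/bin/sh
# Print the block number of every backup superblock found in
# `dumpe2fs` output read from stdin, one per line.
list_backup_superblocks() {
    sed -n 's/^ *Backup superblock at \([0-9][0-9]*\),.*/\1/p'
}

# Try each backup superblock read-only (-n) until e2fsck can open
# the file system with one of them.
# $1 = device, $2 = block size in bytes (assumed to be 4096).
try_backup_superblocks() {
    dev=$1
    bs=${2:-4096}
    for sb in $(dumpe2fs "$dev" 2>/dev/null | list_backup_superblocks); do
        echo "Trying backup superblock $sb"
        e2fsck -n -b "$sb" -B "$bs" "$dev"
        rc=$?
        # e2fsck exit codes of 8 and above signal an operational or
        # usage error; anything lower means the check itself ran.
        if [ "$rc" -lt 8 ]; then
            echo "e2fsck could work with superblock $sb (exit code $rc)"
            return 0
        fi
    done
    return 1
}
```

Run as root, the call would be `try_backup_superblocks /dev/md0`. Because of -n, a superblock that works still leaves any repair for a later, deliberate run.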

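Beyond that, you could look into disk recovery software. Ideally, back up the current state of the disks first; that lowers the chance that further changes damage the data beyond repair.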
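One way to capture the current state is a raw image of every member device. The sketch below only prints the dd commands so they can be reviewed before anything runs. The destination directory /mnt/backup and the function name are hypothetical, and the target needs room for all seven 1.5 TB members.

```shell
#!/bin/sh
# Print (do not run) one dd command per RAID member device.
# $1 = destination directory (hypothetical), remaining args = devices.
print_image_commands() {
    dest=$1
    shift
    for dev in "$@"; do
        name=$(basename "$dev")
        # conv=noerror,sync keeps going after read errors and pads
        # the unreadable blocks with zeros.
        echo "dd if=$dev of=$dest/$name.img bs=1M conv=noerror,sync"
    done
}
```

For the array in the question this might be called as `print_image_commands /mnt/backup /dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sde1 /dev/sdd1 /dev/sdc1 /dev/sdb`. The images are only consistent if the array is not being written to while they are taken.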

Answer 2

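Your RAID setup has several flaws:

  1. RAID-5 across more than 3 or 4 disks is rather fragile. Once one disk gets kicked out, your data is at risk.
  2. Not using a write-intent bitmap is dangerous, and it only makes item #1 worse.
  3. The spare disk would probably be better used as an active member of a RAID-6 or RAID-10 array...

(I could also list the small chunk size and the lack of LVM-2 as drawbacks, but those certainly don't have a strong effect on the overall situation.)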
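For item 2, a small sketch of how one might check whether an array already has a write-intent bitmap. It assumes that `mdadm -D` prints an "Intent Bitmap" line when a bitmap is present; the function name is made up for illustration.

```shell
#!/bin/sh
# Read `mdadm -D` output on stdin and succeed if it mentions a
# write-intent bitmap.
has_write_intent_bitmap() {
    grep -q 'Intent Bitmap'
}

# Usage (as root):
#   mdadm -D /dev/md0 | has_write_intent_bitmap || echo "no bitmap"
#
# An internal bitmap can be added to an existing array with:
#   mdadm --grow --bitmap=internal /dev/md0
```

Adding the bitmap is something to do on a healthy array, not in the middle of a rebuild.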

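Now: do not run anything (fsck etc.) against the array until it has been fully repaired. I strongly advise against trying to recover the data yourself. You would be better off finding an expert (if you value the data, of course).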
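A rough way to confirm that the array has finished rebuilding before touching the file system, assuming the State line has the same format as in the question's `mdadm -D` output. The function name is hypothetical.

```shell
#!/bin/sh
# Read `mdadm -D` output on stdin. Returns 0 only if the State line
# reports no degraded, failed, or in-progress condition; returns 2
# if no State line is found at all.
array_is_healthy() {
    state=$(sed -n 's/^ *State : *//p' | head -n 1)
    case "$state" in
        "") return 2 ;;
        *degraded*|*FAILED*|*recovering*|*resyncing*|*reshaping*) return 1 ;;
        *) return 0 ;;
    esac
}

# Usage (as root):
#   mdadm -D /dev/md0 | array_is_healthy && echo "array looks healthy"
```

With the output shown in the question ("clean, degraded, recovering") the check fails, as it should. `mdadm --wait /dev/md0` can be used to block until the recovery has finished.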
