md: kicking non-fresh sdg from array! md/raid:md0: not enough operational devices (3/7 failed)

Today I ran into a disaster...

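I have a RAID 6 with 7 hard drives, and yesterday one disk failed. After replacing the disk and rebuilding overnight, I saw that a second HDD had dropped out of the RAID...
So today I started backing up my files to an external drive, but then the copy stopped. When I looked for the cause, I saw in Webmin's RAID page that sdg was "down".
I shut down the server and checked the hardware, and found that the backplane the HDDs are connected to had come loose...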

After fixing that, all drives are back, but my RAID 6 no longer starts :-/

dmesg shows me:
md: kicking non-fresh sdg from array!
md: kicking non-fresh sdf from array!
md: kicking non-fresh sde from array!
md/raid:md0: not enough operational devices (3/7 failed)
...
and, after many lines of
md0: ADD_NEW_DISK not supported
I can see this:
EXT4-fs (md0): unable to read superblock

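I checked sdg, sdf and sde with sudo mdadm --examine: sde and sdf show "State : clean", while sdg, which was "down" before the repair, shows "active". So 6 of the 7 devices show "clean" (all except sdg).
Here is the output for all devices: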

Disk sdb
/dev/sdb:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : e866cf54:90d5c74e:fe00b6e7:d25c82f4
           Name : N5550:0  (local to host N5550)
  Creation Time : Fri Oct 29 14:43:58 2021
     Raid Level : raid6
   Raid Devices : 7

 Avail Dev Size : 3906770096 (1862.89 GiB 2000.27 GB)
     Array Size : 9766906880 (9314.45 GiB 10001.31 GB)
  Used Dev Size : 3906762752 (1862.89 GiB 2000.26 GB)
    Data Offset : 259072 sectors
   Super Offset : 8 sectors
   Unused Space : before=258992 sectors, after=7344 sectors
          State : clean
    Device UUID : 9180f101:1dacdd9e:4adae9c4:fbeb2552

Internal Bitmap : 8 sectors from superblock
    Update Time : Sat Mar 26 18:13:45 2022
  Bad Block Log : 512 entries available at offset 16 sectors
       Checksum : 38019182 - correct
         Events : 256508

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAA.A.. ('A' == active, '.' == missing, 'R' == replacing)
Disk sdc
/dev/sdc:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : e866cf54:90d5c74e:fe00b6e7:d25c82f4
           Name : N5550:0  (local to host N5550)
  Creation Time : Fri Oct 29 14:43:58 2021
     Raid Level : raid6
   Raid Devices : 7

 Avail Dev Size : 3906770096 (1862.89 GiB 2000.27 GB)
     Array Size : 9766906880 (9314.45 GiB 10001.31 GB)
  Used Dev Size : 3906762752 (1862.89 GiB 2000.26 GB)
    Data Offset : 259072 sectors
   Super Offset : 8 sectors
   Unused Space : before=258992 sectors, after=7344 sectors
          State : clean
    Device UUID : 889c6877:5ee5c647:eebd209c:d9c6abcb

Internal Bitmap : 8 sectors from superblock
    Update Time : Sat Mar 26 18:13:45 2022
  Bad Block Log : 512 entries available at offset 16 sectors
       Checksum : a71ea53d - correct
         Events : 256508

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAA.A.. ('A' == active, '.' == missing, 'R' == replacing)
Disk sdd
/dev/sdd:
   MBR Magic : aa55
Partition[0] :   3907026944 sectors at         2048 (type fd)
Disk sde
/dev/sde:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : e866cf54:90d5c74e:fe00b6e7:d25c82f4
           Name : N5550:0  (local to host N5550)
  Creation Time : Fri Oct 29 14:43:58 2021
     Raid Level : raid6
   Raid Devices : 7

 Avail Dev Size : 3906770096 (1862.89 GiB 2000.27 GB)
     Array Size : 9766906880 (9314.45 GiB 10001.31 GB)
  Used Dev Size : 3906762752 (1862.89 GiB 2000.26 GB)
    Data Offset : 259072 sectors
   Super Offset : 8 sectors
   Unused Space : before=258992 sectors, after=7344 sectors
          State : clean
    Device UUID : 34198042:3d4c802b:36727b02:fdf65808

Internal Bitmap : 8 sectors from superblock
    Update Time : Sat Mar 26 18:05:00 2022
  Bad Block Log : 512 entries available at offset 16 sectors
       Checksum : f8fb6b18 - correct
         Events : 256494

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 3
   Array State : AAAAA.. ('A' == active, '.' == missing, 'R' == replacing)
Disk sdf
/dev/sdf:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : e866cf54:90d5c74e:fe00b6e7:d25c82f4
           Name : N5550:0  (local to host N5550)
  Creation Time : Fri Oct 29 14:43:58 2021
     Raid Level : raid6
   Raid Devices : 7

 Avail Dev Size : 3906770096 (1862.89 GiB 2000.27 GB)
     Array Size : 9766906880 (9314.45 GiB 10001.31 GB)
  Used Dev Size : 3906762752 (1862.89 GiB 2000.26 GB)
    Data Offset : 259072 sectors
   Super Offset : 8 sectors
   Unused Space : before=258992 sectors, after=7344 sectors
          State : clean
    Device UUID : b2e8d640:1f21336f:88d823fe:66ef7be7

Internal Bitmap : 8 sectors from superblock
    Update Time : Wed Mar 23 14:46:56 2022
  Bad Block Log : 512 entries available at offset 16 sectors
       Checksum : 15cd05bb - correct
         Events : 238681

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 4
   Array State : AAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
Disk sdg
/dev/sdg:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : e866cf54:90d5c74e:fe00b6e7:d25c82f4
           Name : N5550:0  (local to host N5550)
  Creation Time : Fri Oct 29 14:43:58 2021
     Raid Level : raid6
   Raid Devices : 7

 Avail Dev Size : 3906770096 (1862.89 GiB 2000.27 GB)
     Array Size : 9766906880 (9314.45 GiB 10001.31 GB)
  Used Dev Size : 3906762752 (1862.89 GiB 2000.26 GB)
    Data Offset : 259072 sectors
   Super Offset : 8 sectors
   Unused Space : before=258992 sectors, after=7344 sectors
          State : active
    Device UUID : 2bc06e22:49aa73e2:3cf7eb79:55df1180

Internal Bitmap : 8 sectors from superblock
    Update Time : Sat Mar 26 17:57:06 2022
  Bad Block Log : 512 entries available at offset 16 sectors
       Checksum : 7f0ddb2a - correct
         Events : 256372

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 5
   Array State : AAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
Disk sdh
/dev/sdh:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : e866cf54:90d5c74e:fe00b6e7:d25c82f4
           Name : N5550:0  (local to host N5550)
  Creation Time : Fri Oct 29 14:43:58 2021
     Raid Level : raid6
   Raid Devices : 7

 Avail Dev Size : 3906770096 (1862.89 GiB 2000.27 GB)
     Array Size : 9766906880 (9314.45 GiB 10001.31 GB)
  Used Dev Size : 3906762752 (1862.89 GiB 2000.26 GB)
    Data Offset : 259072 sectors
   Super Offset : 8 sectors
   Unused Space : before=258992 sectors, after=7344 sectors
          State : clean
    Device UUID : 7af89a18:52ef08ae:dec5ad7b:75626355

Internal Bitmap : 8 sectors from superblock
    Update Time : Sat Mar 26 18:13:45 2022
  Bad Block Log : 512 entries available at offset 16 sectors
       Checksum : 17d7b107 - correct
         Events : 256508

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 4
   Array State : AAA.A.. ('A' == active, '.' == missing, 'R' == replacing)

I tried to start the RAID with

mdadm --run /dev/md0

and got:

mdadm: failed to start array /dev/md0: Input/output error

But after I tried to start it, Webmin shows me:

/dev/md0    active, FAILED, Not Started     RAID6 (Dual Distributed Parity)     7.27 TiB

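7.27 TiB for a 9 TB array.

Any ideas how I can get my RAID working again without losing data?

I have read that I could add the devices back to the RAID again, but I'm not sure and wanted to ask first.

Any help would be greatly appreciated!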

UPDATE: I forgot that one of the devices is /dev/sdd1, not /dev/sdd!
Here is its examine output:

/dev/sdd1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : e866cf54:90d5c74e:fe00b6e7:d25c82f4
           Name : N5550:0  (local to host N5550)
  Creation Time : Fri Oct 29 14:43:58 2021
     Raid Level : raid6
   Raid Devices : 7

 Avail Dev Size : 3906767872 (1862.89 GiB 2000.27 GB)
     Array Size : 9766906880 (9314.45 GiB 10001.31 GB)
  Used Dev Size : 3906762752 (1862.89 GiB 2000.26 GB)
    Data Offset : 259072 sectors
   Super Offset : 8 sectors
   Unused Space : before=258992 sectors, after=5120 sectors
          State : clean
    Device UUID : d8df004e:44ee4060:ba4d2c22:e7e6bdcb

Internal Bitmap : 8 sectors from superblock
    Update Time : Sat Mar 26 18:13:45 2022
  Bad Block Log : 512 entries available at offset 16 sectors
       Checksum : 1c4e98a4 - correct
         Events : 256508

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAA.A.. ('A' == active, '.' == missing, 'R' == replacing)

And here is mdadm -D /dev/md0:

/dev/md0:
           Version : 1.2
        Raid Level : raid0
     Total Devices : 7
       Persistence : Superblock is persistent

             State : inactive
   Working Devices : 7

              Name : N5550:0  (local to host N5550)
              UUID : e866cf54:90d5c74e:fe00b6e7:d25c82f4
            Events : 256494

    Number   Major   Minor   RaidDevice

       -       8       64        -        /dev/sde
       -       8       32        -        /dev/sdc
       -       8      112        -        /dev/sdh
       -       8       80        -        /dev/sdf
       -       8       16        -        /dev/sdb
       -       8       49        -        /dev/sdd1
       -       8       96        -        /dev/sdg

Answer 1

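It looks like a lot of drives were kicked from your array? If any of those drives has read errors (check smartctl -a for reallocated / pending / uncorrectable sectors, etc.), you should ddrescue them onto new drives before attempting any other kind of data recovery.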
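As a rough sketch of the SMART check suggested above, the filter below keeps only the attributes that usually point to bad sectors and prints those with a non-zero raw value. The function name smart_suspects is made up for illustration, and the attribute names are the ones commonly reported for ATA disks; they vary by vendor, so treat the list as an assumption.

```shell
# Reads the attribute table printed by "smartctl -A" on stdin.
# Column 2 is the attribute name, column 10 the raw value.
# Assumption: the three names below are how this disk labels its
# reallocated / pending / uncorrectable sector counters.
smart_suspects() {
  awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ && $10 + 0 > 0 { print $2, $10 }'
}
```

It would be fed like this: sudo smartctl -A /dev/sde | smart_suspects. Any drive that prints something is a candidate for cloning first, for example with GNU ddrescue (ddrescue -f /dev/sdX /dev/sdY rescue.map, where sdX, sdY and the map file name are placeholders).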

This is a 7-drive RAID-6, so you need at least 5 drives to run the array. Right now you only have 3 (sdb, sdc, sdh), so it won't work...
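The arithmetic behind that can be checked against the --examine output in the question. This is only a sketch; it assumes Used Dev Size is counted in 512-byte sectors and Array Size in KiB, which is what the GiB figures in parentheses suggest.

```shell
raid_devices=7                # "Raid Devices" from --examine
parity=2                      # RAID-6 keeps two parity blocks per stripe
min_needed=$(( raid_devices - parity ))

used_dev_sectors=3906762752   # "Used Dev Size", assumed 512-byte sectors
# Usable capacity is the data share of the members; two sectors make one KiB.
array_kib=$(( min_needed * used_dev_sectors / 2 ))

echo "minimum members to start: $min_needed"
echo "expected array size: $array_kib KiB"
```

The second figure matches the Array Size of 9766906880 reported by every member, so the superblocks agree on the geometry even though the event counts differ.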

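If you examine each drive, you will find that 1 drive is missing entirely (not included in the output at all), 1 is very outdated (sdf), 2 are slightly outdated (sdg, sde), and only 3 drives have an up-to-date Update Time and Events count.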

/dev/sdf: Update Time : Wed Mar 23 14:46:56 2022 Events: 238681
/dev/sdg: Update Time : Sat Mar 26 17:57:06 2022 Events: 256372
/dev/sde: Update Time : Sat Mar 26 18:05:00 2022 Events: 256494
/dev/sdb: Update Time : Sat Mar 26 18:13:45 2022 Events: 256508
/dev/sdc: Update Time : Sat Mar 26 18:13:45 2022 Events: 256508
/dev/sdh: Update Time : Sat Mar 26 18:13:45 2022 Events: 256508
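A listing like the one above does not have to be put together by hand. The helper below is a small sketch that pulls the device name, Update Time and Events out of concatenated mdadm --examine output and sorts by event count; the function name summarize_examine is hypothetical.

```shell
# Reads one or more "mdadm --examine" reports on stdin and prints
# "<events> <device> <update time>", lowest event count first.
summarize_examine() {
  awk '
    /^\/dev\//      { dev = $1; sub(/:$/, "", dev) }                  # "/dev/sdb:" starts a report
    /Update Time :/ { sub(/^.*Update Time : /, ""); upd[dev] = $0 }   # keep only the timestamp
    /Events :/      { print $NF, dev, upd[dev] }                      # Events follows Update Time
  ' | sort -n
}
```

A possible invocation, using the device names from the question: for d in /dev/sd[bcefgh] /dev/sdd1; do sudo mdadm --examine "$d"; done | summarize_examine. The members at the bottom of the list are the freshest ones.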

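In this situation, all you can do is try your luck with mdadm --assemble --force (which ignores the "non-fresh" event counter), using only the 5 best drives and leaving out the very outdated and the missing one. Otherwise, mdadm --create with two drives specified as missing would also be an option.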

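So sdf should be left out of this assembly attempt: the more outdated a drive is, the more filesystem inconsistencies and data corruption you are likely to run into. It would only be a last resort, if some sectors on the other drives have no data at all (read errors)...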
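A forced assembly along those lines might look like the sketch below. The member list is an assumption, not something the answer spells out: going by the --examine output in the question (including the /dev/sdd1 update), sdb, sdc, sdd1 and sdh carry the highest event count and sde is only slightly behind. The script prints the commands instead of running them unless DRY_RUN=0 is set; the run helper is made up for that purpose.

```shell
# Dry run by default: commands are printed with a leading "+", not executed.
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Assumption: these are the five freshest members; sdg and sdf are left out.
members="/dev/sdb /dev/sdc /dev/sdd1 /dev/sdh /dev/sde"

force_assemble() {
  run mdadm --stop /dev/md0                       # release the inactive array first
  # $members is left unquoted on purpose so it splits into separate arguments.
  run mdadm --assemble --force /dev/md0 $members
  run cat /proc/mdstat                            # check whether md0 came up degraded
}

force_assemble
```

If the array starts, mounting the filesystem read-only at first (mount -o ro /dev/md0 /mnt, with /mnt as a placeholder) keeps the copy of the data from adding further writes.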

If possible, run your mdadm experiments on copy-on-write overlays.
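One common way to build such an overlay is a device-mapper snapshot backed by a sparse file, so that every write lands in the file and the real disk stays untouched. The following is a minimal sketch under that assumption; the function names, the overlay file names and the 50G size are placeholders, and make_overlay has to run as root.

```shell
# Device-mapper table for a snapshot target:
# <start> <length in sectors> snapshot <origin> <COW device> <P = persistent> <chunk size>
overlay_table() {
  # $1 = origin device, $2 = its size in 512-byte sectors, $3 = loop device for the COW file
  echo "0 $2 snapshot $1 $3 P 8"
}

make_overlay() {
  dev=$1
  name=$(basename "$dev")
  size=$(blockdev --getsz "$dev")            # device size in 512-byte sectors
  truncate -s 50G "overlay-$name"            # sparse file, only grows as blocks change
  loop=$(losetup -f --show "overlay-$name")  # attach it to a free loop device
  overlay_table "$dev" "$size" "$loop" | dmsetup create "$name"
}
```

After calling make_overlay for each member, the overlays appear as /dev/mapper/sdb, /dev/mapper/sdc and so on, and the assemble attempt can be pointed at those instead of the raw disks. A failed experiment is undone with dmsetup remove and losetup -d, after which the overlay files can be deleted.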

Good luck.
