Single drive added to mdadm RAID disappears after reboot

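I have searched and searched for others with the same problem, but all of the questions here seem to be about an entire RAID disappearing after a reboot, whereas I have a problem with just one member drive.

This is a video production machine, and last week (after upgrading from CentOS 7 to Rocky 8) we noticed that video playback was producing visual artifacts. All of the video is stored on an attached RAID.

It is a RAID 60: two RAID6 arrays of 12 x 1.2TB drives each, combined into a RAID0. It was set up by an outside company long before I started working here, but it has been rock solid in my experience.

While investigating the artifacts, I found that, according to mdadm, one drive in one of the RAID6 arrays was marked as "removed". The RAID still works without that drive, as you would expect of a RAID6, but I suspected it was related to the artifacts we were seeing. smartctl showed the drive in question as failing, so we ordered a new one.

It arrived this morning, and I followed these instructions on redhat.com. It took almost three hours to rebuild the RAID with the new drive, but it seemed to work: the RAID came back and I saw no artifacts.

However, I then rebooted the machine and we are back to square one. It is exactly as it was when we started, with one of the RAID6 arrays showing a removed drive. Also, when I look at the disks, I can see that the drive in question (/dev/sdc) has lost its partition, or at least it shows "1.2TB Free Space" rather than "1.2TB Linux RAID Member". I thought (hoped) this might be a fluke, so tonight I went through the whole process again, and exactly the same thing happened. The only thing I did differently the second time was to create an /etc/mdadm.conf file by running mdadm --examine --scan >> /etc/mdadm/mdadm.conf as su, but it did not seem to make any difference. I have now cleared the file to start fresh.

I cannot for the life of me figure out what is going on. I am fairly competent with Linux, but before this week I did not even know mdadm existed, so I have been trying to learn on the fly. This production machine needs to be back up and running on Tuesday, so I am up against it! I will rebuild the RAID again overnight and start afresh tomorrow. Below is all the output I think you might need, but let me know if I can provide anything else.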

Output of cat /proc/mdstat:

Personalities : [raid6] [raid5] [raid4] [raid0] 
md103 : active raid0 md101[0] md102[1]
      23439351808 blocks super 1.2 512k chunks
      
md102 : active raid6 sdu[6] sdz[11] sdx[9] sdw[8] sdy[10] sdq[2] sdt[5] sdr[3] sdv[7] sds[4] sdo[0] sdp[1]
      11719808000 blocks super 1.2 level 6, 512k chunk, algorithm 2 [12/12] [UUUUUUUUUUUU]
      bitmap: 0/9 pages [0KB], 65536KB chunk

md101 : active raid6 sdk[8] sdh[5] sdg[4] sdl[9] sdf[3] sdi[6] sdj[7] sde[2] sdd[1] sdm[10] sdn[11]
      11719808000 blocks super 1.2 level 6, 512k chunk, algorithm 2 [12/11] [_UUUUUUUUUUU]
      bitmap: 1/9 pages [4KB], 65536KB chunk
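
As an aside, the degraded array can be picked out of this output mechanically: the last bracket on each status line holds one character per member slot, U for a working member and _ for a missing one. The following is only a minimal sketch that assumes the format shown above; the function name degraded_arrays is made up for illustration.

```shell
#!/bin/sh
# List md arrays whose member-status map (e.g. [_UUU]) has a missing slot.
# Reads /proc/mdstat-formatted text on standard input.
degraded_arrays() {
    awk '
        /^md[^ ]* :/ { name = $1 }     # remember the array this block describes
        /\[[U_]+\]$/ {                 # status line ending in e.g. [_UUUU]
            if ($NF ~ /_/) print name  # an underscore marks a missing member
        }
    '
}

# Usage on a live system:
#   degraded_arrays < /proc/mdstat
```

Fed the output above, this would print md101 and skip md102, whose map is all U.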

Output of mdadm --detail for the problematic RAID6:

/dev/md101:
           Version : 1.2
     Creation Time : Tue Jun  8 17:37:23 2021
        Raid Level : raid6
        Array Size : 11719808000 (10.91 TiB 12.00 TB)
     Used Dev Size : 1171980800 (1117.69 GiB 1200.11 GB)
      Raid Devices : 12
     Total Devices : 11
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Fri Jan 12 19:20:16 2024
             State : clean, degraded 
    Active Devices : 11
   Working Devices : 11
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : bitmap

              Name : grade1:101
              UUID : 56d9ee6d:3a9ef416:91d3b7ec:0da562b0
            Events : 1036527

    Number   Major   Minor   RaidDevice State
       -       0        0        0      removed
       1       8       48        1      active sync   /dev/sdd
       2       8       64        2      active sync   /dev/sde
       3       8       80        3      active sync   /dev/sdf
       4       8       96        4      active sync   /dev/sdg
       5       8      112        5      active sync   /dev/sdh
       6       8      128        6      active sync   /dev/sdi
       7       8      144        7      active sync   /dev/sdj
       8       8      160        8      active sync   /dev/sdk
       9       8      176        9      active sync   /dev/sdl
      10       8      192       10      active sync   /dev/sdm
      11       8      208       11      active sync   /dev/sdn

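This may be overkill, but here is the output of fdisk -l; it is long because there are so many drives. sda and sdb are the OS drives, and the problem drive /dev/sdc looks different because I have already run sgdisk on it, per the redhat.com instructions, in preparation for re-adding it: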

Disk /dev/sda: 894.3 GiB, 960197124096 bytes, 1875385008 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 18255C5C-FE0C-4ADB-9D13-52560809D652

Device       Start        End    Sectors   Size Type
/dev/sda1     2048    1230847    1228800   600M EFI System
/dev/sda2  1230848    3327999    2097152     1G Linux filesystem
/dev/sda3  3328000 1875384319 1872056320 892.7G Linux LVM


Disk /dev/sdb: 894.3 GiB, 960197124096 bytes, 1875385008 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: D2D8699C-C29B-4C34-B126-3667FA7B794A

Device     Start        End    Sectors   Size Type
/dev/sdb1   2048 1875384319 1875382272 894.3G Linux LVM


Disk /dev/mapper/rl-root: 70 GiB, 75161927680 bytes, 146800640 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/mapper/rl-swap: 4 GiB, 4294967296 bytes, 8388608 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sdd: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sdc: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: B9AB730B-09DD-44FF-BD9E-79502FB2CF5E


Disk /dev/sdh: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sdg: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sde: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sdl: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sdi: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sdm: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sdo: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sdp: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sdk: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sdq: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sdf: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sdn: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sdr: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sdj: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sds: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sdt: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sdu: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sdv: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sdw: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sdx: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sdy: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sdz: 1.1 TiB, 1200243695616 bytes, 2344225968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/mapper/rl-home: 1.7 TiB, 1839227469824 bytes, 3592241152 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/md101: 10.9 TiB, 12001083392000 bytes, 23439616000 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 524288 bytes / 5242880 bytes


Disk /dev/md102: 10.9 TiB, 12001083392000 bytes, 23439616000 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 524288 bytes / 5242880 bytes


Disk /dev/md103: 21.8 TiB, 24001896251392 bytes, 46878703616 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 524288 bytes / 5242880 bytes

Answer 1

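Something is erasing your mdadm metadata. It may be seeing the GPT backup header at the end of the disk and helpfully "repairing" the partition table by rewriting it at the start of the disk, wiping the mdadm metadata in the process.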

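This is a typical problem when you use whole drives instead of partitions for RAID, LUKS, filesystems, and so on. It works fine until it doesn't, because many programs try to help you partition your drives. It can kick not just one drive out of the array, but all of them...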

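You could try clearing the GPT headers (at both the start and the end of the disk) with wipefs, and then hope that nothing tries to write a new partition table again.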
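
As a rough sketch of what that could look like, the helper below only prints the commands rather than running them, so they can be reviewed first. It assumes a util-linux wipefs that reports the GPT and protective MBR signatures under the type names gpt and PMBR; check the listing from the first command before erasing anything. The function name gpt_wipe_cmds and the device /dev/sdX are placeholders.

```shell
#!/bin/sh
# Print (do not run) the commands for clearing GPT signatures from a
# whole-disk md member. Review the output, then run it as root yourself.
gpt_wipe_cmds() {
    dev=$1
    # With no options, wipefs only lists the signatures it finds.
    echo "wipefs $dev"
    # Erase only the GPT and protective-MBR signatures, and keep a backup
    # of the erased bytes in the home directory of the invoking user.
    echo "wipefs --all --types gpt,PMBR --backup $dev"
    # Confirm whether an md superblock is present on the device.
    echo "mdadm --examine $dev"
}

gpt_wipe_cmds /dev/sdX   # placeholder device name
```

Restricting wipefs with --types is the point of the sketch: a bare wipefs --all would also erase the linux_raid_member signature, which is exactly what you are trying to keep.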

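I prefer to have a partition table and to use partitions rather than whole disks (like the setup described in the tutorial you linked). It is more standard and less prone to accidents of this kind, since most software knows to leave partitions alone, which is not the case for "unpartitioned" drives.

But in your case, that would involve migrating the entire setup, which is probably not what you want either.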

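Make backups, in particular metadata/header backups that include the drive serial numbers, so that you can tell later how the drives were assigned. That may help you recover if you ever have a similar partition table accident in the future.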
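
One possible way to collect such a backup is sketched below. As a precaution it only prints the commands; pipe the output to sh as root to actually run them. The function name backup_cmds, the directory /root/md-backup, and the file names are illustrative assumptions, not anything mdadm requires.

```shell
#!/bin/sh
# Print the commands that would snapshot md metadata and drive serial
# numbers into a backup directory.
backup_cmds() {
    outdir=$1
    shift
    echo "mkdir -p $outdir"
    # Map kernel names (sdc, sdd, ...) to serial numbers and WWNs.
    echo "lsblk -o NAME,SERIAL,WWN,SIZE,MODEL > $outdir/lsblk.txt"
    # One ARRAY line per assembled array, including its UUID.
    echo "mdadm --detail --scan > $outdir/mdadm-scan.txt"
    # Per-member superblock dump: array UUID, device role, data offset.
    for dev in "$@"; do
        name=${dev##*/}    # /dev/sdd -> sdd
        echo "mdadm --examine $dev > $outdir/examine-$name.txt"
    done
}

backup_cmds /root/md-backup /dev/sdd /dev/sde
```

Keep the resulting files somewhere other than the array itself, since they are only useful when the array will not assemble.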
