DegradedArray event after removing a RAID with mdadm

My new server (Ubuntu 22) from Hetzner has 2 SSDs plus an additional large HDD that is only used for backups. It came preinstalled with 3 RAID arrays, but most of the HDD was not accessible. I don't know why, but one of the RAIDs included all 3 disks. I removed the HDD from it and created a single /dev/sda1 partition spanning 100% of the disk, and then started getting mdadm errors such as:

A DegradedArray event had been detected on md device /dev/md/1.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid1] [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid10]
md1 : active raid1 nvme0n1p2[0] nvme1n1p2[1]
      1046528 blocks super 1.2 [3/2] [UU_]

md0 : inactive nvme0n1p1[0](S) nvme1n1p1[1](S)
      67041280 blocks super 1.2

md2 : active raid5 nvme0n1p3[0] nvme1n1p3[1]
      930740224 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]
      bitmap: 4/4 pages [16KB], 65536KB chunk

unused devices: <none>

I guess md0 was meant for a rescue filesystem, but I'm not sure. I deleted it with mdadm --remove /dev/md0, but the errors persist. The message now is:

A DegradedArray event had been detected on md device /dev/md/1.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid1] [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid10]
md1 : active raid1 nvme0n1p2[0] nvme1n1p2[1]
      1046528 blocks super 1.2 [3/2] [UU_]

md2 : active raid5 nvme0n1p3[0] nvme1n1p3[1]
      930740224 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]
      bitmap: 4/4 pages [16KB], 65536KB chunk

unused devices: <none>

More output:

> lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
loop0         7:0    0  44.5M  1 loop  /snap/certbot/2344
loop1         7:1    0   114M  1 loop  /snap/core/13425
loop2         7:2    0    62M  1 loop  /snap/core20/1611
loop3         7:3    0  63.2M  1 loop  /snap/core20/1623
sda           8:0    0   5.5T  0 disk  
└─sda1        8:1    0   5.5T  0 part  /home/backup
nvme0n1     259:0    0 476.9G  0 disk  
├─nvme0n1p1 259:2    0    32G  0 part  
├─nvme0n1p2 259:3    0     1G  0 part  
│ └─md1       9:1    0  1022M  0 raid1 /boot
└─nvme0n1p3 259:4    0 443.9G  0 part  
  └─md2       9:2    0 887.6G  0 raid5 /
nvme1n1     259:1    0 476.9G  0 disk  
├─nvme1n1p1 259:5    0    32G  0 part  
├─nvme1n1p2 259:6    0     1G  0 part  
│ └─md1       9:1    0  1022M  0 raid1 /boot
└─nvme1n1p3 259:7    0 443.9G  0 part  
  └─md2       9:2    0 887.6G  0 raid5 /




> blkid
/dev/nvme0n1p3: UUID="826df9bd-accd-335f-14a1-2069a029de70" UUID_SUB="96648f8a-eaeb-28fb-d481-4106d12b8637" LABEL="rescue:2" TYPE="linux_raid_member" PARTUUID="5b5edee1-03"
/dev/nvme0n1p1: UUID="39a665ea-06f0-8360-a3d8-831610b52ca2" UUID_SUB="6bcef918-3006-6d2b-aeb8-0fa8973b86e1" LABEL="rescue:0" TYPE="linux_raid_member" PARTUUID="5b5edee1-01"
/dev/nvme0n1p2: UUID="b21423c4-a32f-b69b-5e42-6a413783d500" UUID_SUB="c77a1e86-e842-2d92-e8af-10ae88dc4c15" LABEL="rescue:1" TYPE="linux_raid_member" PARTUUID="5b5edee1-02"
/dev/md2: UUID="bd7e9969-8af6-49ae-b9a6-3ff7269bb962" BLOCK_SIZE="4096" TYPE="ext4"
/dev/nvme1n1p2: UUID="b21423c4-a32f-b69b-5e42-6a413783d500" UUID_SUB="52caf216-b553-cbfc-e7f8-50986a235537" LABEL="rescue:1" TYPE="linux_raid_member" PARTUUID="a69e312f-02"
/dev/nvme1n1p3: UUID="826df9bd-accd-335f-14a1-2069a029de70" UUID_SUB="72a04ab2-d87a-1c45-fbfb-556c3b93e758" LABEL="rescue:2" TYPE="linux_raid_member" PARTUUID="a69e312f-03"
/dev/nvme1n1p1: UUID="39a665ea-06f0-8360-a3d8-831610b52ca2" UUID_SUB="628713c9-8f69-e186-bbb8-ad352005c449" LABEL="rescue:0" TYPE="linux_raid_member" PARTUUID="a69e312f-01"
/dev/sda1: LABEL="datapartition" UUID="9b1b12b1-fcff-43b0-a2d2-d5e147f634c0" BLOCK_SIZE="4096" TYPE="ext4" PARTLABEL="primary" PARTUUID="a0de21cb-c74d-4aed-a6ed-2216c6a0ec5b"
/dev/md1: UUID="2f598097-fad2-4ee5-8e6f-e86a293730bb" BLOCK_SIZE="4096" TYPE="ext3"
/dev/loop1: TYPE="squashfs"
/dev/loop2: TYPE="squashfs"
/dev/loop0: TYPE="squashfs"
/dev/loop3: TYPE="squashfs"




> fdisk -l
Disk /dev/nvme0n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors
Disk model: SAMSUNG MZVLB512HBJQ-00000              
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x5b5edee1

Device         Boot    Start        End   Sectors   Size Id Type
/dev/nvme0n1p1          2048   67110911  67108864    32G fd Linux raid autodetect
/dev/nvme0n1p2      67110912   69208063   2097152     1G fd Linux raid autodetect
/dev/nvme0n1p3      69208064 1000213167 931005104 443.9G fd Linux raid autodetect


Disk /dev/nvme1n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors
Disk model: SAMSUNG MZVLB512HBJQ-00000              
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xa69e312f

Device         Boot    Start        End   Sectors   Size Id Type
/dev/nvme1n1p1          2048   67110911  67108864    32G fd Linux raid autodetect
/dev/nvme1n1p2      67110912   69208063   2097152     1G fd Linux raid autodetect
/dev/nvme1n1p3      69208064 1000213167 931005104 443.9G fd Linux raid autodetect


Disk /dev/sda: 5.46 TiB, 6001175126016 bytes, 11721045168 sectors
Disk model: HGST HUS726060AL
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 297D9FC7-CD48-4610-802B-ED8D6DF3DC2A

Device     Start         End     Sectors  Size Type
/dev/sda1   2048 11721043967 11721041920  5.5T Linux filesystem


Disk /dev/md2: 887.62 GiB, 953077989376 bytes, 1861480448 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 524288 bytes / 1048576 bytes


Disk /dev/md1: 1022 MiB, 1071644672 bytes, 2093056 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes




> cat /etc/mdadm/mdadm.conf
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md/0  metadata=1.2 UUID=39a665ea:06f08360:a3d88316:10b52ca2 name=rescue:0
ARRAY /dev/md/1  metadata=1.2 UUID=b21423c4:a32fb69b:5e426a41:3783d500 name=rescue:1
ARRAY /dev/md/2  metadata=1.2 UUID=826df9bd:accd335f:14a12069:a029de70 name=rescue:2

# This configuration was auto-generated on Wed, 07 Sep 2022 21:20:22 +0200 by mkconf
root@mail:/home/logs# cat /proc/mdstat 
Personalities : [raid1] [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid10] 
md1 : active raid1 nvme0n1p2[0] nvme1n1p2[1]
      1046528 blocks super 1.2 [3/2] [UU_]
      
md2 : active raid5 nvme0n1p3[0] nvme1n1p3[1]
      930740224 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]
      bitmap: 4/4 pages [16KB], 65536KB chunk

unused devices: <none>




> mdadm --detail /dev/md1
/dev/md1:
           Version : 1.2
     Creation Time : Wed Sep  7 22:19:42 2022
        Raid Level : raid1
        Array Size : 1046528 (1022.00 MiB 1071.64 MB)
     Used Dev Size : 1046528 (1022.00 MiB 1071.64 MB)
      Raid Devices : 3
     Total Devices : 2
       Persistence : Superblock is persistent

       Update Time : Sun Sep 11 14:37:34 2022
             State : clean, degraded 
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0

Consistency Policy : resync

              Name : rescue:1
              UUID : b21423c4:a32fb69b:5e426a41:3783d500
            Events : 182

    Number   Major   Minor   RaidDevice State
       0     259        3        0      active sync   /dev/nvme0n1p2
       1     259        6        1      active sync   /dev/nvme1n1p2
       -       0        0        2      removed





> mdadm --detail /dev/md2
/dev/md2:
           Version : 1.2
     Creation Time : Wed Sep  7 22:19:42 2022
        Raid Level : raid5
        Array Size : 930740224 (887.62 GiB 953.08 GB)
     Used Dev Size : 465370112 (443.81 GiB 476.54 GB)
      Raid Devices : 3
     Total Devices : 2
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Mon Sep 12 00:31:56 2022
             State : clean, degraded 
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : bitmap

              Name : rescue:2
              UUID : 826df9bd:accd335f:14a12069:a029de70
            Events : 373931

    Number   Major   Minor   RaidDevice State
       0     259        4        0      active sync   /dev/nvme0n1p3
       1     259        7        1      active sync   /dev/nvme1n1p3
       -       0        0        2      removed

I guess something has not been fully cleaned up, so I'm asking for help in understanding what is going on.

Answer 1

md1

Essentially, your /dev/md1 is fine. It is a RAID1 (mirror) configured to hold three copies of the data, while you only have two. By definition that is degraded, but the data is still protected against the failure of any single device. If you don't need three copies, you can reconfigure the array so it knows that two copies are enough:

mdadm --grow /dev/md1 -n2

This will clear the degraded state, and mdadm --detail /dev/md1 will show the array as clean.
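For example, a quick check afterwards (just a sketch of how to verify, not a required step):

mdadm --detail /dev/md1 | grep -E 'State|Raid Devices'   # should report "clean" and "Raid Devices : 2"
cat /proc/mdstat                                         # md1 should now show [2/2] [UU]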

md2

The situation here is worse. /dev/md2 is a RAID5 made of three devices, one of which is missing. It is degraded, and if one more device fails, your data is gone. You can think of it as a RAID0, because essentially that is what it is right now. If you don't have a third device and don't plan to install one, you can switch to RAID1. There are some caveats, though.

The first is that the usable space will be half of what it is now. If the filesystem is more than half full, the data will not fit into a RAID1 built from these devices. So before you start, make sure your data fits on a single device (which is what the usable space will be in the RAID1 case).
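A quick way to check this is to compare the used space on the root filesystem with the capacity of a single member partition (roughly 443.8 GiB here):

df -h /                                          # "Used" must fit within one member
mdadm --detail /dev/md2 | grep 'Used Dev Size'   # size of a single member partition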

The second is that, unfortunately, I'm afraid there is no reasonably safe way to convert it to RAID1 in place. While some conversions between RAID levels are possible, I would rather not play with a degraded RAID5. There will be some downtime, and you will need somewhere to store the data temporarily.

So the plan is to copy the contents somewhere onto spare storage, remove md2 completely, create a new RAID1 from its former component devices, and copy the contents back onto that array. This being Linux, there are no special requirements for the copy method, except that it must fully preserve ownership, permissions and extended attributes. Since this is the root filesystem, you will need to do it booted from some external bootable media; an Ubuntu live image (or GParted Live) will do.
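A rough sketch of that sequence, run from the live environment (device names are taken from your output; the backup target on the HDD is only an example, so adjust the paths to whatever spare storage you use):

# mount the old root filesystem and the backup disk
mkdir -p /mnt/old /mnt/backup
mount /dev/md2 /mnt/old
mount /dev/sda1 /mnt/backup

# copy everything off, preserving ownership, ACLs and extended attributes
rsync -aHAX --numeric-ids /mnt/old/ /mnt/backup/rootfs-copy/
umount /mnt/old

# tear down the degraded RAID5 and wipe the member superblocks
mdadm --stop /dev/md2
mdadm --zero-superblock /dev/nvme0n1p3 /dev/nvme1n1p3

# build a two-device RAID1 from the former members, then restore the data
mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/nvme0n1p3 /dev/nvme1n1p3
mkfs.ext4 /dev/md2
mount /dev/md2 /mnt/old
rsync -aHAX --numeric-ids /mnt/backup/rootfs-copy/ /mnt/old/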

Also, once everything is in place, you will probably need to adjust the initramfs for the new array while still booted from the removable media (Debian requires this, and since Ubuntu is derived from Debian, I assume the requirement still applies). You need to update the /etc/mdadm/mdadm.conf file in the root filesystem and then regenerate the initramfs so it includes the updated file.
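Something along these lines, still from the live media and assuming the new array is mounted at /mnt/old as in the sketch above:

# chroot into the restored root filesystem
mount /dev/md1 /mnt/old/boot
for d in dev proc sys; do mount --bind /$d /mnt/old/$d; done
chroot /mnt/old

# inside the chroot: record the current arrays (then delete the stale ARRAY
# lines for md0 and the old md2 UUID by hand) and rebuild the initramfs
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
update-initramfs -u -k all
exit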
