My new server (Ubuntu 22) from Hetzner has two SSDs and an additional large HDD that is used only for backups. It came preinstalled with three RAID arrays, but most of the HDD was not accessible. I don't know why, but one of the arrays included all three disks. I removed the HDD from that array and just created a single /dev/sda1 covering 100% of the disk, and then started receiving mdadm errors such as:
A DegradedArray event had been detected on md device /dev/md/1.
Faithfully yours, etc.
P.S. The /proc/mdstat file currently contains the following:
Personalities : [raid1] [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid10]
md1 : active raid1 nvme0n1p2[0] nvme1n1p2[1]
1046528 blocks super 1.2 [3/2] [UU_]
md0 : inactive nvme0n1p1[0](S) nvme1n1p1[1](S)
67041280 blocks super 1.2
md2 : active raid5 nvme0n1p3[0] nvme1n1p3[1]
930740224 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]
bitmap: 4/4 pages [16KB], 65536KB chunk
unused devices: <none>
I guess md0 was meant for the rescue file system, but I'm not sure. I deleted it with mdadm --remove /dev/md0, but the errors persist. The message is now:
A DegradedArray event had been detected on md device /dev/md/1.
Faithfully yours, etc.
P.S. The /proc/mdstat file currently contains the following:
Personalities : [raid1] [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid10]
md1 : active raid1 nvme0n1p2[0] nvme1n1p2[1]
1046528 blocks super 1.2 [3/2] [UU_]
md2 : active raid5 nvme0n1p3[0] nvme1n1p3[1]
930740224 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]
bitmap: 4/4 pages [16KB], 65536KB chunk
unused devices: <none>
More output:
> lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
loop0 7:0 0 44.5M 1 loop /snap/certbot/2344
loop1 7:1 0 114M 1 loop /snap/core/13425
loop2 7:2 0 62M 1 loop /snap/core20/1611
loop3 7:3 0 63.2M 1 loop /snap/core20/1623
sda 8:0 0 5.5T 0 disk
└─sda1 8:1 0 5.5T 0 part /home/backup
nvme0n1 259:0 0 476.9G 0 disk
├─nvme0n1p1 259:2 0 32G 0 part
├─nvme0n1p2 259:3 0 1G 0 part
│ └─md1 9:1 0 1022M 0 raid1 /boot
└─nvme0n1p3 259:4 0 443.9G 0 part
└─md2 9:2 0 887.6G 0 raid5 /
nvme1n1 259:1 0 476.9G 0 disk
├─nvme1n1p1 259:5 0 32G 0 part
├─nvme1n1p2 259:6 0 1G 0 part
│ └─md1 9:1 0 1022M 0 raid1 /boot
└─nvme1n1p3 259:7 0 443.9G 0 part
└─md2 9:2 0 887.6G 0 raid5 /
> blkid
/dev/nvme0n1p3: UUID="826df9bd-accd-335f-14a1-2069a029de70" UUID_SUB="96648f8a-eaeb-28fb-d481-4106d12b8637" LABEL="rescue:2" TYPE="linux_raid_member" PARTUUID="5b5edee1-03"
/dev/nvme0n1p1: UUID="39a665ea-06f0-8360-a3d8-831610b52ca2" UUID_SUB="6bcef918-3006-6d2b-aeb8-0fa8973b86e1" LABEL="rescue:0" TYPE="linux_raid_member" PARTUUID="5b5edee1-01"
/dev/nvme0n1p2: UUID="b21423c4-a32f-b69b-5e42-6a413783d500" UUID_SUB="c77a1e86-e842-2d92-e8af-10ae88dc4c15" LABEL="rescue:1" TYPE="linux_raid_member" PARTUUID="5b5edee1-02"
/dev/md2: UUID="bd7e9969-8af6-49ae-b9a6-3ff7269bb962" BLOCK_SIZE="4096" TYPE="ext4"
/dev/nvme1n1p2: UUID="b21423c4-a32f-b69b-5e42-6a413783d500" UUID_SUB="52caf216-b553-cbfc-e7f8-50986a235537" LABEL="rescue:1" TYPE="linux_raid_member" PARTUUID="a69e312f-02"
/dev/nvme1n1p3: UUID="826df9bd-accd-335f-14a1-2069a029de70" UUID_SUB="72a04ab2-d87a-1c45-fbfb-556c3b93e758" LABEL="rescue:2" TYPE="linux_raid_member" PARTUUID="a69e312f-03"
/dev/nvme1n1p1: UUID="39a665ea-06f0-8360-a3d8-831610b52ca2" UUID_SUB="628713c9-8f69-e186-bbb8-ad352005c449" LABEL="rescue:0" TYPE="linux_raid_member" PARTUUID="a69e312f-01"
/dev/sda1: LABEL="datapartition" UUID="9b1b12b1-fcff-43b0-a2d2-d5e147f634c0" BLOCK_SIZE="4096" TYPE="ext4" PARTLABEL="primary" PARTUUID="a0de21cb-c74d-4aed-a6ed-2216c6a0ec5b"
/dev/md1: UUID="2f598097-fad2-4ee5-8e6f-e86a293730bb" BLOCK_SIZE="4096" TYPE="ext3"
/dev/loop1: TYPE="squashfs"
/dev/loop2: TYPE="squashfs"
/dev/loop0: TYPE="squashfs"
/dev/loop3: TYPE="squashfs"
> fdisk -l
Disk /dev/nvme0n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors
Disk model: SAMSUNG MZVLB512HBJQ-00000
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x5b5edee1
Device Boot Start End Sectors Size Id Type
/dev/nvme0n1p1 2048 67110911 67108864 32G fd Linux raid autodetect
/dev/nvme0n1p2 67110912 69208063 2097152 1G fd Linux raid autodetect
/dev/nvme0n1p3 69208064 1000213167 931005104 443.9G fd Linux raid autodetect
Disk /dev/nvme1n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors
Disk model: SAMSUNG MZVLB512HBJQ-00000
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xa69e312f
Device Boot Start End Sectors Size Id Type
/dev/nvme1n1p1 2048 67110911 67108864 32G fd Linux raid autodetect
/dev/nvme1n1p2 67110912 69208063 2097152 1G fd Linux raid autodetect
/dev/nvme1n1p3 69208064 1000213167 931005104 443.9G fd Linux raid autodetect
Disk /dev/sda: 5.46 TiB, 6001175126016 bytes, 11721045168 sectors
Disk model: HGST HUS726060AL
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 297D9FC7-CD48-4610-802B-ED8D6DF3DC2A
Device Start End Sectors Size Type
/dev/sda1 2048 11721043967 11721041920 5.5T Linux filesystem
Disk /dev/md2: 887.62 GiB, 953077989376 bytes, 1861480448 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 524288 bytes / 1048576 bytes
Disk /dev/md1: 1022 MiB, 1071644672 bytes, 2093056 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
> cat /etc/mdadm/mdadm.conf
HOMEHOST <system>
# instruct the monitoring daemon where to send mail alerts
MAILADDR root
# definitions of existing MD arrays
ARRAY /dev/md/0 metadata=1.2 UUID=39a665ea:06f08360:a3d88316:10b52ca2 name=rescue:0
ARRAY /dev/md/1 metadata=1.2 UUID=b21423c4:a32fb69b:5e426a41:3783d500 name=rescue:1
ARRAY /dev/md/2 metadata=1.2 UUID=826df9bd:accd335f:14a12069:a029de70 name=rescue:2
# This configuration was auto-generated on Wed, 07 Sep 2022 21:20:22 +0200 by mkconf
root@mail:/home/logs# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid10]
md1 : active raid1 nvme0n1p2[0] nvme1n1p2[1]
1046528 blocks super 1.2 [3/2] [UU_]
md2 : active raid5 nvme0n1p3[0] nvme1n1p3[1]
930740224 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]
bitmap: 4/4 pages [16KB], 65536KB chunk
unused devices: <none>
> mdadm --detail /dev/md1
/dev/md1:
Version : 1.2
Creation Time : Wed Sep 7 22:19:42 2022
Raid Level : raid1
Array Size : 1046528 (1022.00 MiB 1071.64 MB)
Used Dev Size : 1046528 (1022.00 MiB 1071.64 MB)
Raid Devices : 3
Total Devices : 2
Persistence : Superblock is persistent
Update Time : Sun Sep 11 14:37:34 2022
State : clean, degraded
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Consistency Policy : resync
Name : rescue:1
UUID : b21423c4:a32fb69b:5e426a41:3783d500
Events : 182
Number Major Minor RaidDevice State
0 259 3 0 active sync /dev/nvme0n1p2
1 259 6 1 active sync /dev/nvme1n1p2
- 0 0 2 removed
> mdadm --detail /dev/md2
/dev/md2:
Version : 1.2
Creation Time : Wed Sep 7 22:19:42 2022
Raid Level : raid5
Array Size : 930740224 (887.62 GiB 953.08 GB)
Used Dev Size : 465370112 (443.81 GiB 476.54 GB)
Raid Devices : 3
Total Devices : 2
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Mon Sep 12 00:31:56 2022
State : clean, degraded
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Consistency Policy : bitmap
Name : rescue:2
UUID : 826df9bd:accd335f:14a12069:a029de70
Events : 373931
Number Major Minor RaidDevice State
0 259 4 0 active sync /dev/nvme0n1p3
1 259 7 1 active sync /dev/nvme1n1p3
- 0 0 2 removed
I guess something has not been completely cleaned up, so I'm asking for help to understand what is going on.
Answer 1
md1
Essentially, your /dev/md1 is fine. It is a RAID1 (mirror) configured to keep three copies of the data, but you only have two. By definition that is degraded, yet the data is still protected against the failure of any single device. If you don't need to keep three copies, you can reconfigure the array so it knows that two copies are enough:
mdadm --grow /dev/md1 -n2
This clears the degraded state, and mdadm --detail /dev/md1 will then report "clean" instead of "clean, degraded".
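For reference, you can verify the result afterwards with something like the following (exact output will vary):
cat /proc/mdstat                          # md1 should now show [2/2] [UU]
mdadm --detail /dev/md1 | grep 'State :'  # should read "State : clean"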
md2
This one is in worse shape. /dev/md2 is a RAID5 of three devices, one of which is missing. It is degraded, and if one more device fails, your data is gone. You can also think of it as a RAID0, because that is essentially what it is right now. If you don't have a third device and aren't planning to add one, you can switch to RAID1 instead. There are some caveats, though.
The first is that the usable space will be half of what it is now. If the file system is more than half full, the data simply will not fit in a RAID1 built from these devices. So before you start, make sure your data fits on a single device (which is what the usable space of the RAID1 will be).
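A rough way to check is to compare the current usage of the root file system against the size of a single member partition (about 443.9 GiB here):
df -h /    # "Used" must stay comfortably below the size of one NVMe partition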
The second is that, unfortunately, I'm afraid there is no reasonably safe way to convert it to RAID1 in place. Some conversions between RAID levels are possible, but I would rather not play tricks with a degraded RAID5. There will be some downtime, and you will need somewhere to store the data temporarily.
So the plan is: copy the contents somewhere onto your spare storage, remove md2 entirely, create a new RAID1 from its former component devices, and copy the contents back onto that array. This being Linux, there are no particular requirements for the copy method, other than that it must fully preserve ownership, permissions, and extended attributes. Since this is the root file system, you will have to do it while booted from some external bootable medium; an Ubuntu live image (or GParted Live), for example, will do.
Also, once everything is in place, you will probably need to adapt the initramfs to the new array while still booted from the removable medium (Debian requires this, and since Ubuntu descends from Debian, I presume the requirement still applies). You need to update the /etc/mdadm/mdadm.conf file in the root file system and then regenerate the initramfs so that it includes the updated file.
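As a rough sketch only, not a tested procedure: booted from a live system, with the device names as above and assuming the backup HDD (/dev/sda1) is mounted at /home/backup with enough free space, the steps could look roughly like this. The mount points and the rootfs-copy directory are placeholders; review every command, especially the mdadm.conf handling, before running anything.

# 1. Back up the old root fs, preserving ACLs, xattrs and hard links
mkdir -p /mnt/oldroot /mnt/newroot
mount /dev/md2 /mnt/oldroot
rsync -aAXH /mnt/oldroot/ /home/backup/rootfs-copy/
umount /mnt/oldroot

# 2. Tear down the degraded RAID5 and wipe the member superblocks
mdadm --stop /dev/md2
mdadm --zero-superblock /dev/nvme0n1p3 /dev/nvme1n1p3

# 3. Create the new RAID1 from the same partitions and restore the data
mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/nvme0n1p3 /dev/nvme1n1p3
mkfs.ext4 /dev/md2
mount /dev/md2 /mnt/newroot
rsync -aAXH /home/backup/rootfs-copy/ /mnt/newroot/

# 4. Fix up mdadm.conf and the initramfs from inside the new root
for d in dev proc sys; do mount --bind /$d /mnt/newroot/$d; done
chroot /mnt/newroot /bin/bash
#   edit the ARRAY lines in /etc/mdadm/mdadm.conf to match "mdadm --detail --scan"
#   (mkfs gave md2 a new file-system UUID, so check /etc/fstab too), then:
update-initramfs -u
update-grub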