I've been using the same array for almost 10 years, replacing its disks with larger ones as prices have come down.
After my most recent upgrade to 4x 6TB disks in RAID 5, I ran into a strange problem. On every reboot the array is detected as a 4.4TB array, which is its original size. Once the system is up, I have to run mdadm --grow /dev/md0 --size=max and wait for a full resync before things are back to normal.
I'm not sure what I'm doing wrong.
These are the instructions I follow every time I upgrade the disks:
mdadm --manage /dev/md0 --fail /dev/old_disk
mdadm --manage /dev/md0 --remove /dev/old_disk
mdadm --manage /dev/md0 --add /dev/new_disk
I did this for each disk in the array, waiting for the array to be healthy again before moving on to the next disk. Once they were all replaced with larger disks, I ran mdadm --grow /dev/md0 --size=max and then resized the filesystem. On this last upgrade I had to enable 64-bit in ext4 to get past 16TB; that is the only difference.
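For reference, the grow-plus-resize step looked roughly like the sketch below (assuming the ext4 filesystem sits directly on /dev/md0 and is mounted at /mnt/array; the 64-bit conversion via resize2fs -b requires e2fsprogs 1.43+ and must be done on an unmounted, fsck-clean filesystem):
$ sudo mdadm --grow /dev/md0 --size=max   # grow the md device to the new component size
$ sudo umount /mnt/array                  # the 64-bit conversion must be done offline
$ sudo e2fsck -f /dev/md0                 # resize2fs requires a clean filesystem check first
$ sudo resize2fs -b /dev/md0              # enable the ext4 64bit feature (needed beyond 16TiB)
$ sudo resize2fs /dev/md0                 # grow the filesystem to fill the md device
$ sudo mount /dev/md0 /mnt/array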
Here is the fdisk -l and /proc/mdstat output on first boot:
$ fdisk -l
Disk /dev/md0: 4.4 TiB, 4809380659200 bytes, 9393321600 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 65536 bytes / 196608 bytes
$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 sdc1[3] sdb1[1] sdd1[0] sda1[2]
4696660800 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
unused devices: <none>
After running mdadm --grow /dev/md0 --size=max:
$ fdisk -l
Disk /dev/md0: 16.4 TiB, 18003520192512 bytes, 35163125376 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 65536 bytes / 196608 bytes
$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 sdc1[3] sdb1[1] sdd1[0] sda1[2]
17581562688 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
[=====>...............] resync = 29.9% (1757854304/5860520896) finish=333.2min speed=205205K/sec
unused devices: <none>
$ sudo mdadm --detail /dev/md0
/dev/md0:
Version : 0.90
Creation Time : Fri Dec 24 19:32:21 2010
Raid Level : raid5
Array Size : 17581562688 (16767.08 GiB 18003.52 GB)
Used Dev Size : 18446744073709551615
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Thu Apr 16 23:26:41 2020
State : clean, resyncing
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
Consistency Policy : resync
Resync Status : 33% complete
UUID : 5cae35da:cd710f9e:e368bf24:bd0fce41 (local to host ubuntu)
Events : 0.255992
Number Major Minor RaidDevice State
0 8 49 0 active sync /dev/sdd1
1 8 17 1 active sync /dev/sdb1
2 8 1 2 active sync /dev/sda1
3 8 33 3 active sync /dev/sdc1
/etc/mdadm/mdadm.conf:
# mdadm.conf
#
# !NB! Run update-initramfs -u after updating this file.
# !NB! This will ensure that initramfs has an uptodate copy.
#
# Please refer to mdadm.conf(5) for information about this file.
#
# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
#DEVICE partitions containers
# automatically tag new arrays as belonging to the local system
HOMEHOST <system>
# instruct the monitoring daemon where to send mail alerts
MAILADDR jrebeiro
# definitions of existing MD arrays
ARRAY /dev/md0 metadata=0.90 UUID=5cae35da:cd710f9e:e368bf24:bd0fce41
# This configuration was auto-generated on Mon, 20 May 2019 04:28:45 +0000 by mkconf
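For what it's worth, the file's own note about keeping the initramfs in sync corresponds to something like the following (assuming the stock Debian/Ubuntu tooling):
$ sudo mdadm --detail --scan   # prints the current ARRAY line(s) to merge into mdadm.conf
$ sudo update-initramfs -u     # refresh the copy of mdadm.conf embedded in the initramfs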
I'm running mdadm - v4.1-rc1 - 2018-03-22 on Ubuntu 18.04.4 64-bit. I'm completely lost on this one.
Update 1:
I've updated the metadata to 1.0 as suggested, and now the array is back at roughly 5TB and won't grow past that point.
$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 sdc1[3] sdb1[1] sdd1[0] sda1[2]
17581562688 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
unused devices: <none>
$ sudo umount /mnt/array
$ sudo mdadm --stop /dev/md0
mdadm: stopped /dev/md0
$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
unused devices: <none>
$ sudo mdadm --assemble /dev/md0 --update=metadata /dev/sd[abcd]1
mdadm: /dev/md0 has been started with 4 drives.
$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 sdd1[0] sdc1[3] sda1[2] sdb1[1]
4696660800 blocks super 1.0 level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
unused devices: <none>
$ sudo mdadm --detail /dev/md0
/dev/md0:
Version : 1.0
Creation Time : Fri Dec 24 19:32:22 2010
Raid Level : raid5
Array Size : 4696660800 (4479.08 GiB 4809.38 GB)
Used Dev Size : 1565553600 (1493.03 GiB 1603.13 GB)
Raid Devices : 4
Total Devices : 4
Persistence : Superblock is persistent
Update Time : Fri Apr 17 12:41:29 2020
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
Consistency Policy : resync
Name : 0
UUID : 5cae35da:cd710f9e:e368bf24:bd0fce41
Events : 0
Number Major Minor RaidDevice State
0 8 49 0 active sync /dev/sdd1
1 8 17 1 active sync /dev/sdb1
2 8 1 2 active sync /dev/sda1
3 8 33 3 active sync /dev/sdc1
$ sudo mdadm --grow /dev/md0 --size=max
mdadm: component size of /dev/md0 unchanged at 1565553600K
Answer 1
You are using the obsolete version 0.90 metadata.
From the mdadm man page:
0, 0.90
Use the original 0.90 format superblock. This format
limits arrays to 28 component devices and limits
component devices of levels 1 and greater to 2 terabytes.
It is also possible for there to be confusion about
whether the superblock applies to a whole device or just
the last partition, if that partition starts on a 64K
boundary.
You really don't want the 2TB limit on each drive.
Consider using --update=metadata on assembly.
The metadata option only works on v0.90 metadata arrays and will
convert them to v1.0 metadata. The array must not be dirty
(i.e. it must not need a sync) and it must not have a write-
intent bitmap.
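As a sketch, assuming the array can be taken offline and its members are /dev/sd[abcd]1 as in the question:
$ sudo umount /mnt/array
$ sudo mdadm --stop /dev/md0
$ sudo mdadm --assemble /dev/md0 --update=metadata /dev/sd[abcd]1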
It would be better to migrate to 1.2 metadata (stored 4K from the start of the device rather than ~64K from the end), but that is more involved, since all of the data has to be relocated.
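Since there is no in-place conversion to 1.2, the rough shape of that migration is back up, re-create, restore. Purely as an illustrative outline (device names, chunk size, and mount point are taken from the question, and the --create step destroys the existing contents, so a verified backup is mandatory first):
$ sudo mdadm --stop /dev/md0
$ sudo mdadm --create /dev/md0 --metadata=1.2 --level=5 --chunk=64 --raid-devices=4 /dev/sd[abcd]1
$ sudo mkfs.ext4 -O 64bit /dev/md0   # then restore the data from the backup
$ sudo update-initramfs -u           # after regenerating the ARRAY line in /etc/mdadm/mdadm.conf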