mdadm array won't assemble on boot and must be assembled manually with a device list

I'm running 14.04 LTS with a five-drive software RAID array, 2TB per drive, using mdadm, LVM, and XFS. My main boot device is a 256GB SSD.

There was a power outage, and when the power came back the system would not boot. On every boot attempt, the following scrolled repeatedly on the screen, so I could never get through the boot process:

    Incrementally starting RAID arrays... 
    mdadm: CREATE user root not found 
    mdadm: CREATE group disk not found
    Incrementally started RAID arrays.

There is a bug filed on Launchpad (https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/1335642), but it doesn't seem to have a clear workaround, or at least not one with steps I can easily follow.

Booting into recovery mode shows the following:

[ 2.482387] md: bind<sdb1> 
[ 2.408390] md: bind<sda1> 
[ 2.438005] md: bind<sdc1> 
[ 2.986691] Switched to clocksource tsc 
Incrementally starting RAID arrays... 
mdadm: CREATE user root not found 
mdadm: CREATE group disk not found 
[ 31.755948] md/raid:md0: device sdc1 operational as raid disk 1 
[ 31.756886] md/raid:md0: device sda1 operational as raid disk 0 
[ 31.756861] md/raid:md0: device sdb1 operational as raid disk 2 
[ 31.756115] md/raid:md0: device sdd1 operational as raid disk 3 
[ 31.756531] md/raid:md0: allocated 0kB 
[ 31.756647] md/raid:md0: raid level 5 active with 4 out of 5 devices, algorithm 2 
[ 31.756735] md0: detected capacity change from 0 to 8001591181312
mdadm: started array /dev/md0
Incrementally started RAID arrays. 
[ 31.757933] random: nonblocking pool is initialized 
[ 31.758184]  md0: unknown partition table 
[ 31.781641] bio: create slab <bio-1> at 1
Incrementally starting RAID arrays... 
mdadm: CREATE user root not found 
mdadm: CREATE group disk not found 
Incrementally started RAID arrays. 

Booting from a Live CD, all of the drives look healthy according to their SMART data. If I try to run mdadm --assemble --scan, I get the following warning:

mdadm: WARNING /dev/sde1 and /dev/sde appear to have very similar superblocks.
      If they are really different, please --zero the superblock on one
      If they are the same or overlap, please remove one from the
      DEVICE list in mdadm.conf.

The array does not get assembled.

I've captured all of the RAID device information here:

/dev/sda1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : d5f6a94e:185828ec:b1902148:b8793263
  Creation Time : Tue Feb 15 18:47:10 2011
     Raid Level : raid5
  Used Dev Size : 1953513472 (1863.02 GiB 2000.40 GB)
     Array Size : 7814053888 (7452.06 GiB 8001.59 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0

    Update Time : Tue Aug  2 11:43:38 2016
          State : clean
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 1af33e59 - correct
         Events : 105212

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       33        0      active sync   /dev/sdc1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       65        1      active sync   /dev/sde1
   2     2       8       49        2      active sync   /dev/sdd1
   3     3       8       81        3      active sync   /dev/sdf1
   4     4       8        1        4      active sync   /dev/sda1
/dev/sdb1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : d5f6a94e:185828ec:b1902148:b8793263
  Creation Time : Tue Feb 15 18:47:10 2011
     Raid Level : raid5
  Used Dev Size : 1953513472 (1863.02 GiB 2000.40 GB)
     Array Size : 7814053888 (7452.06 GiB 8001.59 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0

    Update Time : Tue Aug  2 11:43:38 2016
          State : clean
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 1af33e6d - correct
         Events : 105212

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     2       8       49        2      active sync   /dev/sdd1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       65        1      active sync   /dev/sde1
   2     2       8       49        2      active sync   /dev/sdd1
   3     3       8       81        3      active sync   /dev/sdf1
   4     4       8        1        4      active sync   /dev/sda1
/dev/sdc1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : d5f6a94e:185828ec:b1902148:b8793263
  Creation Time : Tue Feb 15 18:47:10 2011
     Raid Level : raid5
  Used Dev Size : 1953513472 (1863.02 GiB 2000.40 GB)
     Array Size : 7814053888 (7452.06 GiB 8001.59 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0

    Update Time : Tue Aug  2 11:43:38 2016
          State : clean
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 1af33e7b - correct
         Events : 105212

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8       65        1      active sync   /dev/sde1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       65        1      active sync   /dev/sde1
   2     2       8       49        2      active sync   /dev/sdd1
   3     3       8       81        3      active sync   /dev/sdf1
   4     4       8        1        4      active sync   /dev/sda1
/dev/sdd1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : d5f6a94e:185828ec:b1902148:b8793263
  Creation Time : Tue Feb 15 18:47:10 2011
     Raid Level : raid5
  Used Dev Size : 1953513472 (1863.02 GiB 2000.40 GB)
     Array Size : 7814053888 (7452.06 GiB 8001.59 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0

    Update Time : Tue Aug  2 11:43:38 2016
          State : clean
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 1af33e8f - correct
         Events : 105212

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3       8       81        3      active sync   /dev/sdf1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       65        1      active sync   /dev/sde1
   2     2       8       49        2      active sync   /dev/sdd1
   3     3       8       81        3      active sync   /dev/sdf1
   4     4       8        1        4      active sync   /dev/sda1
/dev/sde1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : d5f6a94e:185828ec:b1902148:b8793263
  Creation Time : Tue Feb 15 18:47:10 2011
     Raid Level : raid5
  Used Dev Size : 1953513472 (1863.02 GiB 2000.40 GB)
     Array Size : 7814053888 (7452.06 GiB 8001.59 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0

    Update Time : Tue Aug  2 11:43:38 2016
          State : clean
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 1af33e41 - correct
         Events : 105212

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     4       8        1        4      active sync   /dev/sda1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       65        1      active sync   /dev/sde1
   2     2       8       49        2      active sync   /dev/sdd1
   3     3       8       81        3      active sync   /dev/sdf1
   4     4       8        1        4      active sync   /dev/sda1

The original /etc/mdadm/mdadm.conf (nothing special):

# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
#DEVICE partitions containers

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md0 metadata=0.90 UUID=d5f6a94e:185828ec:b1902148:b8793263

# This file was auto-generated on Wed, 09 May 2012 23:34:51 -0400
# by mkconf $Id$

So if I run sudo mdadm --assemble /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 (adjusted for whichever device letters the RAID drives happen to get), the array assembles correctly and I can access my files.
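
For reference, getting at the data from the Live CD looks roughly like this; the volume group and logical volume names below are placeholders, not my actual LVM layout:

    # assemble the array from the explicit member list
    sudo mdadm --assemble /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
    # activate the LVM volume group that sits on top of md0
    sudo vgscan
    sudo vgchange -ay
    # mount the XFS filesystem (VG/LV names are placeholders)
    sudo mkdir -p /mnt/raid
    sudo mount -t xfs /dev/myvg/mylv /mnt/raid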

I tried cutting power to all of the RAID drives, but the system still would not boot (the same infinite loop).

I have tried chroot-ing in, defining every device in the array in /etc/mdadm/mdadm.conf, and then updating the initramfs. That's where things stand now, but the system still won't boot.
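
The chroot procedure I used was roughly the following (a sketch; /dev/sdX1 stands in for the SSD's root partition, whatever it enumerates as under the Live CD):

    # mount the installed system's root filesystem; /dev/sdX1 is a placeholder
    sudo mount /dev/sdX1 /mnt
    sudo mount --bind /dev  /mnt/dev
    sudo mount --bind /proc /mnt/proc
    sudo mount --bind /sys  /mnt/sys
    sudo chroot /mnt
    # inside the chroot: edit the config, then rebuild all initramfs images
    nano /etc/mdadm/mdadm.conf
    update-initramfs -u -k all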

Here is the new /etc/mdadm/mdadm.conf:

# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
#DEVICE partitions containers
DEVICE /dev/sd[abcde]1

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
#ARRAY /dev/md0 metadata=0.90 UUID=d5f6a94e:185828ec:b1902148:b8793263
ARRAY /dev/md0 devices=/dev/sda1,/dev/sdb1,/dev/sdc1,/dev/sdd1,/dev/sde1

# This file was auto-generated on Wed, 09 May 2012 23:34:51 -0400
# by mkconf $Id$

I don't understand what prevents the system from assembling the array at boot when I can assemble it manually by specifying the devices.
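
One diagnostic I can think of (a sketch, assuming the stock initramfs-tools layout) is to confirm that the rebuilt initramfs actually picked up the new config:

    # list the initramfs contents and look for the mdadm pieces,
    # including the embedded etc/mdadm/mdadm.conf
    lsinitramfs /boot/initrd.img-$(uname -r) | grep mdadm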

The one remaining thing that seems odd is that I recorded the boot process with a slow-motion camera, and I never see a message for /dev/sde or /dev/sde1 coming up. I'll look into it, but I really don't know what to look for.

Update - Saturday, August 13

I've been doing more investigation. Running sudo fdisk -l shows the following for the drives in the RAID 5:

ubuntu@ubuntu:~$ sudo fdisk -l

Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
81 heads, 63 sectors/track, 765633 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0xca36f687

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1            2048  3907029167  1953513560   fd  Linux raid autodetect

WARNING: GPT (GUID Partition Table) detected on '/dev/sdc'! The util fdisk doesn't support GPT. Use GNU Parted.


Disk /dev/sdc: 2000.4 GB, 2000398933504 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029167 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 33553920 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1  3907029166  1953514583   ee  GPT
Partition 1 does not start on physical sector boundary.

Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes
81 heads, 63 sectors/track, 765633 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xf66042a2

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1            2048  3907029167  1953513560   fd  Linux raid autodetect

Disk /dev/sde: 2000.4 GB, 2000398934016 bytes
81 heads, 63 sectors/track, 765633 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x2006adb2

   Device Boot      Start         End      Blocks   Id  System
/dev/sde1            2048  3907029167  1953513560   fd  Linux raid autodetect

Disk /dev/sdf: 2000.4 GB, 2000398934016 bytes
81 heads, 63 sectors/track, 765633 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x0008b3d6

   Device Boot      Start         End      Blocks   Id  System
/dev/sdf1            2048  3907029167  1953513560   fd  Linux raid autodetect

Disk /dev/sdg: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xd46f102b

   Device Boot      Start         End      Blocks   Id  System
/dev/sdg1              64  3907029167  1953514552   fd  Linux raid autodetect

So apparently /dev/sdg1 has a different starting sector than the other RAID partitions. The next step was therefore to examine the /dev/sdg drive. As you can see from the following four commands, mdadm does not detect a RAID superblock when examining the bare /dev/sdg device, the way it does for the other drives (/dev/sda is used as the example below). Is this a hint at what is actually wrong?

ubuntu@ubuntu:~$ sudo mdadm --examine /dev/sdg
/dev/sdg:
   MBR Magic : aa55
Partition[0] :   3907029104 sectors at           64 (type fd)
ubuntu@ubuntu:~$ sudo mdadm --examine /dev/sdg1
/dev/sdg1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : d5f6a94e:185828ec:b1902148:b8793263
  Creation Time : Tue Feb 15 18:47:10 2011
     Raid Level : raid5
  Used Dev Size : 1953513472 (1863.02 GiB 2000.40 GB)
     Array Size : 7814053888 (7452.06 GiB 8001.59 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0

    Update Time : Sun Aug 14 03:04:59 2016
          State : clean
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 1b029700 - correct
         Events : 105212

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3       8       81        3      active sync   /dev/sdf1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       65        1      active sync   /dev/sde1
   2     2       8       49        2      active sync   /dev/sdd1
   3     3       8       81        3      active sync   /dev/sdf1
   4     4       8        1        4      active sync   /dev/sda1
ubuntu@ubuntu:~$ sudo mdadm --examine /dev/sda
/dev/sda:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : d5f6a94e:185828ec:b1902148:b8793263
  Creation Time : Tue Feb 15 18:47:10 2011
     Raid Level : raid5
  Used Dev Size : 1953513472 (1863.02 GiB 2000.40 GB)
     Array Size : 7814053888 (7452.06 GiB 8001.59 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0

    Update Time : Sun Aug 14 03:04:59 2016
          State : clean
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 1b0296b2 - correct
         Events : 105212

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     4       8        1        4      active sync   /dev/sda1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       65        1      active sync   /dev/sde1
   2     2       8       49        2      active sync   /dev/sdd1
   3     3       8       81        3      active sync   /dev/sdf1
   4     4       8        1        4      active sync   /dev/sda1
ubuntu@ubuntu:~$ sudo mdadm --examine /dev/sda1
/dev/sda1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : d5f6a94e:185828ec:b1902148:b8793263
  Creation Time : Tue Feb 15 18:47:10 2011
     Raid Level : raid5
  Used Dev Size : 1953513472 (1863.02 GiB 2000.40 GB)
     Array Size : 7814053888 (7452.06 GiB 8001.59 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0

    Update Time : Sun Aug 14 03:04:59 2016
          State : clean
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 1b0296b2 - correct
         Events : 105212

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     4       8        1        4      active sync   /dev/sda1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       65        1      active sync   /dev/sde1
   2     2       8       49        2      active sync   /dev/sdd1
   3     3       8       81        3      active sync   /dev/sdf1
   4     4       8        1        4      active sync   /dev/sda1

Finally, I'm puzzled by the output of sudo mdadm --assemble --scan -v (verbose mode), because it seems to warn that a drive (/dev/sdf) and its first (and only) partition (/dev/sdf1) look the same, and then stops assembling. See here:

ubuntu@ubuntu:~$ sudo mdadm --assemble --scan -v
mdadm: looking for devices for /dev/md0
mdadm: Cannot assemble mbr metadata on /dev/sdh1
mdadm: Cannot assemble mbr metadata on /dev/sdh
mdadm: no recogniseable superblock on /dev/sdb5
mdadm: Cannot assemble mbr metadata on /dev/sdb2
mdadm: Cannot assemble mbr metadata on /dev/sdb1
mdadm: Cannot assemble mbr metadata on /dev/sdb
mdadm: no RAID superblock on /dev/sdg
mdadm: no RAID superblock on /dev/sdc1
mdadm: no RAID superblock on /dev/sdc
mdadm: cannot open device /dev/sr0: No medium found
mdadm: no RAID superblock on /dev/loop0
mdadm: no RAID superblock on /dev/ram15
mdadm: no RAID superblock on /dev/ram14
mdadm: no RAID superblock on /dev/ram13
mdadm: no RAID superblock on /dev/ram12
mdadm: no RAID superblock on /dev/ram11
mdadm: no RAID superblock on /dev/ram10
mdadm: no RAID superblock on /dev/ram9
mdadm: no RAID superblock on /dev/ram8
mdadm: no RAID superblock on /dev/ram7
mdadm: no RAID superblock on /dev/ram6
mdadm: no RAID superblock on /dev/ram5
mdadm: no RAID superblock on /dev/ram4
mdadm: no RAID superblock on /dev/ram3
mdadm: no RAID superblock on /dev/ram2
mdadm: no RAID superblock on /dev/ram1
mdadm: no RAID superblock on /dev/ram0
mdadm: /dev/sdg1 is identified as a member of /dev/md0, slot 3.
mdadm: /dev/sdf1 is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sdf is identified as a member of /dev/md0, slot 1.
mdadm: WARNING /dev/sdf1 and /dev/sdf appear to have very similar superblocks.
      If they are really different, please --zero the superblock on one
      If they are the same or overlap, please remove one from the
      DEVICE list in mdadm.conf.

At this point, I'm wondering what to do next:

  1. Should I fail /dev/sdg1 out of the array, repartition it to start at sector 2048, then re-add it and let the array rebuild itself? If so, what steps should I follow?
  2. Is there anything wrong with starting at sector 64? Once I have the array assembled from the Live CD by specifying the devices, is there a way to determine whether the /dev/sdg drive is operating properly in the array (see the sketch after this list)? If so, is it even worth doing #1 above, or is there a way to set the array members manually by device identifier or similar? Specifying the devices in mdadm.conf did not work.
  3. Are there any other diagnostic steps I should try?
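
On #2, these are the health checks I know of (a sketch, nothing exhaustive; substitute the five member partitions as currently enumerated):

    # overall array state and per-member status
    cat /proc/mdstat
    sudo mdadm --detail /dev/md0
    # the Events counters across all members should match
    sudo mdadm --examine /dev/sda1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 | grep -E 'Events|State'
    # drive-level health (requires smartmontools)
    sudo smartctl -a /dev/sdg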

Thanks in advance for any help!

Update - September 23, 2016

So I tried option #1 above on the drive whose partition started at sector 64. I failed the drive, removed it from the array, and repartitioned the space. Then I re-added it and let it rebuild. I also ran offline SMART tests on the drive. Everything passed, and the drive was re-added to the array without any problems.
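
For the record, the sequence was roughly the following (a sketch; /dev/sdg is the drive as enumerated at the time, and the partitioning commands assume an MBR label like the other members):

    # fail the member out of the array and remove it
    sudo mdadm /dev/md0 --fail /dev/sdg1
    sudo mdadm /dev/md0 --remove /dev/sdg1
    # repartition so the partition starts at sector 2048 like the others
    sudo parted --script /dev/sdg mklabel msdos
    sudo parted --script /dev/sdg mkpart primary 2048s 100%
    sudo parted --script /dev/sdg set 1 raid on
    # re-add the partition and watch the rebuild
    sudo mdadm /dev/md0 --add /dev/sdg1
    cat /proc/mdstat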

I don't know what prompted my next step, but I tried selecting different kernel revisions from the grub menu. Through the advanced boot options, I could not boot 3.13.0-92-generic, and I could not boot 3.13.0-86-generic either. Both went into the same infinite loop.

However, I can boot 3.13.0-63-generic, and seemingly every kernel older than that as well (though I haven't tested them all). Clearly the system isn't working 100%: it gets me to the GUI, but I can't log in there; I have to switch to a terminal and log in that way. Still, the array is mounted and ready, and Samba runs fine.
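
As a stopgap, it should be possible to pin the working kernel as the default in /etc/default/grub (the menu entry title below is my best guess; it has to match the titles actually present in grub.cfg):

    # in /etc/default/grub -- pin the known-good kernel while investigating
    GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 3.13.0-63-generic"

    # then regenerate grub.cfg
    sudo update-grub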

So my next thought was to look at what differs between the initrd images. I unpacked the non-working and the working images and compared all of the non-binary files, and although I'm new at this, I didn't spot anything wrong.
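
For anyone repeating the comparison, the unpacking went roughly like this (assuming plain gzip-compressed images with no prepended microcode archive):

    # unpack a working and a non-working image side by side
    mkdir -p /tmp/initrd-good /tmp/initrd-bad
    cd /tmp/initrd-good && zcat /boot/initrd.img-3.13.0-63-generic | cpio -id
    cd /tmp/initrd-bad  && zcat /boot/initrd.img-3.13.0-92-generic | cpio -id
    # list what differs, then diff interesting text files by hand
    diff -rq /tmp/initrd-good /tmp/initrd-bad
    diff /tmp/initrd-good/etc/mdadm/mdadm.conf /tmp/initrd-bad/etc/mdadm/mdadm.conf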

At this point it seems to come down to a difference between the kernel images, but that's beyond my knowledge and I'm not sure what to do next.

Please help?
