I have a dedicated server with two hard drives, each carrying 5 partitions.
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 0 1.8T 0 disk
├─sda1 8:1 0 1M 0 part
├─sda2 8:2 0 488M 0 part /boot/efi
├─sda3 8:3 0 7.6G 0 part
│ └─md0 9:0 0 15.3G 0 raid0 [SWAP]
├─sda4 8:4 0 977M 0 part
└─sda5 8:5 0 1.8T 0 part
sdb 8:16 0 1.8T 0 disk
├─sdb1 8:17 0 1M 0 part
├─sdb2 8:18 0 488M 0 part
├─sdb3 8:19 0 7.6G 0 part
│ └─md0 9:0 0 15.3G 0 raid0 [SWAP]
├─sdb4 8:20 0 977M 0 part
│ └─md1 9:1 0 976.4M 0 raid1 /boot
└─sdb5 8:21 0 1.8T 0 part
└─md2 9:2 0 1.8T 0 raid1 /
After the server failed to reboot because a disk could not be recognized, I checked my RAID configuration and noticed that /dev/md2 (which appears to contain the operating system and all the data) had one disk removed: only /dev/sdb5 is listed as an active device, and the second disk shows as removed.
sudo mdadm --detail /dev/md2
/dev/md2:
Version : 1.2
Creation Time : Fri Jan 3 09:18:45 2020
Raid Level : raid1
Array Size : 1943880704 (1853.83 GiB 1990.53 GB)
Used Dev Size : 1943880704 (1853.83 GiB 1990.53 GB)
Raid Devices : 2
Total Devices : 1
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Sun Sep 17 09:11:34 2023
State : active, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
Consistency Policy : bitmap
Name : srv10135:2 (local to host srv10135)
UUID : 2dcfef18:2714aa4a:7a708454:42eb8813
Events : 219026
Number Major Minor RaidDevice State
- 0 0 0 removed
1 8 21 1 active sync /dev/sdb5
Examining the two partitions (/dev/sda5 and /dev/sdb5), I found that they both belong to the same RAID array and are both marked as active:
sudo mdadm --examine /dev/sda5
/dev/sda5:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 2dcfef18:2714aa4a:7a708454:42eb8813
Name : srv10135:2 (local to host srv10135)
Creation Time : Fri Jan 3 09:18:45 2020
Raid Level : raid1
Raid Devices : 2
Avail Dev Size : 3887761408 sectors (1853.83 GiB 1990.53 GB)
Array Size : 1943880704 KiB (1853.83 GiB 1990.53 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262056 sectors, after=0 sectors
State : active
Device UUID : d2b34f96:e2f0a43b:d3e85f43:b8cf7ea4
Internal Bitmap : 8 sectors from superblock
Update Time : Sat Sep 16 11:19:49 2023
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : 27e920f0 - correct
Events : 189131
Device Role : Active device 0
Array State : AA ('A' == active, '.' == missing, 'R' == replacing)
sudo mdadm --examine /dev/sdb5
/dev/sdb5:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 2dcfef18:2714aa4a:7a708454:42eb8813
Name : srv10135:2 (local to host srv10135)
Creation Time : Fri Jan 3 09:18:45 2020
Raid Level : raid1
Raid Devices : 2
Avail Dev Size : 3887761408 sectors (1853.83 GiB 1990.53 GB)
Array Size : 1943880704 KiB (1853.83 GiB 1990.53 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262056 sectors, after=0 sectors
State : active
Device UUID : 78918ed5:59a78605:df84fcf7:91ba926b
Internal Bitmap : 8 sectors from superblock
Update Time : Sun Sep 17 09:16:06 2023
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : acbce60c - correct
Events : 219236
Device Role : Active device 1
Array State : .A ('A' == active, '.' == missing, 'R' == replacing)
Now I would like to know whether mdadm --detail /dev/md2 is showing wrong information about the RAID members, or whether /dev/sda5 is really missing from the RAID, and whether it can safely be re-added to the array.
Answer 1
In general, /proc/mdstat shows the true current RAID state.
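You can check it directly; for a healthy two-disk RAID1 the status line shows [2/2] [UU], while a degraded one shows [2/1] with _ marking the missing member:
cat /proc/mdstat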
sda5 is indeed missing from the RAID array: note how its event count (189131) is much lower than that of sdb5 (219236). Also note how sdb5 sees the array state as .A, i.e. with the first disk missing.
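To compare the two members side by side, mdadm --examine accepts multiple devices, and filtering the relevant fields with grep is one convenient way to do it:
sudo mdadm --examine /dev/sda5 /dev/sdb5 | grep -E '^/dev|Events|Array State'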
You need to re-add sda5 to the array, issuing something similar to mdadm /dev/md2 -a /dev/sda5 (triple-check this command before issuing it).
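Since sda dropped out of the array once already, it may also be worth checking the disk's health before re-adding it (assuming smartmontools is installed):
sudo smartctl -a /dev/sda
After the re-add, mdadm resyncs the member automatically; you can monitor the progress with, e.g.:
watch -n 5 cat /proc/mdstat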