On my last two reboots, the RAID5 on this server had problems; CentOS did eventually boot, but only after 10-15 minutes. Is there a way to scan the RAID and repair it? I read the man page and noticed mdadm --scan, but my command failed. The RAID doesn't hold any data or anything yet.
Could this be a failing hard drive? Two of the three drives are quite old, though rarely used.
I also saw the assemble option, but the array is already assembled, isn't it?
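For reference, --scan is not a standalone mode; it modifies other mdadm modes by pulling device lists from mdadm.conf and /proc/mdstat. A minimal sketch of typical invocations (nothing here is specific to this machine):

# Print a config-file style summary of all running arrays
mdadm --detail --scan

# Report superblocks found on all configured/known devices
mdadm --examine --scan

# Assemble every array listed in mdadm.conf (no-op for arrays already running)
mdadm --assemble --scan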
cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdd[1] sde[3] sdc[0]
3906762752 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]
bitmap: 0/15 pages [0KB], 65536KB chunk
and
mdadm --detail /dev/md?*
/dev/md0:
Version : 1.2
Creation Time : Sat Jul 4 00:09:25 2020
Raid Level : raid5
Array Size : 3906762752 (3725.78 GiB 4000.53 GB)
Used Dev Size : 1953381376 (1862.89 GiB 2000.26 GB)
Raid Devices : 3
Total Devices : 3
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Wed Sep 30 22:49:32 2020
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Consistency Policy : bitmap
Name : orcacomputers.orcainbox:0 (local to host orcacomputers.orcainbox)
UUID : 4ca9118c:3a557d0f:db723ff2:e8b9a521
Events : 5327
Number Major Minor RaidDevice State
0 8 32 0 active sync /dev/sdc
1 8 48 1 active sync /dev/sdd
3 8 64 2 active sync /dev/sde
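As an aside, the usual way to "scan and repair" an assembled md array is a scrub through the sysfs sync_action interface; a sketch, using md0 from the output above:

# Start a consistency check (reads all members and compares parity)
echo check > /sys/block/md0/md/sync_action

# Watch progress
cat /proc/mdstat

# After it finishes, number of inconsistencies found
cat /sys/block/md0/md/mismatch_cnt

# Rewrite inconsistent parity instead of just counting it
echo repair > /sys/block/md0/md/sync_action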
A bit later,
lsblk /dev/sdf
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdf 8:80 0 7.3T 0 disk
and
mdadm --examine /dev/sdf*
mdadm: No md superblock detected on /dev/sdf
lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 447.1G 0 disk
├─sda1 8:1 0 200M 0 part /boot/efi
├─sda2 8:2 0 1G 0 part /boot
└─sda3 8:3 0 446G 0 part
├─centos-root 253:0 0 50G 0 lvm /
├─centos-swap 253:1 0 31.4G 0 lvm [SWAP]
└─centos-home 253:2 0 364.5G 0 lvm /home
sdb 8:16 0 447.1G 0 disk /run/media/orca/ssd2
sdc 8:32 0 1.8T 0 disk
└─md0 9:0 0 3.7T 0 raid5 /mnt/raid5
sdd 8:48 0 1.8T 0 disk
└─md0 9:0 0 3.7T 0 raid5 /mnt/raid5
sde 8:64 0 1.8T 0 disk
└─md0 9:0 0 3.7T 0 raid5 /mnt/raid5
sdf 8:80 0 7.3T 0 disk
The drives look fine; none report failures:
/dev/sda
smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1127.19.1.el7.x86_64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
----------------------------------------------------------------------------------------
/dev/sdb
smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1127.19.1.el7.x86_64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
----------------------------------------------------------------------------------------
/dev/sdc
smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1127.19.1.el7.x86_64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
----------------------------------------------------------------------------------------
/dev/sdd
smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1127.19.1.el7.x86_64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
----------------------------------------------------------------------------------------
/dev/sde
smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1127.19.1.el7.x86_64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
----------------------------------------------------------------------------------------
/dev/sdf
smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1127.19.1.el7.x86_64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
Read Device Identity failed: scsi error unsupported field in scsi command
/dev/sdf is the combination of sdc, sdd and sde.
Here is the drive list:
/dev/sda 447.1G
/dev/sdb 447.1G
/dev/sdc 1.8T
/dev/sdd 1.8T
/dev/sde 1.8T
/dev/sdf 7.3T
Thanks for your help. I did not set up this RAID5.
Answer 1
/dev/sdf is failing and needs to be replaced. The buffer I/O errors come from the disk, not from mdadm.
As far as I know, mdadm cannot repair this, because there is nothing for it to repair. There may be a verify option (I don't think it would help you here), and, of course, you can remove the failed device and add a new one.
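Replacing a failed array member would look roughly like this. A sketch only: /dev/sdX and /dev/sdY are placeholders, since per the mdadm --detail output above /dev/sdf is not actually a member of md0.

# Mark the member as failed, then pull it from the array
mdadm /dev/md0 --fail /dev/sdX
mdadm /dev/md0 --remove /dev/sdX

# Add the replacement; md rebuilds onto it automatically
mdadm /dev/md0 --add /dev/sdY

# Follow the rebuild
watch cat /proc/mdstat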
Answer 2
The photo of the console shows numerous RAID errors coming from /dev/sdf. The output of /proc/mdstat shows that the RAID5 in question is built from /dev/sdd, /dev/sde, and /dev/sdc, and is not missing any component.
So the RAID5 is not at risk. You have something else using /dev/sdf, and that something is going to lose data. (An unassembled RAID array, perhaps?)
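To find out what /dev/sdf actually holds before replacing it, a few non-destructive checks (a sketch; none of these write to the disk):

# Kernel messages mentioning the disk
dmesg | grep sdf

# Filesystem / RAID signatures, if any survive
lsblk -f /dev/sdf
blkid -p /dev/sdf
wipefs /dev/sdf        # with no options, wipefs only lists signatures

# Full SMART dump, in case the summary above masked errors
smartctl -x /dev/sdf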