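I have a QNAP running Ubuntu with its drives in an MD software RAID, but after a reboot the server dropped into emergency mode (I recovered by removing the RAID entry from fstab).
(The LEDs on the front do not indicate an error on any drive, but I think those were controlled by software from the original QNAP installation, and I am now running Ubuntu 16.04.)
Since then I have tried to mount md0 manually, but to no avail.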
server@server:~$ sudo mount -t ext4 /dev/md0 /home/media
mount: wrong fs type, bad option, bad superblock on /dev/md0,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so.
server@server:~$ dmesg | tail
[ 42.878727] igb 0000:0c:00.0 enp12s0: igb: enp12s0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[ 42.878998] IPv6: ADDRCONF(NETDEV_CHANGE): enp12s0: link becomes ready
[ 45.695936] pcieport 0000:03:00.0: System wakeup enabled by ACPI
[ 45.698592] pcieport 0000:03:00.0: System wakeup enabled by ACPI
[ 50.744434] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
[ 50.753617] NFSD: starting 90-second grace period (net ffffffff81ef5e80)
[ 397.457988] EXT4-fs (md0): unable to read superblock
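I then went on to check the drives, and fdisk does not show any relevant errors. /dev/sde is my boot SSD; it was imaged from an image smaller than the drive actually is, hence a few errors, but that is not relevant here.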
sudo fdisk -l
Disk /dev/sda: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk /dev/sdb: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk /dev/sdc: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk /dev/sdd: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
GPT PMBR size mismatch (125045423 != 250069679) will be corrected by w(rite).
Disk /dev/sde: 119.2 GiB, 128035676160 bytes, 250069680 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 812B4B47-96F5-4815-9A7A-39420846C178
Device Start End Sectors Size Type
/dev/sde1 2048 1050623 1048576 512M EFI System
/dev/sde2 1050624 59643903 58593280 28G Linux filesystem
/dev/sde3 116881408 125044735 8163328 3.9G Linux swap
/dev/sde4 59643904 116881407 57237504 27.3G Linux filesystem
Partition table entries are not in disk order.
So I went to check mdadm, which tells me the RAID is in an "inactive" state.
server@server:~$ sudo mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Raid Level : raid0
Total Devices : 4
Persistence : Superblock is persistent
State : inactive
Name : lavie-server:0 (local to host lavie-server)
UUID : 6d7fc4d9:6ca640d1:14235985:d87224f7
Events : 256957
Number Major Minor RaidDevice
- 8 0 - /dev/sda
- 8 16 - /dev/sdb
- 8 32 - /dev/sdc
- 8 48 - /dev/sdd
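The same "inactive" state can also be read from /proc/mdstat, which is where the kernel reports md arrays. The following is a minimal sketch; the function name `inactive_arrays` is made up for illustration, and it assumes the usual `mdX : inactive ...` line layout of that file.

```shell
# List md arrays that the kernel reports as inactive.
# Reads /proc/mdstat by default; another file can be passed for testing.
inactive_arrays() {
    # Array lines look like "md0 : inactive sdd[3](S) ...":
    # field 1 is the array name, field 2 a colon, field 3 the state.
    awk '$2 == ":" && $3 == "inactive" { print $1 }' "${1:-/proc/mdstat}"
}
```

Calling `inactive_arrays` with no argument prints one array name per line, or nothing if every array is running.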
Then I examined each drive (on three of the drives one device appears to be missing from the array state, but on one drive all 4 show up):
sudo mdadm --examine /dev/sda
/dev/sda:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 6d7fc4d9:6ca640d1:14235985:d87224f7
Name : lavie-server:0 (local to host lavie-server)
Creation Time : Wed May 10 12:13:27 2017
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
Array Size : 11720661504 (11177.69 GiB 12001.96 GB)
Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262056 sectors, after=688 sectors
State : active
Device UUID : d17b6e14:6cfa14ec:d39da457:eb30892e
Internal Bitmap : 8 sectors from superblock
Update Time : Fri Dec 28 14:35:42 2018
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : f137feca - correct
Events : 256957
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 0
Array State : AA.A ('A' == active, '.' == missing, 'R' == replacing)
(Next)
sudo mdadm --examine /dev/sdb
/dev/sdb:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 6d7fc4d9:6ca640d1:14235985:d87224f7
Name : lavie-server:0 (local to host lavie-server)
Creation Time : Wed May 10 12:13:27 2017
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
Array Size : 11720661504 (11177.69 GiB 12001.96 GB)
Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262056 sectors, after=688 sectors
State : active
Device UUID : 7111c8f9:b25a4240:7c06be59:ef2a90b5
Internal Bitmap : 8 sectors from superblock
Update Time : Fri Dec 28 14:35:42 2018
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : b3df575f - correct
Events : 256957
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 1
Array State : AA.A ('A' == active, '.' == missing, 'R' == replacing)
(Next)
sudo mdadm --examine /dev/sdc
/dev/sdc:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 6d7fc4d9:6ca640d1:14235985:d87224f7
Name : lavie-server:0 (local to host lavie-server)
Creation Time : Wed May 10 12:13:27 2017
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
Array Size : 11720661504 (11177.69 GiB 12001.96 GB)
Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262056 sectors, after=688 sectors
State : clean
Device UUID : 44b103a3:825be8ea:3d05c937:5f9dfa12
Internal Bitmap : 8 sectors from superblock
Update Time : Mon Dec 24 13:49:34 2018
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : 432ce5b5 - correct
Events : 47370
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 2
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
(Next)
sudo mdadm --examine /dev/sdd
/dev/sdd:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 6d7fc4d9:6ca640d1:14235985:d87224f7
Name : lavie-server:0 (local to host lavie-server)
Creation Time : Wed May 10 12:13:27 2017
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
Array Size : 11720661504 (11177.69 GiB 12001.96 GB)
Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262056 sectors, after=688 sectors
State : active
Device UUID : 64a2131e:910e477d:1e3f89c3:fe1fc2e7
Internal Bitmap : 8 sectors from superblock
Update Time : Fri Dec 28 14:35:42 2018
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : b12cc9e4 - correct
Events : 256957
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 3
Array State : AA.A ('A' == active, '.' == missing, 'R' == replacing)
So I suspect something is wrong with /dev/sdc, since it shows "clean" instead of "active".
Status from lshw:
sudo lshw -class disk -class storage
*-usb:1
description: Mass storage device
product: AS2115
vendor: ASMedia
physical id: 4
bus info: usb@1:4
logical name: scsi8
version: 0.01
serial: 00000000000000000000
capabilities: usb-2.10 scsi emulated scsi-host
configuration: driver=usb-storage speed=480Mbit/s
*-disk
description: SCSI Disk
product: 2115
vendor: ASMT
physical id: 0.0.0
bus info: scsi@8:0.0.0
logical name: /dev/sde
version: 0
serial: 00000000000000000000
size: 119GiB (128GB)
capabilities: gpt-1.00 partitioned partitioned:gpt
configuration: ansiversion=6 guid=812b4b47-96f5-4815-9a7a-39420846c178 logicalsectorsize=512 sectorsize=512
*-storage
description: SATA controller
product: Marvell Technology Group Ltd.
vendor: Marvell Technology Group Ltd.
physical id: 0
bus info: pci@0000:01:00.0
version: 11
width: 32 bits
clock: 33MHz
capabilities: storage pm msi pciexpress ahci_1.0 bus_master cap_list rom
configuration: driver=ahci latency=0
resources: irq:272 ioport:d050(size=8) ioport:d040(size=4) ioport:d030(size=8) ioport:d020(size=4) ioport:d000(size=32) memory:90b10000-90b107ff memory:90b00000-90b0ffff
*-storage
description: SATA controller
product: Marvell Technology Group Ltd.
vendor: Marvell Technology Group Ltd.
physical id: 0
bus info: pci@0000:02:00.0
version: 11
width: 32 bits
clock: 33MHz
capabilities: storage pm msi pciexpress ahci_1.0 bus_master cap_list rom
configuration: driver=ahci latency=0
resources: irq:278 ioport:c050(size=8) ioport:c040(size=4) ioport:c030(size=8) ioport:c020(size=4) ioport:c000(size=32) memory:90a10000-90a107ff memory:90a00000-90a0ffff
*-scsi:0
physical id: 1
logical name: scsi0
capabilities: emulated
*-disk
description: ATA Disk
product: WDC WD40EFRX-68W
vendor: Western Digital
physical id: 0.0.0
bus info: scsi@0:0.0.0
logical name: /dev/sda
version: 0A82
serial: WD-WCC4E1SSPZV8
size: 3726GiB (4TB)
capabilities: removable
configuration: ansiversion=5 logicalsectorsize=512 sectorsize=4096
*-medium
physical id: 0
logical name: /dev/sda
size: 3726GiB (4TB)
*-scsi:1
physical id: 2
logical name: scsi3
capabilities: emulated
*-disk
description: ATA Disk
product: WDC WD40EFRX-68W
vendor: Western Digital
physical id: 0.0.0
bus info: scsi@3:0.0.0
logical name: /dev/sdb
version: 0A82
serial: WD-WCC4E3HS69CC
size: 3726GiB (4TB)
capabilities: removable
configuration: ansiversion=5 logicalsectorsize=512 sectorsize=4096
*-medium
physical id: 0
logical name: /dev/sdb
size: 3726GiB (4TB)
*-scsi:2
physical id: 3
logical name: scsi4
capabilities: emulated
*-disk
description: ATA Disk
product: WDC WD40EFRX-68W
vendor: Western Digital
physical id: 0.0.0
bus info: scsi@4:0.0.0
logical name: /dev/sdc
version: 0A82
serial: WD-WCC4E3VNJ5R2
size: 3726GiB (4TB)
capabilities: removable
configuration: ansiversion=5 logicalsectorsize=512 sectorsize=4096
*-medium
physical id: 0
logical name: /dev/sdc
size: 3726GiB (4TB)
*-scsi:3
physical id: 4
logical name: scsi7
capabilities: emulated
*-disk
description: ATA Disk
product: WDC WD40EFRX-68W
vendor: Western Digital
physical id: 0.0.0
bus info: scsi@7:0.0.0
logical name: /dev/sdd
version: 0A82
serial: WD-WCC4E1TJ37PK
size: 3726GiB (4TB)
capabilities: removable
configuration: ansiversion=5 logicalsectorsize=512 sectorsize=4096
*-medium
physical id: 0
logical name: /dev/sdd
size: 3726GiB (4TB)
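I have put the drive status at https://pastebin.com/bstnDcHe because it no longer fits in this post (over 30,000 characters).
I have also tried the following (which I found in another post):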
server@server:~$ sudo mdadm --stop /dev/md0
mdadm: stopped /dev/md0
server@server:~$ sudo mdadm --assemble --scan --verbose
mdadm: looking for devices for /dev/md0
mdadm: no recogniseable superblock on /dev/sde4
mdadm: no recogniseable superblock on /dev/sde3
mdadm: no recogniseable superblock on /dev/sde2
mdadm: Cannot assemble mbr metadata on /dev/sde1
mdadm: Cannot assemble mbr metadata on /dev/sde
mdadm: /dev/sdd is identified as a member of /dev/md0, slot 3.
mdadm: /dev/sdc is identified as a member of /dev/md0, slot 2.
mdadm: /dev/sdb is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sda is identified as a member of /dev/md0, slot 0.
mdadm: added /dev/sdb to /dev/md0 as 1
mdadm: added /dev/sdc to /dev/md0 as 2 (possibly out of date)
mdadm: added /dev/sdd to /dev/md0 as 3
mdadm: added /dev/sda to /dev/md0 as 0
mdadm: /dev/md0 assembled from 3 drives - not enough to start the array while not clean - consider --force.
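It does not work, and it gives me no indication of a drive failure.
So I am not sure whether a drive has failed (and should therefore be replaced) or whether I can go ahead with "--force", which sounds as if it could damage something should the drive turn out to be faulty after all.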
Answer 1
/dev/sdc does not appear to have failed (it is in a clean state), but it is badly out of sync:
Events : 47370
compared with the other 3 disks, which are all in sync with each other:
Events : 256957
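The event counters can be put side by side with a small helper. This is only a sketch: `events_of` and `compare_events` are hypothetical names, and the parsing assumes the `Events : N` line format seen in the `mdadm --examine` output in the question.

```shell
# Read `mdadm --examine` output on stdin and print the event counter.
events_of() {
    # Split on the colon; match only the line whose key is exactly "Events".
    awk -F' *: *' '$1 ~ /^ *Events$/ { print $2 }'
}

# Print "<device> <events>" for each device given (mdadm needs root).
compare_events() {
    for dev in "$@"; do
        printf '%s %s\n' "$dev" "$(mdadm --examine "$dev" | events_of)"
    done
}
```

Run as root, `compare_events /dev/sda /dev/sdb /dev/sdc /dev/sdd` should, going by the outputs in the question, report 256957 for sda, sdb and sdd and 47370 for sdc. The Update Time fields tell the same story: sdc was last updated on Dec 24, the other three on Dec 28.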
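Forcing should help, but to be safe and to do it in a more controlled way, I would fail /dev/sdc, restart the array with only the 3 healthy disks, and then add /dev/sdc back (it will resync).
The commands are as follows: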
sudo mdadm --manage /dev/md0 --fail /dev/sdc
sudo mdadm --manage /dev/md0 --remove /dev/sdc
sudo mdadm --assemble --scan --verbose
sudo mdadm --manage /dev/md0 --add /dev/sdc
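If the `--fail` and `--remove` steps are rejected because the array is not running, one possible variant is to stop the array and assemble it explicitly from the three in-sync members only. The sketch below prints the commands and does not run them, so they can be reviewed first. `plan_recovery` is a hypothetical helper, the read-only mount on /home/media is an added sanity check that is not part of the answer above, and `--force` is the option suggested by the assemble message in the question.

```shell
# Print (do not run) a recovery sequence for an inactive md array with one
# stale member. Arguments: array, stale member, then the in-sync members.
plan_recovery() {
    array=$1
    stale=$2
    shift 2
    # Release the member disks held by the inactive array.
    echo "mdadm --stop $array"
    # Listing only the in-sync members keeps the stale disk out of the
    # initial assembly; the array starts degraded.
    echo "mdadm --assemble --force $array $*"
    # Confirm the array is active before touching the data.
    echo "cat /proc/mdstat"
    # Assumption: look at the filesystem read-only before rebuilding.
    echo "mount -o ro $array /home/media"
    # Add the stale disk back so it is rebuilt from the other three.
    echo "mdadm --manage $array --add $stale"
}

plan_recovery /dev/md0 /dev/sdc /dev/sda /dev/sdb /dev/sdd
```

Each printed line would need to be run as root, one at a time, checking the result of each step. The rebuild of /dev/sdc can then be followed in /proc/mdstat.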