I created a raidz1-0 pool with three devices. Two of the devices were added via their /dev/disk/by-id IDs, and for some reason I decided to use /dev/sdg1 for the third one.
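Roughly, the original create probably looked like this (a reconstruction based on the zdb output further down, not the exact command I ran back then):
# zpool create safe00 raidz1 \
    /dev/disk/by-id/ata-ST3500418AS_9VM89VGD \
    /dev/sdg1 \
    /dev/disk/by-id/ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E1NYTHJF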
A few years and one reboot later, I cannot get all three devices online again. This is the current state:
# zpool status safe00
pool: safe00
state: DEGRADED
status: One or more devices has been taken offline by the administrator.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Online the device using 'zpool online' or replace the device with
'zpool replace'.
scan: scrub repaired 0 in 2h54m with 0 errors on Sun Jan 12 03:18:13 2020
config:
    NAME                                        STATE     READ WRITE CKSUM
    safe00                                      DEGRADED     0     0     0
      raidz1-0                                  DEGRADED     0     0     0
        ata-ST3500418AS_9VM89VGD                ONLINE       0     0     0
        13759036004139463181                    OFFLINE      0     0     0  was /dev/sdg1
        ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E1NYTHJF  ONLINE     0     0     0
errors: No known data errors
The drives in this machine:
# lsblk -f
NAME     FSTYPE     LABEL       UUID                                 MOUNTPOINT
sda
├─sda1   ext4       Ubuntu LTS  8a2a3c19-580a-474d-b248-bf0822cacab6 /
├─sda2   vfat                   B55A-693E                            /boot/efi
└─sda3   swap       swap        7d1cf001-07a6-4534-9624-054d70a562d5 [SWAP]
sdb      zfs_member dump        11482263899067190471
├─sdb1   zfs_member dump        866164895581740988
└─sdb9   zfs_member dump        11482263899067190471
sdc
sdd
├─sdd1   zfs_member dump        866164895581740988
└─sdd9
sde      zfs_member dump        866164895581740988
├─sde1   zfs_member safe00      6143939454380723991
└─sde2   zfs_member dump        866164895581740988
sdf
├─sdf1   zfs_member dump        866164895581740988
└─sdf9
sdg
├─sdg1   zfs_member safe00      6143939454380723991
└─sdg9
sdh
├─sdh1   zfs_member safe00      6143939454380723991
└─sdh9
That is, safe00 should consist of three devices: sde1, sdg and sdh.
Just to get the mapping between by-id and path:
# cd /dev/disk/by-id
# ls -la ata* | cut -b 40- | awk '{split($0, a, " "); print a[3],a[2],a[1]}' | sort -h
../../sda1 -> ata-INTEL_SSDSC2KW120H6_BTLT712507HK120GGN-part1
../../sda2 -> ata-INTEL_SSDSC2KW120H6_BTLT712507HK120GGN-part2
../../sda3 -> ata-INTEL_SSDSC2KW120H6_BTLT712507HK120GGN-part3
../../sda -> ata-INTEL_SSDSC2KW120H6_BTLT712507HK120GGN
../../sdb1 -> ata-WDC_WD20EARX-00PASB0_WD-WCAZAE573068-part1
../../sdb9 -> ata-WDC_WD20EARX-00PASB0_WD-WCAZAE573068-part9
../../sdb -> ata-WDC_WD20EARX-00PASB0_WD-WCAZAE573068
../../sdc -> ata-SAMSUNG_HD204UI_S2H7JD1ZA21911
../../sdd1 -> ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E0416553-part1
../../sdd9 -> ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E0416553-part9
../../sdd -> ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E0416553
../../sde1 -> ata-ST6000VN0033-2EE110_ZAD5S9M9-part1
../../sde2 -> ata-ST6000VN0033-2EE110_ZAD5S9M9-part2
../../sde -> ata-ST6000VN0033-2EE110_ZAD5S9M9
../../sdf1 -> ata-WDC_WD10EADS-00L5B1_WD-WCAU4C151323-part1
../../sdf9 -> ata-WDC_WD10EADS-00L5B1_WD-WCAU4C151323-part9
../../sdf -> ata-WDC_WD10EADS-00L5B1_WD-WCAU4C151323
../../sdg1 -> ata-ST3500418AS_9VM89VGD-part1
../../sdg9 -> ata-ST3500418AS_9VM89VGD-part9
../../sdg -> ata-ST3500418AS_9VM89VGD
../../sdh1 -> ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E1NYTHJF-part1
../../sdh9 -> ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E1NYTHJF-part9
../../sdh -> ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E1NYTHJF
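For reference, a shorter way to get essentially the same model/serial-to-device mapping (assuming a reasonably recent util-linux) is:
# lsblk -o NAME,MODEL,SERIAL
The ata-<model>_<serial> names under /dev/disk/by-id are essentially built from those two fields.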
And zdb (I have added a few small annotations):
# zdb -C safe00
MOS Configuration:
        version: 5000
        name: 'safe00'
        state: 0
        txg: 22826770
        pool_guid: 6143939454380723991
        errata: 0
        hostname: 'filserver'
        vdev_children: 1
        vdev_tree:
            type: 'root'
            id: 0
            guid: 6143939454380723991
            children[0]:
                type: 'raidz'
                id: 0
                guid: 9801294574244764778
                nparity: 1
                metaslab_array: 33
                metaslab_shift: 33
                ashift: 12
                asize: 1500281044992
                is_log: 0
                create_txg: 4
                children[0]:
                    type: 'disk'
                    id: 0
                    guid: 135921832921042063
                    path: '/dev/disk/by-id/ata-ST3500418AS_9VM89VGD-part1'
                    whole_disk: 1
                    DTL: 58
                    create_txg: 4
                children[1]:              ### THIS CHILD USED TO BE sdg1
                    type: 'disk'
                    id: 1
                    guid: 13759036004139463181
                    path: '/dev/sdg1'
                    whole_disk: 0
                    not_present: 1        ### THIS IS sde1 NOW
                    DTL: 52
                    create_txg: 4
                    offline: 1
                children[2]:              ### THIS CHILD IS NOW sdg1
                    type: 'disk'
                    id: 2
                    guid: 2522190573401341943
                    path: '/dev/disk/by-id/ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E1NYTHJF-part1'
                    whole_disk: 1
                    DTL: 57
                    create_txg: 4
        features_for_read:
            com.delphix:hole_birth
            com.delphix:embedded_data
space map refcount mismatch: expected 178 != actual 177
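To pull out just the interesting fields without reading through the whole tree, something like this works:
# zdb -C safe00 | grep -E 'guid|path|offline|not_present'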
Summary of the pool safe00:
offline: sde1 --> ata-ST6000VN0033-2EE110_ZAD5S9M9-part1 <-- this likely was sdg1 before reboot
online: sdg1 --> ata-ST3500418AS_9VM89VGD
online: sdh1 --> ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E1NYTHJF
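To double-check that mapping without cross-referencing by hand, zpool status can also print full device paths or vdev GUIDs directly (both options exist in ZFS on Linux 0.7 and later, as far as I know):
# zpool status -P safe00   ### full /dev/... paths instead of short names
# zpool status -g safe00   ### vdev GUIDs instead of device names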
Trying to bring the offline device online:
# zpool online safe00 ata-ST6000VN0033-2EE110_ZAD5S9M9-part1
cannot online ata-ST6000VN0033-2EE110_ZAD5S9M9-part1: no such device in pool
# zpool online safe00 /dev/sde1
cannot online /dev/sde1: no such device in pool
I also tried replacing the offline device with the real device:
# zpool replace safe00 13759036004139463181 ata-ST6000VN0033-2EE110_ZAD5S9M9-part1
invalid vdev specification
use '-f' to override the following errors:
/dev/disk/by-id/ata-ST6000VN0033-2EE110_ZAD5S9M9-part1 is part of active pool 'safe00'
# zpool replace safe00 /dev/sdg1 ata-ST6000VN0033-2EE110_ZAD5S9M9-part1
invalid vdev specification
use '-f' to override the following errors:
/dev/disk/by-id/ata-ST6000VN0033-2EE110_ZAD5S9M9-part1 is part of active pool 'safe00'
So in the end I tried to bring the missing device online by its ID:
# zpool online safe00 13759036004139463181
warning: device '13759036004139463181' onlined, but remains in faulted state
use 'zpool replace' to replace devices that are no longer present
Happily, this put the disk into a faulted state and a repair started.
# zpool status safe00
pool: safe00
state: DEGRADED
status: One or more devices could not be used because the label is missing or
invalid. Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: http://zfsonlinux.org/msg/ZFS-8000-4J
scan: scrub in progress since Sun Feb 23 11:19:00 2020
14.3G scanned out of 1.09T at 104M/s, 3h0m to go
0 repaired, 1.29% done
config:
    NAME                                        STATE     READ WRITE CKSUM
    safe00                                      DEGRADED     0     0     0
      raidz1-0                                  DEGRADED     0     0     0
        ata-ST3500418AS_9VM89VGD                ONLINE       0     0     0
        13759036004139463181                    FAULTED      0     0     0  was /dev/sdg1
        ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E1NYTHJF  ONLINE     0     0     0
errors: No known data errors
What should I do to prevent this from happening again? That is, how do I change the 'path' property of the device shown by zdb so that it no longer depends on how Linux enumerates the disks at boot?
Answer 1
The most robust approach is probably to create the pool using GUIDs or GPT labels; personally I consider GPT labels the better solution, as described in Best practices for designating disks (vdevs) for a ZFS pool in 2021. A label such as
data-1-sces3-3tb-Z1Y0P0DK
follows the pattern
<pool>-<pool-id>-<disk-vendor-and-model-name>-<size-of-disk>-<disk-serial-number>
Naming the labels this way helps you to:
- easily understand the topology that defines the pool;
- easily find the vendor and model name of the drives used;
- easily find the capacity of each disk;
- easily identify and locate a bad disk in the drive cage, since the serial number printed on the drive is part of the GPT label.
Other persistent ways to identify disks exist, for example using some form of ID, but they are less intuitive on their own: you cannot easily locate a disk from its electronic ID alone, you have to map the ID to its physical position yourself.
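As a rough sketch of how such labels can be applied and used (the device name /dev/sdX is a placeholder and the label is the example from above; sgdisk is part of the gdisk package):
### name partition 1 on the disk with a GPT partition label
# sgdisk -c 1:data-1-sces3-3tb-Z1Y0P0DK /dev/sdX
### once udev has picked up the new label it shows up here
# ls -l /dev/disk/by-partlabel/
### and the pool can be imported using those names
# zpool import -d /dev/disk/by-partlabel data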
I also found this helpful if you want to remap the disks in a pool: Mix of gptid and dev names in zpool status:
# zpool import -d /dev/gptid tank
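Applied to the pool in the question, the same idea on ZFS on Linux would be to export the pool and re-import it while pointing the import at /dev/disk/by-id, which rewrites the stored device paths (a sketch; do this while nothing is using the datasets):
# zpool export safe00
# zpool import -d /dev/disk/by-id safe00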