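I upgraded my server (SuperMicro X11-SSM-F, LSI SAS 9211-8i) from Ubuntu 18.04 to 20.04. The server has two zpools: one consists of a single WD Red 10TB (downloadpool), the other of 8 WD Red 10TB and 2 Seagate IronWolf 8TB drives arranged as 5x2 mirrors (masterpool). The pools were created using /dev/disk/by-id references so that they stay stable across reboots. The pools are scrubbed regularly; the last scrub was a few weeks ago and showed no errors.
When I rebooted after the update to Ubuntu 20.04, the second pool (masterpool) was gone. After I ran zpool import, the pool was imported again, but with sdX references for most of the disks (the WD Reds, but not the Seagates). The pool with the single WD Red was fine and still referenced its disk by-id. The zpool status output for masterpool looked like this (from memory):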
NAME STATE READ WRITE CKSUM
masterpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
sdb ONLINE 0 0 0
sdk ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
sdi ONLINE 0 0 0
sdf ONLINE 0 0 0
mirror-2 ONLINE 0 0 0
sdd ONLINE 0 0 0
sde ONLINE 0 0 0
mirror-3 ONLINE 0 0 0
sdh ONLINE 0 0 0
sdc ONLINE 0 0 0
mirror-4 ONLINE 0 0 0
ata-ST8000VN0022-2EL112_ZA17FZXF ONLINE 0 0 0
ata-ST8000VN0022-2EL112_ZA17H5D3 ONLINE 0 0 0
That was not ideal, because those identifiers are not stable, so after some searching online I exported the pool and then ran zpool import -d /dev/disk/by-id masterpool.
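The export and re-import step can be wrapped in a small shell function. This is only a sketch: the function name reimport_by_id and the DRY_RUN variable are made up for illustration, and the real zpool commands need root. With DRY_RUN set, the commands are only printed.

```shell
# Export a pool and import it again, scanning a stable device directory.
# Hypothetical helper; set DRY_RUN=1 to print the commands instead of running them.
reimport_by_id() {
    pool=$1
    dir=${2:-/dev/disk/by-id}
    run=${DRY_RUN:+echo}   # expands to "echo" in dry-run mode, to nothing otherwise
    $run zpool export "$pool" &&
    $run zpool import -d "$dir" "$pool" &&
    $run zpool status "$pool"
}

# Example (prints three commands, changes nothing):
#   DRY_RUN=1 reimport_by_id masterpool
```

The export comes first because the pool was already imported with sdX names; a pool that is not imported has nothing to export, so that step would fail.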
But now zpool reports checksum errors:
NAME STATE READ WRITE CKSUM
masterpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
wwn-0x5000cca26af27d8b ONLINE 0 0 2
wwn-0x5000cca273ee8907 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
wwn-0x5000cca26aeb9280 ONLINE 0 0 8
wwn-0x5000cca273eeaed7 ONLINE 0 0 0
mirror-2 ONLINE 0 0 0
wwn-0x5000cca273c21a05 ONLINE 0 0 0
wwn-0x5000cca267eaa17a ONLINE 0 0 0
mirror-3 ONLINE 0 0 0
wwn-0x5000cca26af7e655 ONLINE 0 0 0
wwn-0x5000cca273c099dd ONLINE 0 0 0
mirror-4 ONLINE 0 0 0
ata-ST8000VN0022-2EL112_ZA17FZXF ONLINE 0 0 0
ata-ST8000VN0022-2EL112_ZA17H5D3 ONLINE 0 0 0
So I started a scrub, and zfs has found more checksum errors:
pool: masterpool
state: DEGRADED
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: http://zfsonlinux.org/msg/ZFS-8000-9P
scan: scrub in progress since Fri May 22 21:47:34 2020
27.1T scanned at 600M/s, 27.0T issued at 597M/s, 31.1T total
112K repaired, 86.73% done, 0 days 02:00:45 to go
config:
NAME STATE READ WRITE CKSUM
masterpool DEGRADED 0 0 0
mirror-0 DEGRADED 0 0 0
wwn-0x5000cca26af27d8b DEGRADED 0 0 15 too many errors (repairing)
wwn-0x5000cca273ee8907 ONLINE 0 0 0
mirror-1 DEGRADED 0 0 0
wwn-0x5000cca26aeb9280 DEGRADED 0 0 18 too many errors (repairing)
wwn-0x5000cca273eeaed7 ONLINE 0 0 0
mirror-2 ONLINE 0 0 0
wwn-0x5000cca273c21a05 ONLINE 0 0 0
wwn-0x5000cca267eaa17a ONLINE 0 0 0
mirror-3 ONLINE 0 0 0
wwn-0x5000cca26af7e655 ONLINE 0 0 0
wwn-0x5000cca273c099dd ONLINE 0 0 0
mirror-4 ONLINE 0 0 0
ata-ST8000VN0022-2EL112_ZA17FZXF ONLINE 0 0 0
ata-ST8000VN0022-2EL112_ZA17H5D3 ONLINE 0 0 0
Strangely, smartctl shows nothing unusual in the SMART data (the output is similar for both disks, so only one is shown):
$ sudo smartctl /dev/disk/by-id/wwn-0x5000cca26aeb9280 -a
...
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 016 Pre-fail Always - 0
2 Throughput_Performance 0x0004 129 129 054 Old_age Offline - 112
3 Spin_Up_Time 0x0007 153 153 024 Pre-fail Always - 431 (Average 430)
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 31
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
7 Seek_Error_Rate 0x000a 100 100 067 Old_age Always - 0
8 Seek_Time_Performance 0x0004 128 128 020 Old_age Offline - 18
9 Power_On_Hours 0x0012 098 098 000 Old_age Always - 15474
10 Spin_Retry_Count 0x0012 100 100 060 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 31
22 Helium_Level 0x0023 100 100 025 Pre-fail Always - 100
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 664
193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 664
194 Temperature_Celsius 0x0002 158 158 000 Old_age Always - 41 (Min/Max 16/41)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 19 -
# 2 Short offline Completed without error 00% 0 -
...
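To check several disks without reading each table by eye, the attribute table can be filtered for the counters that usually point at media or cabling problems (IDs 5, 197, 198 and 199). The sketch below assumes the ten-column table layout shown above; the function name smart_suspects is hypothetical.

```shell
# Read `smartctl -A` style output on stdin and print NAME=RAW for
# attributes 5, 197, 198 and 199 whose raw value is non-zero.
# Only rows whose third column looks like a flag (0x....) are considered.
smart_suspects() {
    awk '$1 ~ /^(5|197|198|199)$/ && $3 ~ /^0x/ && $10 + 0 > 0 { print $2 "=" $10 }'
}

# Example (needs root and a real device):
#   sudo smartctl -A /dev/disk/by-id/wwn-0x5000cca26aeb9280 | smart_suspects
```

For the disk shown above this would print nothing, since all four raw values are 0.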
I also noticed that many of the aliases in /dev/disk/by-id are gone (all ata-* aliases for the WD Reds have disappeared, except the one for the disk in downloadpool):
# ls /dev/disk/by-id/ -l
total 0
lrwxrwxrwx 1 root root 9 May 22 23:19 ata-Samsung_SSD_850_EVO_500GB_S2RANX0H608885H -> ../../sda
lrwxrwxrwx 1 root root 10 May 22 23:19 ata-Samsung_SSD_850_EVO_500GB_S2RANX0H608885H-part1 -> ../../sda1
lrwxrwxrwx 1 root root 9 May 23 01:28 ata-ST8000VN0022-2EL112_ZA17FZXF -> ../../sdc
lrwxrwxrwx 1 root root 10 May 23 01:28 ata-ST8000VN0022-2EL112_ZA17FZXF-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 May 23 01:28 ata-ST8000VN0022-2EL112_ZA17FZXF-part9 -> ../../sdc9
lrwxrwxrwx 1 root root 9 May 23 01:16 ata-ST8000VN0022-2EL112_ZA17H5D3 -> ../../sdb
lrwxrwxrwx 1 root root 10 May 23 01:16 ata-ST8000VN0022-2EL112_ZA17H5D3-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 May 23 01:16 ata-ST8000VN0022-2EL112_ZA17H5D3-part9 -> ../../sdb9
lrwxrwxrwx 1 root root 9 May 22 23:21 ata-WDC_WD100EFAX-68LHPN0_2YG1R7PD -> ../../sdd
lrwxrwxrwx 1 root root 10 May 22 23:21 ata-WDC_WD100EFAX-68LHPN0_2YG1R7PD-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 10 May 22 23:21 ata-WDC_WD100EFAX-68LHPN0_2YG1R7PD-part9 -> ../../sdd9
lrwxrwxrwx 1 root root 9 May 22 23:19 scsi-0ATA_Samsung_SSD_850_S2RANX0H608885H -> ../../sda
lrwxrwxrwx 1 root root 10 May 22 23:19 scsi-0ATA_Samsung_SSD_850_S2RANX0H608885H-part1 -> ../../sda1
lrwxrwxrwx 1 root root 9 May 23 01:28 scsi-0ATA_ST8000VN0022-2EL_ZA17FZXF -> ../../sdc
lrwxrwxrwx 1 root root 10 May 23 01:28 scsi-0ATA_ST8000VN0022-2EL_ZA17FZXF-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 May 23 01:28 scsi-0ATA_ST8000VN0022-2EL_ZA17FZXF-part9 -> ../../sdc9
lrwxrwxrwx 1 root root 9 May 23 01:16 scsi-0ATA_ST8000VN0022-2EL_ZA17H5D3 -> ../../sdb
lrwxrwxrwx 1 root root 10 May 23 01:16 scsi-0ATA_ST8000VN0022-2EL_ZA17H5D3-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 May 23 01:16 scsi-0ATA_ST8000VN0022-2EL_ZA17H5D3-part9 -> ../../sdb9
lrwxrwxrwx 1 root root 9 May 22 23:21 scsi-0ATA_WDC_WD100EFAX-68_2YG1R7PD -> ../../sdd
lrwxrwxrwx 1 root root 10 May 22 23:21 scsi-0ATA_WDC_WD100EFAX-68_2YG1R7PD-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 10 May 22 23:21 scsi-0ATA_WDC_WD100EFAX-68_2YG1R7PD-part9 -> ../../sdd9
lrwxrwxrwx 1 root root 9 May 22 23:19 scsi-1ATA_Samsung_SSD_850_EVO_500GB_S2RANX0H608885H -> ../../sda
lrwxrwxrwx 1 root root 10 May 22 23:19 scsi-1ATA_Samsung_SSD_850_EVO_500GB_S2RANX0H608885H-part1 -> ../../sda1
lrwxrwxrwx 1 root root 9 May 23 01:28 scsi-1ATA_ST8000VN0022-2EL112_ZA17FZXF -> ../../sdc
lrwxrwxrwx 1 root root 10 May 23 01:28 scsi-1ATA_ST8000VN0022-2EL112_ZA17FZXF-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 May 23 01:28 scsi-1ATA_ST8000VN0022-2EL112_ZA17FZXF-part9 -> ../../sdc9
lrwxrwxrwx 1 root root 9 May 23 01:16 scsi-1ATA_ST8000VN0022-2EL112_ZA17H5D3 -> ../../sdb
lrwxrwxrwx 1 root root 10 May 23 01:16 scsi-1ATA_ST8000VN0022-2EL112_ZA17H5D3-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 May 23 01:16 scsi-1ATA_ST8000VN0022-2EL112_ZA17H5D3-part9 -> ../../sdb9
lrwxrwxrwx 1 root root 9 May 22 23:21 scsi-1ATA_WDC_WD100EFAX-68LHPN0_2YG1R7PD -> ../../sdd
lrwxrwxrwx 1 root root 10 May 22 23:21 scsi-1ATA_WDC_WD100EFAX-68LHPN0_2YG1R7PD-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 10 May 22 23:21 scsi-1ATA_WDC_WD100EFAX-68LHPN0_2YG1R7PD-part9 -> ../../sdd9
lrwxrwxrwx 1 root root 9 May 23 01:28 scsi-35000c500a2e631c6 -> ../../sdc
lrwxrwxrwx 1 root root 10 May 23 01:28 scsi-35000c500a2e631c6-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 May 23 01:28 scsi-35000c500a2e631c6-part9 -> ../../sdc9
lrwxrwxrwx 1 root root 9 May 23 01:16 scsi-35000c500a2edebe0 -> ../../sdb
lrwxrwxrwx 1 root root 10 May 23 01:16 scsi-35000c500a2edebe0-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 May 23 01:16 scsi-35000c500a2edebe0-part9 -> ../../sdb9
lrwxrwxrwx 1 root root 9 May 23 00:38 scsi-35000cca267eaa17a -> ../../sdg
lrwxrwxrwx 1 root root 10 May 23 00:38 scsi-35000cca267eaa17a-part1 -> ../../sdg1
lrwxrwxrwx 1 root root 10 May 23 00:38 scsi-35000cca267eaa17a-part9 -> ../../sdg9
lrwxrwxrwx 1 root root 9 May 23 01:20 scsi-35000cca26aeb9280 -> ../../sdl
lrwxrwxrwx 1 root root 10 May 23 01:20 scsi-35000cca26aeb9280-part1 -> ../../sdl1
lrwxrwxrwx 1 root root 10 May 23 01:20 scsi-35000cca26aeb9280-part9 -> ../../sdl9
lrwxrwxrwx 1 root root 9 May 23 01:20 scsi-35000cca26af27d8b -> ../../sdk
lrwxrwxrwx 1 root root 10 May 23 01:20 scsi-35000cca26af27d8b-part1 -> ../../sdk1
lrwxrwxrwx 1 root root 10 May 23 01:20 scsi-35000cca26af27d8b-part9 -> ../../sdk9
lrwxrwxrwx 1 root root 9 May 23 02:35 scsi-35000cca26af7e655 -> ../../sdi
lrwxrwxrwx 1 root root 10 May 23 02:35 scsi-35000cca26af7e655-part1 -> ../../sdi1
lrwxrwxrwx 1 root root 10 May 23 02:35 scsi-35000cca26af7e655-part9 -> ../../sdi9
lrwxrwxrwx 1 root root 9 May 23 00:35 scsi-35000cca273c099dd -> ../../sdf
lrwxrwxrwx 1 root root 10 May 23 00:35 scsi-35000cca273c099dd-part1 -> ../../sdf1
lrwxrwxrwx 1 root root 10 May 23 00:35 scsi-35000cca273c099dd-part9 -> ../../sdf9
lrwxrwxrwx 1 root root 9 May 22 23:21 scsi-35000cca273c0c7e3 -> ../../sdd
lrwxrwxrwx 1 root root 10 May 22 23:21 scsi-35000cca273c0c7e3-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 10 May 22 23:21 scsi-35000cca273c0c7e3-part9 -> ../../sdd9
lrwxrwxrwx 1 root root 9 May 23 03:01 scsi-35000cca273c21a05 -> ../../sdj
lrwxrwxrwx 1 root root 10 May 23 03:01 scsi-35000cca273c21a05-part1 -> ../../sdj1
lrwxrwxrwx 1 root root 10 May 23 03:01 scsi-35000cca273c21a05-part9 -> ../../sdj9
lrwxrwxrwx 1 root root 9 May 23 00:35 scsi-35000cca273ee8907 -> ../../sde
lrwxrwxrwx 1 root root 10 May 23 00:35 scsi-35000cca273ee8907-part1 -> ../../sde1
lrwxrwxrwx 1 root root 10 May 23 00:35 scsi-35000cca273ee8907-part9 -> ../../sde9
lrwxrwxrwx 1 root root 9 May 23 00:04 scsi-35000cca273eeaed7 -> ../../sdh
lrwxrwxrwx 1 root root 10 May 23 00:04 scsi-35000cca273eeaed7-part1 -> ../../sdh1
lrwxrwxrwx 1 root root 10 May 23 00:04 scsi-35000cca273eeaed7-part9 -> ../../sdh9
lrwxrwxrwx 1 root root 9 May 22 23:19 scsi-35002538d40f8ba4c -> ../../sda
lrwxrwxrwx 1 root root 10 May 22 23:19 scsi-35002538d40f8ba4c-part1 -> ../../sda1
lrwxrwxrwx 1 root root 9 May 22 23:19 scsi-SATA_Samsung_SSD_850_S2RANX0H608885H -> ../../sda
lrwxrwxrwx 1 root root 10 May 22 23:19 scsi-SATA_Samsung_SSD_850_S2RANX0H608885H-part1 -> ../../sda1
lrwxrwxrwx 1 root root 9 May 23 01:28 scsi-SATA_ST8000VN0022-2EL_ZA17FZXF -> ../../sdc
lrwxrwxrwx 1 root root 10 May 23 01:28 scsi-SATA_ST8000VN0022-2EL_ZA17FZXF-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 May 23 01:28 scsi-SATA_ST8000VN0022-2EL_ZA17FZXF-part9 -> ../../sdc9
lrwxrwxrwx 1 root root 9 May 23 01:16 scsi-SATA_ST8000VN0022-2EL_ZA17H5D3 -> ../../sdb
lrwxrwxrwx 1 root root 10 May 23 01:16 scsi-SATA_ST8000VN0022-2EL_ZA17H5D3-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 May 23 01:16 scsi-SATA_ST8000VN0022-2EL_ZA17H5D3-part9 -> ../../sdb9
lrwxrwxrwx 1 root root 9 May 23 01:20 scsi-SATA_WDC_WD100EFAX-68_2TK2VELD -> ../../sdl
lrwxrwxrwx 1 root root 10 May 23 01:20 scsi-SATA_WDC_WD100EFAX-68_2TK2VELD-part1 -> ../../sdl1
lrwxrwxrwx 1 root root 10 May 23 01:20 scsi-SATA_WDC_WD100EFAX-68_2TK2VELD-part9 -> ../../sdl9
lrwxrwxrwx 1 root root 9 May 23 01:20 scsi-SATA_WDC_WD100EFAX-68_2TKL26ZD -> ../../sdk
lrwxrwxrwx 1 root root 10 May 23 01:20 scsi-SATA_WDC_WD100EFAX-68_2TKL26ZD-part1 -> ../../sdk1
lrwxrwxrwx 1 root root 10 May 23 01:20 scsi-SATA_WDC_WD100EFAX-68_2TKL26ZD-part9 -> ../../sdk9
lrwxrwxrwx 1 root root 9 May 23 02:35 scsi-SATA_WDC_WD100EFAX-68_2TKYZ3ND -> ../../sdi
lrwxrwxrwx 1 root root 10 May 23 02:35 scsi-SATA_WDC_WD100EFAX-68_2TKYZ3ND-part1 -> ../../sdi1
lrwxrwxrwx 1 root root 10 May 23 02:35 scsi-SATA_WDC_WD100EFAX-68_2TKYZ3ND-part9 -> ../../sdi9
lrwxrwxrwx 1 root root 9 May 23 00:35 scsi-SATA_WDC_WD100EFAX-68_2YG19ZMD -> ../../sdf
lrwxrwxrwx 1 root root 10 May 23 00:35 scsi-SATA_WDC_WD100EFAX-68_2YG19ZMD-part1 -> ../../sdf1
lrwxrwxrwx 1 root root 10 May 23 00:35 scsi-SATA_WDC_WD100EFAX-68_2YG19ZMD-part9 -> ../../sdf9
lrwxrwxrwx 1 root root 9 May 22 23:21 scsi-SATA_WDC_WD100EFAX-68_2YG1R7PD -> ../../sdd
lrwxrwxrwx 1 root root 10 May 22 23:21 scsi-SATA_WDC_WD100EFAX-68_2YG1R7PD-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 10 May 22 23:21 scsi-SATA_WDC_WD100EFAX-68_2YG1R7PD-part9 -> ../../sdd9
lrwxrwxrwx 1 root root 9 May 23 03:01 scsi-SATA_WDC_WD100EFAX-68_2YG4MA0D -> ../../sdj
lrwxrwxrwx 1 root root 10 May 23 03:01 scsi-SATA_WDC_WD100EFAX-68_2YG4MA0D-part1 -> ../../sdj1
lrwxrwxrwx 1 root root 10 May 23 03:01 scsi-SATA_WDC_WD100EFAX-68_2YG4MA0D-part9 -> ../../sdj9
lrwxrwxrwx 1 root root 9 May 23 00:35 scsi-SATA_WDC_WD100EFAX-68_2YK9BHKD -> ../../sde
lrwxrwxrwx 1 root root 10 May 23 00:35 scsi-SATA_WDC_WD100EFAX-68_2YK9BHKD-part1 -> ../../sde1
lrwxrwxrwx 1 root root 10 May 23 00:35 scsi-SATA_WDC_WD100EFAX-68_2YK9BHKD-part9 -> ../../sde9
lrwxrwxrwx 1 root root 9 May 23 00:04 scsi-SATA_WDC_WD100EFAX-68_2YK9PKUD -> ../../sdh
lrwxrwxrwx 1 root root 10 May 23 00:04 scsi-SATA_WDC_WD100EFAX-68_2YK9PKUD-part1 -> ../../sdh1
lrwxrwxrwx 1 root root 10 May 23 00:04 scsi-SATA_WDC_WD100EFAX-68_2YK9PKUD-part9 -> ../../sdh9
lrwxrwxrwx 1 root root 9 May 23 00:38 scsi-SATA_WDC_WD100EFAX-68_JEK0T76Z -> ../../sdg
lrwxrwxrwx 1 root root 10 May 23 00:38 scsi-SATA_WDC_WD100EFAX-68_JEK0T76Z-part1 -> ../../sdg1
lrwxrwxrwx 1 root root 10 May 23 00:38 scsi-SATA_WDC_WD100EFAX-68_JEK0T76Z-part9 -> ../../sdg9
lrwxrwxrwx 1 root root 9 May 23 01:28 wwn-0x5000c500a2e631c6 -> ../../sdc
lrwxrwxrwx 1 root root 10 May 23 01:28 wwn-0x5000c500a2e631c6-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 May 23 01:28 wwn-0x5000c500a2e631c6-part9 -> ../../sdc9
lrwxrwxrwx 1 root root 9 May 23 01:16 wwn-0x5000c500a2edebe0 -> ../../sdb
lrwxrwxrwx 1 root root 10 May 23 01:16 wwn-0x5000c500a2edebe0-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 May 23 01:16 wwn-0x5000c500a2edebe0-part9 -> ../../sdb9
lrwxrwxrwx 1 root root 9 May 23 00:38 wwn-0x5000cca267eaa17a -> ../../sdg
lrwxrwxrwx 1 root root 10 May 23 00:38 wwn-0x5000cca267eaa17a-part1 -> ../../sdg1
lrwxrwxrwx 1 root root 10 May 23 00:38 wwn-0x5000cca267eaa17a-part9 -> ../../sdg9
lrwxrwxrwx 1 root root 9 May 23 01:20 wwn-0x5000cca26aeb9280 -> ../../sdl
lrwxrwxrwx 1 root root 10 May 23 01:20 wwn-0x5000cca26aeb9280-part1 -> ../../sdl1
lrwxrwxrwx 1 root root 10 May 23 01:20 wwn-0x5000cca26aeb9280-part9 -> ../../sdl9
lrwxrwxrwx 1 root root 9 May 23 01:20 wwn-0x5000cca26af27d8b -> ../../sdk
lrwxrwxrwx 1 root root 10 May 23 01:20 wwn-0x5000cca26af27d8b-part1 -> ../../sdk1
lrwxrwxrwx 1 root root 10 May 23 01:20 wwn-0x5000cca26af27d8b-part9 -> ../../sdk9
lrwxrwxrwx 1 root root 9 May 23 02:35 wwn-0x5000cca26af7e655 -> ../../sdi
lrwxrwxrwx 1 root root 10 May 23 02:35 wwn-0x5000cca26af7e655-part1 -> ../../sdi1
lrwxrwxrwx 1 root root 10 May 23 02:35 wwn-0x5000cca26af7e655-part9 -> ../../sdi9
lrwxrwxrwx 1 root root 9 May 23 00:35 wwn-0x5000cca273c099dd -> ../../sdf
lrwxrwxrwx 1 root root 10 May 23 00:35 wwn-0x5000cca273c099dd-part1 -> ../../sdf1
lrwxrwxrwx 1 root root 10 May 23 00:35 wwn-0x5000cca273c099dd-part9 -> ../../sdf9
lrwxrwxrwx 1 root root 9 May 22 23:21 wwn-0x5000cca273c0c7e3 -> ../../sdd
lrwxrwxrwx 1 root root 10 May 22 23:21 wwn-0x5000cca273c0c7e3-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 10 May 22 23:21 wwn-0x5000cca273c0c7e3-part9 -> ../../sdd9
lrwxrwxrwx 1 root root 9 May 23 03:01 wwn-0x5000cca273c21a05 -> ../../sdj
lrwxrwxrwx 1 root root 10 May 23 03:01 wwn-0x5000cca273c21a05-part1 -> ../../sdj1
lrwxrwxrwx 1 root root 10 May 23 03:01 wwn-0x5000cca273c21a05-part9 -> ../../sdj9
lrwxrwxrwx 1 root root 9 May 23 00:35 wwn-0x5000cca273ee8907 -> ../../sde
lrwxrwxrwx 1 root root 10 May 23 00:35 wwn-0x5000cca273ee8907-part1 -> ../../sde1
lrwxrwxrwx 1 root root 10 May 23 00:35 wwn-0x5000cca273ee8907-part9 -> ../../sde9
lrwxrwxrwx 1 root root 9 May 23 00:04 wwn-0x5000cca273eeaed7 -> ../../sdh
lrwxrwxrwx 1 root root 10 May 23 00:04 wwn-0x5000cca273eeaed7-part1 -> ../../sdh1
lrwxrwxrwx 1 root root 10 May 23 00:04 wwn-0x5000cca273eeaed7-part9 -> ../../sdh9
lrwxrwxrwx 1 root root 9 May 22 23:19 wwn-0x5002538d40f8ba4c -> ../../sda
lrwxrwxrwx 1 root root 10 May 22 23:19 wwn-0x5002538d40f8ba4c-part1 -> ../../sda1
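A listing this long is easier to compare when the aliases are grouped by the device they point to. The following is a rough sketch; the function names list_disk_aliases and disks_without_ata_alias are invented here, and partition links (-partN) are skipped.

```shell
# Print "device alias" pairs for every whole-disk symlink in a directory.
list_disk_aliases() {
    dir=${1:-/dev/disk/by-id}
    for link in "$dir"/*; do
        [ -L "$link" ] || continue
        name=${link##*/}
        case $name in *-part[0-9]*) continue ;; esac   # skip partition links
        target=$(readlink "$link")
        printf '%s %s\n' "${target##*/}" "$name"
    done | sort
}

# Print the devices that have aliases but no ata-* alias among them.
disks_without_ata_alias() {
    list_disk_aliases "${1:-}" | awk '
        { seen[$1] = 1 }
        $2 ~ /^ata-/ { has_ata[$1] = 1 }
        END { for (d in seen) if (!(d in has_ata)) print d }' | sort
}
```

Run against the listing above, disks_without_ata_alias should report sde through sdl, the eight WD Reds in masterpool.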
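So this raises several questions:
1) Why did my pool disappear? Is it because the symlinks in /dev/disk/by-id/ were gone and zfs could not find most of the disks?
2) Are the checksum errors something to worry about? The disks look perfectly healthy. I only browsed a few directories and files while the pool was imported with sdX references. If zfs imported the disks in the wrong order, could that have caused checksums to be rewritten incorrectly?
3) How do I get back the missing /dev/disk/by-id/ata-* symlinks? Did something change in Ubuntu 20.04 that made them disappear?
4) I thought referencing my disks via /dev/disk/by-id/ was a good idea, because those names would be stable. Is that not the best approach?
5) I don't like the wwn-* names because they are not descriptive to me. I would prefer names that reflect the disk serial numbers, so that I can easily identify the disks if one needs replacing. I have already set up aliases (to the wwn-* names) in /dev/disk/by-vdev/, as suggested at http://kbdone.com/zfs-basics/#Consistent_device_IDs_via_vdev_idconf_file: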
$ cat /etc/zfs/vdev_id.conf
alias ST8000VN0022-2EL_ZA17H5D3 /dev/disk/by-id/wwn-0x5000c500a2edebe0
alias ST8000VN0022-2EL_ZA17FZXF /dev/disk/by-id/wwn-0x5000c500a2e631c6
alias WD100EFAX-68_2YG1R7PD /dev/disk/by-id/wwn-0x5000cca273c0c7e3
alias WD100EFAX-68_2YK9BHKD /dev/disk/by-id/wwn-0x5000cca273ee8907
...
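Alias lines like these do not have to be typed by hand. The sketch below pairs each wwn-* link with the scsi-SATA_* link that points to the same device. The function name make_vdev_aliases is hypothetical, and stripping the WDC_ prefix is just an assumption that mirrors the labels used above.

```shell
# Print vdev_id.conf "alias" lines: one per whole disk that has both a
# wwn-* link and a scsi-SATA_* link resolving to the same target.
make_vdev_aliases() {
    dir=${1:-/dev/disk/by-id}
    for wwn in "$dir"/wwn-*; do
        [ -L "$wwn" ] || continue
        case ${wwn##*/} in *-part[0-9]*) continue ;; esac
        dev=$(readlink "$wwn")
        for sata in "$dir"/scsi-SATA_*; do
            [ -L "$sata" ] || continue
            case ${sata##*/} in *-part[0-9]*) continue ;; esac
            [ "$(readlink "$sata")" = "$dev" ] || continue
            label=${sata##*/scsi-SATA_}   # model and serial, e.g. WDC_WD100EFAX-68_2YG1R7PD
            label=${label#WDC_}           # drop the vendor prefix, as in the file above
            printf 'alias %s %s\n' "$label" "$wwn"
        done
    done
}

# Example: make_vdev_aliases > vdev_id.conf.new   (review before copying to /etc/zfs)
```

As far as I know, the links under /dev/disk/by-vdev/ only appear after the udev rules have been re-run (for example with udevadm trigger), and the pool then has to be imported with -d /dev/disk/by-vdev to pick the aliases up.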
Any ideas?
Thanks!
Edit: zpool status output after the scrub finished:
root@cloud:~# zpool status
pool: downloadpool
state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
the pool may no longer be accessible by software that does not support
the features. See zpool-features(5) for details.
scan: scrub repaired 0B in 0 days 11:33:18 with 0 errors on Sun May 10 11:57:19 2020
config:
NAME STATE READ WRITE CKSUM
downloadpool ONLINE 0 0 0
ata-WDC_WD100EFAX-68LHPN0_2YG1R7PD ONLINE 0 0 0
errors: No known data errors
pool: masterpool
state: DEGRADED
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: http://zfsonlinux.org/msg/ZFS-8000-9P
scan: scrub repaired 112K in 0 days 15:06:09 with 0 errors on Sat May 23 12:53:43 2020
config:
NAME STATE READ WRITE CKSUM
masterpool DEGRADED 0 0 0
mirror-0 DEGRADED 0 0 0
wwn-0x5000cca26af27d8b DEGRADED 0 0 15 too many errors
wwn-0x5000cca273ee8907 ONLINE 0 0 0
mirror-1 DEGRADED 0 0 0
wwn-0x5000cca26aeb9280 DEGRADED 0 0 18 too many errors
wwn-0x5000cca273eeaed7 ONLINE 0 0 0
mirror-2 ONLINE 0 0 0
wwn-0x5000cca273c21a05 ONLINE 0 0 0
wwn-0x5000cca267eaa17a ONLINE 0 0 0
mirror-3 ONLINE 0 0 0
wwn-0x5000cca26af7e655 ONLINE 0 0 0
wwn-0x5000cca273c099dd ONLINE 0 0 0
mirror-4 ONLINE 0 0 0
ata-ST8000VN0022-2EL112_ZA17FZXF ONLINE 0 0 0
ata-ST8000VN0022-2EL112_ZA17H5D3 ONLINE 0 0 0
errors: No known data errors
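To watch only the devices that actually report checksum errors, the status output can be filtered on its CKSUM column. A minimal sketch, assuming the five-column NAME/STATE/READ/WRITE/CKSUM layout shown above; the function name cksum_errors is made up.

```shell
# Read `zpool status` output on stdin and print "device count" for every
# row whose CKSUM column (fifth field) is non-zero.
cksum_errors() {
    awk 'NF >= 5 &&
         $2 ~ /^(ONLINE|DEGRADED|FAULTED|UNAVAIL|OFFLINE|REMOVED)$/ &&
         $5 + 0 > 0 { print $1, $5 }'
}

# Example:
#   zpool status masterpool | cksum_errors
```

For the output above this would list wwn-0x5000cca26af27d8b with 15 and wwn-0x5000cca26aeb9280 with 18. The counters stay at those values until they are reset with zpool clear, as the action text in the status output says.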
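Answer 1
I had exactly the same problem, and your post pointed me in the right direction. Here is what I found.
I have 6 drives: 2 in zfs pool "A", attached to the motherboard's SATA controller, and 4 in zfs pool "B", attached to my LSI SAS 9211 controller. The pools were set up to look for their devices in /dev/disk/by-id.
After the upgrade from Ubuntu 18.04 to Ubuntu 20.04, the device IDs of all disks attached to the SAS controller changed from ata-* to scsi-SATA*. After rebooting the server, zfs pool B was missing because zfs could not find those device IDs during import. The device IDs of the drives attached to the motherboard's SATA controller stayed the same, so the zfs pool using those drives was imported normally and was not lost after the upgrade.
This is how I fixed the missing pool "B":
First I listed all pools available for import: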
sudo zpool import
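This listed my missing pool "B" with all the correct drives in it, but under the device names from /dev. So I imported the pool using the device IDs listed in /dev/disk/by-id. I got a warning that the pool might be active, so I had to force the import with -f, like this: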
sudo zpool import -f -d /dev/disk/by-id B
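Everything was back to normal and pool B was available again. I did not export the pool, and I never imported it without first telling zpool to use the device IDs. The device IDs now in use are different: wwn-*
I ran a scrub on the pool and it found no errors.
To answer your questions:
1) I think the upgrade from Ubuntu 18.04 to 20.04 caused the links in /dev/disk/by-id to change.
2) I did not import the pool with /dev references first; I imported it directly by ID, with the -f option. That is the difference between what you did and what I did. But I cannot imagine that this would be a problem, unless the wrong drives were used.
3) I did not get the old by-id links back. But by importing the pool with the instruction to use the disk IDs, it now uses the new disk IDs, and that is good enough for me. I do not need the old ones back.
4) I still think referencing disks via /dev/disk/by-id/ is a good idea. These names are stable across reboots and also when disks are physically moved within the server (I tested this). I am a bit disappointed that a release upgrade breaks the disk ID naming, but I am glad that in my case it could be fixed by importing the pool again.
5) Same for me. Thanks for the tip about using aliases! Maybe I will use that.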