Pool disappeared after upgrading to Ubuntu 20.04

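I upgraded my server (SuperMicro X11-SSM-F, LSI SAS 9211-8i) from Ubuntu 18.04 to 20.04. The server has 2 zpools: one consisting of a single WD Red 10TB (downloadpool), and another consisting of 8 WD Red 10TB and 2 Seagate IronWolf 8TB drives arranged as 5x2 mirrors (masterpool). The pools were created using /dev/disk/by-id references so that they stay stable across reboots. The pools are scrubbed regularly; the last scrub was a few weeks ago and showed no errors.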

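When I rebooted after updating to Ubuntu 20.04, the second pool (masterpool) was gone. Running zpool import brought it back, but with sdX references for most of the disks (the WD Reds, but not the Seagates). The pool with the single WD Red, on the other hand, was fine and still referenced its disk by-id. The zpool status output for masterpool looked like this (reconstructed from memory):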

    NAME                                  STATE     READ WRITE CKSUM
    masterpool                            ONLINE       0     0     0
      mirror-0                            ONLINE       0     0     0
        sdb                               ONLINE       0     0     0
        sdk                               ONLINE       0     0     0
      mirror-1                            ONLINE       0     0     0
        sdi                               ONLINE       0     0     0
        sdf                               ONLINE       0     0     0
      mirror-2                            ONLINE       0     0     0
        sdd                               ONLINE       0     0     0
        sde                               ONLINE       0     0     0
      mirror-3                            ONLINE       0     0     0
        sdh                               ONLINE       0     0     0
        sdc                               ONLINE       0     0     0
      mirror-4                            ONLINE       0     0     0
        ata-ST8000VN0022-2EL112_ZA17FZXF  ONLINE       0     0     0
        ata-ST8000VN0022-2EL112_ZA17H5D3  ONLINE       0     0     0

This is not ideal, since these identifiers are not stable. After looking around online a bit, I exported the pool and then ran zpool import -d /dev/disk/by-id masterpool.
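
For reference, the export and re-import step can be sketched as a small dry-run helper. The function name reimport_by_id and the RUN variable are made up for illustration; only zpool export and zpool import -d are taken from the step described above. By default it just prints the commands, so nothing touches the pool unless RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical helper: re-import a pool so its vdevs are looked up
# under /dev/disk/by-id instead of /dev/sdX.
# Prints the commands by default; set RUN=1 to actually execute them.
reimport_by_id() {
    pool="$1"
    dir="${2:-/dev/disk/by-id}"
    for cmd in "zpool export $pool" "zpool import -d $dir $pool"; do
        if [ "${RUN:-0}" = "1" ]; then
            # Unquoted on purpose: split the command string into words.
            $cmd || return 1
        else
            echo "$cmd"
        fi
    done
}

# Dry run for the pool discussed here.
reimport_by_id masterpool
```

Running it with RUN=1 requires root and a pool that is not in use at that moment.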

But now zpool tells me there are checksum errors:

    NAME                                  STATE     READ WRITE CKSUM
    masterpool                            ONLINE       0     0     0
      mirror-0                            ONLINE       0     0     0
        wwn-0x5000cca26af27d8b            ONLINE       0     0     2
        wwn-0x5000cca273ee8907            ONLINE       0     0     0
      mirror-1                            ONLINE       0     0     0
        wwn-0x5000cca26aeb9280            ONLINE       0     0     8
        wwn-0x5000cca273eeaed7            ONLINE       0     0     0
      mirror-2                            ONLINE       0     0     0
        wwn-0x5000cca273c21a05            ONLINE       0     0     0
        wwn-0x5000cca267eaa17a            ONLINE       0     0     0
      mirror-3                            ONLINE       0     0     0
        wwn-0x5000cca26af7e655            ONLINE       0     0     0
        wwn-0x5000cca273c099dd            ONLINE       0     0     0
      mirror-4                            ONLINE       0     0     0
        ata-ST8000VN0022-2EL112_ZA17FZXF  ONLINE       0     0     0
        ata-ST8000VN0022-2EL112_ZA17H5D3  ONLINE       0     0     0

So I am running a scrub, and zfs is finding more checksum errors:

  pool: masterpool
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: scrub in progress since Fri May 22 21:47:34 2020
        27.1T scanned at 600M/s, 27.0T issued at 597M/s, 31.1T total
        112K repaired, 86.73% done, 0 days 02:00:45 to go
config:

        NAME                                  STATE     READ WRITE CKSUM
        masterpool                            DEGRADED     0     0     0
          mirror-0                            DEGRADED     0     0     0
            wwn-0x5000cca26af27d8b            DEGRADED     0     0    15  too many errors  (repairing)
            wwn-0x5000cca273ee8907            ONLINE       0     0     0
          mirror-1                            DEGRADED     0     0     0
            wwn-0x5000cca26aeb9280            DEGRADED     0     0    18  too many errors  (repairing)
            wwn-0x5000cca273eeaed7            ONLINE       0     0     0
          mirror-2                            ONLINE       0     0     0
            wwn-0x5000cca273c21a05            ONLINE       0     0     0
            wwn-0x5000cca267eaa17a            ONLINE       0     0     0
          mirror-3                            ONLINE       0     0     0
            wwn-0x5000cca26af7e655            ONLINE       0     0     0
            wwn-0x5000cca273c099dd            ONLINE       0     0     0
          mirror-4                            ONLINE       0     0     0
            ata-ST8000VN0022-2EL112_ZA17FZXF  ONLINE       0     0     0
            ata-ST8000VN0022-2EL112_ZA17H5D3  ONLINE       0     0     0

Strangely, smartctl shows nothing unusual in the SMART data (the output is similar for both disks, so only one is shown):

$ sudo smartctl /dev/disk/by-id/wwn-0x5000cca26aeb9280 -a
...
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0004   129   129   054    Old_age   Offline      -       112
  3 Spin_Up_Time            0x0007   153   153   024    Pre-fail  Always       -       431 (Average 430)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       31
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000a   100   100   067    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0004   128   128   020    Old_age   Offline      -       18
  9 Power_On_Hours          0x0012   098   098   000    Old_age   Always       -       15474
 10 Spin_Retry_Count        0x0012   100   100   060    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       31
 22 Helium_Level            0x0023   100   100   025    Pre-fail  Always       -       100
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       664
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       664
194 Temperature_Celsius     0x0002   158   158   000    Old_age   Always       -       41 (Min/Max 16/41)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%        19         -
# 2  Short offline       Completed without error       00%         0         -

...

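In addition, I noticed that many of the aliases in /dev/disk/by-id are gone (all ata-* aliases for the WD Reds are gone, except for the one belonging to the single disk in downloadpool):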

# ls /dev/disk/by-id/ -l
total 0
lrwxrwxrwx 1 root root  9 May 22 23:19 ata-Samsung_SSD_850_EVO_500GB_S2RANX0H608885H -> ../../sda
lrwxrwxrwx 1 root root 10 May 22 23:19 ata-Samsung_SSD_850_EVO_500GB_S2RANX0H608885H-part1 -> ../../sda1
lrwxrwxrwx 1 root root  9 May 23 01:28 ata-ST8000VN0022-2EL112_ZA17FZXF -> ../../sdc
lrwxrwxrwx 1 root root 10 May 23 01:28 ata-ST8000VN0022-2EL112_ZA17FZXF-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 May 23 01:28 ata-ST8000VN0022-2EL112_ZA17FZXF-part9 -> ../../sdc9
lrwxrwxrwx 1 root root  9 May 23 01:16 ata-ST8000VN0022-2EL112_ZA17H5D3 -> ../../sdb
lrwxrwxrwx 1 root root 10 May 23 01:16 ata-ST8000VN0022-2EL112_ZA17H5D3-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 May 23 01:16 ata-ST8000VN0022-2EL112_ZA17H5D3-part9 -> ../../sdb9
lrwxrwxrwx 1 root root  9 May 22 23:21 ata-WDC_WD100EFAX-68LHPN0_2YG1R7PD -> ../../sdd
lrwxrwxrwx 1 root root 10 May 22 23:21 ata-WDC_WD100EFAX-68LHPN0_2YG1R7PD-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 10 May 22 23:21 ata-WDC_WD100EFAX-68LHPN0_2YG1R7PD-part9 -> ../../sdd9
lrwxrwxrwx 1 root root  9 May 22 23:19 scsi-0ATA_Samsung_SSD_850_S2RANX0H608885H -> ../../sda
lrwxrwxrwx 1 root root 10 May 22 23:19 scsi-0ATA_Samsung_SSD_850_S2RANX0H608885H-part1 -> ../../sda1
lrwxrwxrwx 1 root root  9 May 23 01:28 scsi-0ATA_ST8000VN0022-2EL_ZA17FZXF -> ../../sdc
lrwxrwxrwx 1 root root 10 May 23 01:28 scsi-0ATA_ST8000VN0022-2EL_ZA17FZXF-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 May 23 01:28 scsi-0ATA_ST8000VN0022-2EL_ZA17FZXF-part9 -> ../../sdc9
lrwxrwxrwx 1 root root  9 May 23 01:16 scsi-0ATA_ST8000VN0022-2EL_ZA17H5D3 -> ../../sdb
lrwxrwxrwx 1 root root 10 May 23 01:16 scsi-0ATA_ST8000VN0022-2EL_ZA17H5D3-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 May 23 01:16 scsi-0ATA_ST8000VN0022-2EL_ZA17H5D3-part9 -> ../../sdb9
lrwxrwxrwx 1 root root  9 May 22 23:21 scsi-0ATA_WDC_WD100EFAX-68_2YG1R7PD -> ../../sdd
lrwxrwxrwx 1 root root 10 May 22 23:21 scsi-0ATA_WDC_WD100EFAX-68_2YG1R7PD-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 10 May 22 23:21 scsi-0ATA_WDC_WD100EFAX-68_2YG1R7PD-part9 -> ../../sdd9
lrwxrwxrwx 1 root root  9 May 22 23:19 scsi-1ATA_Samsung_SSD_850_EVO_500GB_S2RANX0H608885H -> ../../sda
lrwxrwxrwx 1 root root 10 May 22 23:19 scsi-1ATA_Samsung_SSD_850_EVO_500GB_S2RANX0H608885H-part1 -> ../../sda1
lrwxrwxrwx 1 root root  9 May 23 01:28 scsi-1ATA_ST8000VN0022-2EL112_ZA17FZXF -> ../../sdc
lrwxrwxrwx 1 root root 10 May 23 01:28 scsi-1ATA_ST8000VN0022-2EL112_ZA17FZXF-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 May 23 01:28 scsi-1ATA_ST8000VN0022-2EL112_ZA17FZXF-part9 -> ../../sdc9
lrwxrwxrwx 1 root root  9 May 23 01:16 scsi-1ATA_ST8000VN0022-2EL112_ZA17H5D3 -> ../../sdb
lrwxrwxrwx 1 root root 10 May 23 01:16 scsi-1ATA_ST8000VN0022-2EL112_ZA17H5D3-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 May 23 01:16 scsi-1ATA_ST8000VN0022-2EL112_ZA17H5D3-part9 -> ../../sdb9
lrwxrwxrwx 1 root root  9 May 22 23:21 scsi-1ATA_WDC_WD100EFAX-68LHPN0_2YG1R7PD -> ../../sdd
lrwxrwxrwx 1 root root 10 May 22 23:21 scsi-1ATA_WDC_WD100EFAX-68LHPN0_2YG1R7PD-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 10 May 22 23:21 scsi-1ATA_WDC_WD100EFAX-68LHPN0_2YG1R7PD-part9 -> ../../sdd9
lrwxrwxrwx 1 root root  9 May 23 01:28 scsi-35000c500a2e631c6 -> ../../sdc
lrwxrwxrwx 1 root root 10 May 23 01:28 scsi-35000c500a2e631c6-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 May 23 01:28 scsi-35000c500a2e631c6-part9 -> ../../sdc9
lrwxrwxrwx 1 root root  9 May 23 01:16 scsi-35000c500a2edebe0 -> ../../sdb
lrwxrwxrwx 1 root root 10 May 23 01:16 scsi-35000c500a2edebe0-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 May 23 01:16 scsi-35000c500a2edebe0-part9 -> ../../sdb9
lrwxrwxrwx 1 root root  9 May 23 00:38 scsi-35000cca267eaa17a -> ../../sdg
lrwxrwxrwx 1 root root 10 May 23 00:38 scsi-35000cca267eaa17a-part1 -> ../../sdg1
lrwxrwxrwx 1 root root 10 May 23 00:38 scsi-35000cca267eaa17a-part9 -> ../../sdg9
lrwxrwxrwx 1 root root  9 May 23 01:20 scsi-35000cca26aeb9280 -> ../../sdl
lrwxrwxrwx 1 root root 10 May 23 01:20 scsi-35000cca26aeb9280-part1 -> ../../sdl1
lrwxrwxrwx 1 root root 10 May 23 01:20 scsi-35000cca26aeb9280-part9 -> ../../sdl9
lrwxrwxrwx 1 root root  9 May 23 01:20 scsi-35000cca26af27d8b -> ../../sdk
lrwxrwxrwx 1 root root 10 May 23 01:20 scsi-35000cca26af27d8b-part1 -> ../../sdk1
lrwxrwxrwx 1 root root 10 May 23 01:20 scsi-35000cca26af27d8b-part9 -> ../../sdk9
lrwxrwxrwx 1 root root  9 May 23 02:35 scsi-35000cca26af7e655 -> ../../sdi
lrwxrwxrwx 1 root root 10 May 23 02:35 scsi-35000cca26af7e655-part1 -> ../../sdi1
lrwxrwxrwx 1 root root 10 May 23 02:35 scsi-35000cca26af7e655-part9 -> ../../sdi9
lrwxrwxrwx 1 root root  9 May 23 00:35 scsi-35000cca273c099dd -> ../../sdf
lrwxrwxrwx 1 root root 10 May 23 00:35 scsi-35000cca273c099dd-part1 -> ../../sdf1
lrwxrwxrwx 1 root root 10 May 23 00:35 scsi-35000cca273c099dd-part9 -> ../../sdf9
lrwxrwxrwx 1 root root  9 May 22 23:21 scsi-35000cca273c0c7e3 -> ../../sdd
lrwxrwxrwx 1 root root 10 May 22 23:21 scsi-35000cca273c0c7e3-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 10 May 22 23:21 scsi-35000cca273c0c7e3-part9 -> ../../sdd9
lrwxrwxrwx 1 root root  9 May 23 03:01 scsi-35000cca273c21a05 -> ../../sdj
lrwxrwxrwx 1 root root 10 May 23 03:01 scsi-35000cca273c21a05-part1 -> ../../sdj1
lrwxrwxrwx 1 root root 10 May 23 03:01 scsi-35000cca273c21a05-part9 -> ../../sdj9
lrwxrwxrwx 1 root root  9 May 23 00:35 scsi-35000cca273ee8907 -> ../../sde
lrwxrwxrwx 1 root root 10 May 23 00:35 scsi-35000cca273ee8907-part1 -> ../../sde1
lrwxrwxrwx 1 root root 10 May 23 00:35 scsi-35000cca273ee8907-part9 -> ../../sde9
lrwxrwxrwx 1 root root  9 May 23 00:04 scsi-35000cca273eeaed7 -> ../../sdh
lrwxrwxrwx 1 root root 10 May 23 00:04 scsi-35000cca273eeaed7-part1 -> ../../sdh1
lrwxrwxrwx 1 root root 10 May 23 00:04 scsi-35000cca273eeaed7-part9 -> ../../sdh9
lrwxrwxrwx 1 root root  9 May 22 23:19 scsi-35002538d40f8ba4c -> ../../sda
lrwxrwxrwx 1 root root 10 May 22 23:19 scsi-35002538d40f8ba4c-part1 -> ../../sda1
lrwxrwxrwx 1 root root  9 May 22 23:19 scsi-SATA_Samsung_SSD_850_S2RANX0H608885H -> ../../sda
lrwxrwxrwx 1 root root 10 May 22 23:19 scsi-SATA_Samsung_SSD_850_S2RANX0H608885H-part1 -> ../../sda1
lrwxrwxrwx 1 root root  9 May 23 01:28 scsi-SATA_ST8000VN0022-2EL_ZA17FZXF -> ../../sdc
lrwxrwxrwx 1 root root 10 May 23 01:28 scsi-SATA_ST8000VN0022-2EL_ZA17FZXF-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 May 23 01:28 scsi-SATA_ST8000VN0022-2EL_ZA17FZXF-part9 -> ../../sdc9
lrwxrwxrwx 1 root root  9 May 23 01:16 scsi-SATA_ST8000VN0022-2EL_ZA17H5D3 -> ../../sdb
lrwxrwxrwx 1 root root 10 May 23 01:16 scsi-SATA_ST8000VN0022-2EL_ZA17H5D3-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 May 23 01:16 scsi-SATA_ST8000VN0022-2EL_ZA17H5D3-part9 -> ../../sdb9
lrwxrwxrwx 1 root root  9 May 23 01:20 scsi-SATA_WDC_WD100EFAX-68_2TK2VELD -> ../../sdl
lrwxrwxrwx 1 root root 10 May 23 01:20 scsi-SATA_WDC_WD100EFAX-68_2TK2VELD-part1 -> ../../sdl1
lrwxrwxrwx 1 root root 10 May 23 01:20 scsi-SATA_WDC_WD100EFAX-68_2TK2VELD-part9 -> ../../sdl9
lrwxrwxrwx 1 root root  9 May 23 01:20 scsi-SATA_WDC_WD100EFAX-68_2TKL26ZD -> ../../sdk
lrwxrwxrwx 1 root root 10 May 23 01:20 scsi-SATA_WDC_WD100EFAX-68_2TKL26ZD-part1 -> ../../sdk1
lrwxrwxrwx 1 root root 10 May 23 01:20 scsi-SATA_WDC_WD100EFAX-68_2TKL26ZD-part9 -> ../../sdk9
lrwxrwxrwx 1 root root  9 May 23 02:35 scsi-SATA_WDC_WD100EFAX-68_2TKYZ3ND -> ../../sdi
lrwxrwxrwx 1 root root 10 May 23 02:35 scsi-SATA_WDC_WD100EFAX-68_2TKYZ3ND-part1 -> ../../sdi1
lrwxrwxrwx 1 root root 10 May 23 02:35 scsi-SATA_WDC_WD100EFAX-68_2TKYZ3ND-part9 -> ../../sdi9
lrwxrwxrwx 1 root root  9 May 23 00:35 scsi-SATA_WDC_WD100EFAX-68_2YG19ZMD -> ../../sdf
lrwxrwxrwx 1 root root 10 May 23 00:35 scsi-SATA_WDC_WD100EFAX-68_2YG19ZMD-part1 -> ../../sdf1
lrwxrwxrwx 1 root root 10 May 23 00:35 scsi-SATA_WDC_WD100EFAX-68_2YG19ZMD-part9 -> ../../sdf9
lrwxrwxrwx 1 root root  9 May 22 23:21 scsi-SATA_WDC_WD100EFAX-68_2YG1R7PD -> ../../sdd
lrwxrwxrwx 1 root root 10 May 22 23:21 scsi-SATA_WDC_WD100EFAX-68_2YG1R7PD-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 10 May 22 23:21 scsi-SATA_WDC_WD100EFAX-68_2YG1R7PD-part9 -> ../../sdd9
lrwxrwxrwx 1 root root  9 May 23 03:01 scsi-SATA_WDC_WD100EFAX-68_2YG4MA0D -> ../../sdj
lrwxrwxrwx 1 root root 10 May 23 03:01 scsi-SATA_WDC_WD100EFAX-68_2YG4MA0D-part1 -> ../../sdj1
lrwxrwxrwx 1 root root 10 May 23 03:01 scsi-SATA_WDC_WD100EFAX-68_2YG4MA0D-part9 -> ../../sdj9
lrwxrwxrwx 1 root root  9 May 23 00:35 scsi-SATA_WDC_WD100EFAX-68_2YK9BHKD -> ../../sde
lrwxrwxrwx 1 root root 10 May 23 00:35 scsi-SATA_WDC_WD100EFAX-68_2YK9BHKD-part1 -> ../../sde1
lrwxrwxrwx 1 root root 10 May 23 00:35 scsi-SATA_WDC_WD100EFAX-68_2YK9BHKD-part9 -> ../../sde9
lrwxrwxrwx 1 root root  9 May 23 00:04 scsi-SATA_WDC_WD100EFAX-68_2YK9PKUD -> ../../sdh
lrwxrwxrwx 1 root root 10 May 23 00:04 scsi-SATA_WDC_WD100EFAX-68_2YK9PKUD-part1 -> ../../sdh1
lrwxrwxrwx 1 root root 10 May 23 00:04 scsi-SATA_WDC_WD100EFAX-68_2YK9PKUD-part9 -> ../../sdh9
lrwxrwxrwx 1 root root  9 May 23 00:38 scsi-SATA_WDC_WD100EFAX-68_JEK0T76Z -> ../../sdg
lrwxrwxrwx 1 root root 10 May 23 00:38 scsi-SATA_WDC_WD100EFAX-68_JEK0T76Z-part1 -> ../../sdg1
lrwxrwxrwx 1 root root 10 May 23 00:38 scsi-SATA_WDC_WD100EFAX-68_JEK0T76Z-part9 -> ../../sdg9
lrwxrwxrwx 1 root root  9 May 23 01:28 wwn-0x5000c500a2e631c6 -> ../../sdc
lrwxrwxrwx 1 root root 10 May 23 01:28 wwn-0x5000c500a2e631c6-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 May 23 01:28 wwn-0x5000c500a2e631c6-part9 -> ../../sdc9
lrwxrwxrwx 1 root root  9 May 23 01:16 wwn-0x5000c500a2edebe0 -> ../../sdb
lrwxrwxrwx 1 root root 10 May 23 01:16 wwn-0x5000c500a2edebe0-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 May 23 01:16 wwn-0x5000c500a2edebe0-part9 -> ../../sdb9
lrwxrwxrwx 1 root root  9 May 23 00:38 wwn-0x5000cca267eaa17a -> ../../sdg
lrwxrwxrwx 1 root root 10 May 23 00:38 wwn-0x5000cca267eaa17a-part1 -> ../../sdg1
lrwxrwxrwx 1 root root 10 May 23 00:38 wwn-0x5000cca267eaa17a-part9 -> ../../sdg9
lrwxrwxrwx 1 root root  9 May 23 01:20 wwn-0x5000cca26aeb9280 -> ../../sdl
lrwxrwxrwx 1 root root 10 May 23 01:20 wwn-0x5000cca26aeb9280-part1 -> ../../sdl1
lrwxrwxrwx 1 root root 10 May 23 01:20 wwn-0x5000cca26aeb9280-part9 -> ../../sdl9
lrwxrwxrwx 1 root root  9 May 23 01:20 wwn-0x5000cca26af27d8b -> ../../sdk
lrwxrwxrwx 1 root root 10 May 23 01:20 wwn-0x5000cca26af27d8b-part1 -> ../../sdk1
lrwxrwxrwx 1 root root 10 May 23 01:20 wwn-0x5000cca26af27d8b-part9 -> ../../sdk9
lrwxrwxrwx 1 root root  9 May 23 02:35 wwn-0x5000cca26af7e655 -> ../../sdi
lrwxrwxrwx 1 root root 10 May 23 02:35 wwn-0x5000cca26af7e655-part1 -> ../../sdi1
lrwxrwxrwx 1 root root 10 May 23 02:35 wwn-0x5000cca26af7e655-part9 -> ../../sdi9
lrwxrwxrwx 1 root root  9 May 23 00:35 wwn-0x5000cca273c099dd -> ../../sdf
lrwxrwxrwx 1 root root 10 May 23 00:35 wwn-0x5000cca273c099dd-part1 -> ../../sdf1
lrwxrwxrwx 1 root root 10 May 23 00:35 wwn-0x5000cca273c099dd-part9 -> ../../sdf9
lrwxrwxrwx 1 root root  9 May 22 23:21 wwn-0x5000cca273c0c7e3 -> ../../sdd
lrwxrwxrwx 1 root root 10 May 22 23:21 wwn-0x5000cca273c0c7e3-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 10 May 22 23:21 wwn-0x5000cca273c0c7e3-part9 -> ../../sdd9
lrwxrwxrwx 1 root root  9 May 23 03:01 wwn-0x5000cca273c21a05 -> ../../sdj
lrwxrwxrwx 1 root root 10 May 23 03:01 wwn-0x5000cca273c21a05-part1 -> ../../sdj1
lrwxrwxrwx 1 root root 10 May 23 03:01 wwn-0x5000cca273c21a05-part9 -> ../../sdj9
lrwxrwxrwx 1 root root  9 May 23 00:35 wwn-0x5000cca273ee8907 -> ../../sde
lrwxrwxrwx 1 root root 10 May 23 00:35 wwn-0x5000cca273ee8907-part1 -> ../../sde1
lrwxrwxrwx 1 root root 10 May 23 00:35 wwn-0x5000cca273ee8907-part9 -> ../../sde9
lrwxrwxrwx 1 root root  9 May 23 00:04 wwn-0x5000cca273eeaed7 -> ../../sdh
lrwxrwxrwx 1 root root 10 May 23 00:04 wwn-0x5000cca273eeaed7-part1 -> ../../sdh1
lrwxrwxrwx 1 root root 10 May 23 00:04 wwn-0x5000cca273eeaed7-part9 -> ../../sdh9
lrwxrwxrwx 1 root root  9 May 22 23:19 wwn-0x5002538d40f8ba4c -> ../../sda
lrwxrwxrwx 1 root root 10 May 22 23:19 wwn-0x5002538d40f8ba4c-part1 -> ../../sda1

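So this raises a number of questions:

1) Why did my pool disappear? Is it because the symlinks in /dev/disk/by-id/ disappeared, so zfs could not find most of the disks?

2) Are the checksum errors something to worry about? The disks look perfectly fine. I only browsed a few directories and files while the pool was imported with the sdX references. If zfs imported the disks in the wrong order, could that have caused checksums to be rewritten incorrectly?

3) How do I get the missing /dev/disk/by-id/ata-* symlinks back? Did something change in Ubuntu 20.04 that made them disappear?

4) I thought referencing my disks via /dev/disk/by-id/ was a good idea, since those names would be stable. Is this not the best approach?

5) I don't like the wwn-* names, because they are not descriptive to me. I would prefer names that reflect the disk serial numbers, so that I can easily identify a disk if it needs replacing. I have set up aliases in /dev/disk/by-vdev/ (pointing at the wwn-* names), as suggested in http://kbdone.com/zfs-basics/#Consistent_device_IDs_via_vdev_idconf_file: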

$ cat /etc/zfs/vdev_id.conf
alias ST8000VN0022-2EL_ZA17H5D3 /dev/disk/by-id/wwn-0x5000c500a2edebe0
alias ST8000VN0022-2EL_ZA17FZXF /dev/disk/by-id/wwn-0x5000c500a2e631c6
alias WD100EFAX-68_2YG1R7PD /dev/disk/by-id/wwn-0x5000cca273c0c7e3
alias WD100EFAX-68_2YK9BHKD /dev/disk/by-id/wwn-0x5000cca273ee8907
...
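
A rough sketch of how alias lines like these could be generated from the by-id listing instead of typed by hand. It assumes each disk shows up under both a scsi-SATA_<model>_<serial> name and a wwn-* name pointing at the same sdX device, as in the listing above. make_aliases is a hypothetical helper name, not part of ZFS.

```shell
#!/bin/sh
# Sketch: derive vdev_id.conf alias lines from "ls -l /dev/disk/by-id" output.
# make_aliases is a made-up name; it reads the listing on stdin.
make_aliases() {
    awk '
        # Only whole-disk symlinks: "<name> -> <target>", skipping -partN entries.
        NF >= 3 && $(NF-1) == "->" && $(NF-2) !~ /-part[0-9]+$/ {
            name = $(NF-2); dev = $NF
            if (name ~ /^scsi-SATA_/) {
                sub(/^scsi-SATA_/, "", name)   # keep "<model>_<serial>"
                label[dev] = name
            } else if (name ~ /^wwn-/) {
                wwn[dev] = name
            }
        }
        END {
            # Emit one alias per device that has both a label and a wwn name.
            for (dev in wwn)
                if (dev in label)
                    print "alias " label[dev] " /dev/disk/by-id/" wwn[dev]
        }
    ' | sort
}

# Sample input copied from the listing above (one Seagate disk).
make_aliases <<'EOF'
lrwxrwxrwx 1 root root  9 May 23 01:16 scsi-SATA_ST8000VN0022-2EL_ZA17H5D3 -> ../../sdb
lrwxrwxrwx 1 root root 10 May 23 01:16 scsi-SATA_ST8000VN0022-2EL_ZA17H5D3-part1 -> ../../sdb1
lrwxrwxrwx 1 root root  9 May 23 01:16 wwn-0x5000c500a2edebe0 -> ../../sdb
lrwxrwxrwx 1 root root 10 May 23 01:16 wwn-0x5000c500a2edebe0-part1 -> ../../sdb1
EOF
```

On a live system the input would come from ls -l /dev/disk/by-id piped into make_aliases. Note that for the WD disks this keeps the WDC_ prefix, so the labels differ slightly from the hand-written ones above.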

Any ideas?

Thanks!

Edit: zpool status output after the scrub finished:

root@cloud:~# zpool status
  pool: downloadpool
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(5) for details.
  scan: scrub repaired 0B in 0 days 11:33:18 with 0 errors on Sun May 10 11:57:19 2020
config:

        NAME                                  STATE     READ WRITE CKSUM
        downloadpool                          ONLINE       0     0     0
          ata-WDC_WD100EFAX-68LHPN0_2YG1R7PD  ONLINE       0     0     0

errors: No known data errors

  pool: masterpool
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: scrub repaired 112K in 0 days 15:06:09 with 0 errors on Sat May 23 12:53:43 2020
config:

        NAME                                  STATE     READ WRITE CKSUM
        masterpool                            DEGRADED     0     0     0
          mirror-0                            DEGRADED     0     0     0
            wwn-0x5000cca26af27d8b            DEGRADED     0     0    15  too many errors
            wwn-0x5000cca273ee8907            ONLINE       0     0     0
          mirror-1                            DEGRADED     0     0     0
            wwn-0x5000cca26aeb9280            DEGRADED     0     0    18  too many errors
            wwn-0x5000cca273eeaed7            ONLINE       0     0     0
          mirror-2                            ONLINE       0     0     0
            wwn-0x5000cca273c21a05            ONLINE       0     0     0
            wwn-0x5000cca267eaa17a            ONLINE       0     0     0
          mirror-3                            ONLINE       0     0     0
            wwn-0x5000cca26af7e655            ONLINE       0     0     0
            wwn-0x5000cca273c099dd            ONLINE       0     0     0
          mirror-4                            ONLINE       0     0     0
            ata-ST8000VN0022-2EL112_ZA17FZXF  ONLINE       0     0     0
            ata-ST8000VN0022-2EL112_ZA17H5D3  ONLINE       0     0     0

errors: No known data errors
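
To pick out the affected devices from a long status listing like the one above, a small filter along these lines may help. It is only a sketch: cksum_errors is a made-up name, and it assumes the READ, WRITE and CKSUM columns are plain integers, which may not hold if zpool abbreviates large counts.

```shell
#!/bin/sh
# Sketch: print "<device> <cksum count>" for every row of zpool status
# output whose CKSUM column is non-zero. cksum_errors is a made-up name.
cksum_errors() {
    awk 'NF >= 5 && $3 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ && $5 ~ /^[0-9]+$/ && $5 + 0 > 0 {
        print $1, $5
    }'
}

# Sample rows copied from the status output above.
cksum_errors <<'EOF'
        NAME                                  STATE     READ WRITE CKSUM
        masterpool                            DEGRADED     0     0     0
          mirror-0                            DEGRADED     0     0     0
            wwn-0x5000cca26af27d8b            DEGRADED     0     0    15  too many errors
            wwn-0x5000cca273ee8907            ONLINE       0     0     0
          mirror-1                            DEGRADED     0     0     0
            wwn-0x5000cca26aeb9280            DEGRADED     0     0    18  too many errors
            wwn-0x5000cca273eeaed7            ONLINE       0     0     0
EOF
```

On a live system the input would be zpool status masterpool piped into cksum_errors.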

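Answer 1

I ran into exactly the same problem. Your post pointed me in the right direction. Here is what I found.

I have 6 drives: 2 drives in zfs pool "A", connected to the motherboard's SATA controller, and 4 drives in zfs pool "B", connected to my LSI SAS 9211 controller. The pools are set up to look for devices in /dev/disk/by-id.

After upgrading from Ubuntu 18.04 to Ubuntu 20.04, the device IDs of all disks connected to the SAS controller changed from ata-* to scsi-SATA*. After rebooting the server, zfs pool B was missing because zfs could not find the device IDs during import. The device IDs of the drives connected to the motherboard's SATA controller stayed the same, so the zfs pool using those drives could be imported and was not lost after the release upgrade.

Here is how I fixed the missing pool "B":

First I listed all pools available for import: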

sudo zpool import

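This listed my missing pool "B" with all the correct drives in it, but under the device names listed in /dev. So I imported the pool using the device IDs listed in /dev/disk/by-id. I got a warning that the pool might be active, so I had to force the import with -f, like this: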

sudo zpool import -f -d /dev/disk/by-id B

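Everything was back to normal and pool B was available again. I did not have to export the pool, because I never imported it without first telling zpool to use the device IDs. The device IDs in use are different now: wwn-*

I scrubbed the pool, and no errors turned up.

To answer your questions:

  1. I think the release upgrade from Ubuntu 18.04 to 20.04 caused the links in /dev/disk/by-id to change.

  2. I did not import the pool with the /dev references, and I did import it with the -f option. That is the difference between what you and I did. But I cannot imagine that this would be a problem unless the wrong drives were used.

  3. I did not get the old by-id disk links back. But because I imported the pool with the instruction to use disk IDs, it uses the new disk IDs, which is good enough for me. I do not need the old ones back.

  4. I still think referencing disks via /dev/disk/by-id/ is a good idea. These names are stable across reboots and when disks are physically moved within the server (I tested this). I am a bit disappointed that a release upgrade breaks the disk ID naming, but I am glad that in my case it could be fixed by importing the pool again.

  5. Same here. Thanks for the tip about using aliases! Maybe I will use that.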
