zpool status shows drives as FAULTED, but they appear to be fine. Can I add them back?

zpool status reports faulted drives, but they actually look fine. Is it possible to add them back?

$ dev/disk# zpool status -v
  pool: darkpool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
    invalid.  Sufficient replicas exist for the pool to continue
    functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: scrub in progress since Fri Nov  8 04:52:09 2019
    1004G scanned out of 47.5T at 81.4M/s, 166h22m to go
    0B repaired, 2.06% done
config:

    NAME                          STATE     READ WRITE CKSUM
    darkpool                      DEGRADED     0     0     0
      raidz3-0                    DEGRADED     0     0     0
        wwn-0x5000c5008581aafb    ONLINE       0     0     0
        wwn-0x5000c5008581b61b    ONLINE       0     0     0
        783034318520267027        FAULTED      0     0     0  was /dev/sdm1
        7369503050985789936       FAULTED      0     0     0  was /dev/sdj1
        wwn-0x5000c5008581b953    ONLINE       0     0     0
        wwn-0x5000c5008581bdf7    ONLINE       0     0     0
        wwn-0x5000c50085825ec7    ONLINE       0     0     0
        11744243917579175290      FAULTED      0     0     0  was /dev/sdg1
        wwn-0x5000c5008581e423    ONLINE       0     0     0
        wwn-0x5000c5008581fd3f    ONLINE       0     0     0
        wwn-0x5000c50085820b93    ONLINE       0     0     0
        wwn-0x5000c500858211b3    ONLINE       0     0     0
        wwn-0x5000cca267ab0de4    ONLINE       0     0     0
        spare-13                  DEGRADED     0     0     0
          11992420879588183985    FAULTED      0     0     0  was /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:10:0-part1
          wwn-0x5000c500858252ef  ONLINE       0     0     0
    spares
      wwn-0x5000c500858252ef      INUSE     currently in use

The faulted drives look fine:

$ sudo smartctl --all /dev/sdm1
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.0-66-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               SEAGATE
Product:              ST8000NM0075
Revision:             PS24
Compliance:           SPC-4
User Capacity:        8,001,563,222,016 bytes [8.00 TB]
Logical block size:   512 bytes
Physical block size:  4096 bytes
Formatted with type 2 protection
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000c50085820b93
Serial number:        ZA12CVG1
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Fri Nov  8 10:26:20 2019 EST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Disabled or Not Supported

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     58 C
Drive Trip Temperature:        60 C

Manufactured in week 23 of year 2016
Specified cycle count over device lifetime:  10000
Accumulated start-stop cycles:  148
Specified load-unload count over device lifetime:  300000
Accumulated load-unload cycles:  1344
Elements in grown defect list: 0

Vendor (Seagate) cache information
  Blocks sent to initiator = 2633993520
  Blocks received from initiator = 313335416
  Blocks read from cache and sent to initiator = 3189766298
  Number of read and write commands whose size <= segment size = 373006550
  Number of read and write commands whose size > segment size = 142985

Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 28987.73
  number of minutes until next internal SMART test = 48

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   574211145      105         0  574211250        105     242574.514           0
write:         0        0        17        17         17      18073.098           0
verify:   252916        0         0    252916          0          0.526           0

Non-medium error count:     1269

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                  96       4                 - [-   -    -]
# 2  Reserved(7)       Completed                  64       4                 - [-   -    -]

Long (extended) Self Test duration: 47220 seconds [787.0 minutes]

$ sudo smartctl --all /dev/sdj1
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.0-66-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               SEAGATE
Product:              ST8000NM0075
Revision:             PS24
Compliance:           SPC-4
User Capacity:        8,001,563,222,016 bytes [8.00 TB]
Logical block size:   512 bytes
Physical block size:  4096 bytes
Formatted with type 2 protection
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000c50085823d2b
Serial number:        ZA12BNXA
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Fri Nov  8 10:26:24 2019 EST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Disabled or Not Supported

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     47 C
Drive Trip Temperature:        60 C

Manufactured in week 23 of year 2016
Specified cycle count over device lifetime:  10000
Accumulated start-stop cycles:  148
Specified load-unload count over device lifetime:  300000
Accumulated load-unload cycles:  1364
Elements in grown defect list: 0

Vendor (Seagate) cache information
  Blocks sent to initiator = 4179446744
  Blocks received from initiator = 2703674280
  Blocks read from cache and sent to initiator = 2799660441
  Number of read and write commands whose size <= segment size = 334518430
  Number of read and write commands whose size > segment size = 131599

Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 28987.73
  number of minutes until next internal SMART test = 43

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   4216128253        9         0  4216128262          9     214344.135           0
write:         0        0         4         4          4      17073.614           0
verify:   269974        0         0    269974          0          0.562           0

Non-medium error count:      570

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                  96       4                 - [-   -    -]
# 2  Reserved(7)       Completed                  64       4                 - [-   -    -]

Long (extended) Self Test duration: 47220 seconds [787.0 minutes]

$ sudo smartctl --all /dev/sdg1
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.0-66-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               SEAGATE
Product:              ST8000NM0075
Revision:             PS24
Compliance:           SPC-4
User Capacity:        8,001,563,222,016 bytes [8.00 TB]
Logical block size:   512 bytes
Physical block size:  4096 bytes
Formatted with type 2 protection
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000c5008581aafb
Serial number:        ZA12CXW2
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Fri Nov  8 10:26:28 2019 EST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Disabled or Not Supported

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     59 C
Drive Trip Temperature:        60 C

Manufactured in week 23 of year 2016
Specified cycle count over device lifetime:  10000
Accumulated start-stop cycles:  148
Specified load-unload count over device lifetime:  300000
Accumulated load-unload cycles:  1334
Elements in grown defect list: 0

Vendor (Seagate) cache information
  Blocks sent to initiator = 2845390680
  Blocks received from initiator = 1453787448
  Blocks read from cache and sent to initiator = 3178782010
  Number of read and write commands whose size <= segment size = 376760133
  Number of read and write commands whose size > segment size = 148599

Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 28987.77
  number of minutes until next internal SMART test = 39

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   704945336        2         0  704945338          2     244917.683           0
write:         0        0        73        73         73      18665.495           0
verify:   320880        0         0    320880          0          0.667           0

Non-medium error count:     1242

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                  96       4                 - [-   -    -]
# 2  Reserved(7)       Completed                  64       4                 - [-   -    -]

Long (extended) Self Test duration: 47220 seconds [787.0 minutes]

They are all here.

Current:


sda     wwn-0x5000c500858211b3  
sdb     wwn-0x5000c5008581b953  
sdc     wwn-0x5000c50085825ec7  
sdd     wwn-0x5000c5008581e423  
sdf     wwn-0x5000c5008581b61b  
sdg     wwn-0x5000c5008581aafb  *
sdh     wwn-0x5000c5008581cc03  *
sdi     wwn-0x5000cca267ab0de4      
sdk     wwn-0x5000c5008581b933  *
sdl     wwn-0x5000c5008581bdf7  *
sdm     wwn-0x5000c50085820b93  *
sdn     wwn-0x5000c5008581b79f  *
sdo     wwn-0x5000c500858252ef  *
sdp     wwn-0x5000c5008581fd3f  
sdq     wwn-0x61866da05f3bc2001f1c1a0d117e72cf
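
A listing like the one above can be regenerated at any time, which helps because kernel names such as sdm can change between boots while the wwn-* names stay with the disk. The following is only a sketch: the function name `list_wwn_names` is hypothetical, and it assumes the usual udev layout where /dev/disk/by-id holds wwn-* symlinks pointing at the kernel devices.

```shell
#!/usr/bin/env bash
# Print "kernel-name<TAB>wwn-name" for every whole-disk wwn-* symlink.
# The directory argument is optional and defaults to /dev/disk/by-id.
# list_wwn_names is a hypothetical helper, not part of any ZFS or udev tool.
list_wwn_names() {
    local dir="${1:-/dev/disk/by-id}" link name target
    for link in "$dir"/wwn-*; do
        # Skip the unexpanded glob and anything that is not a symlink.
        [ -L "$link" ] || continue
        name=$(basename "$link")
        # Skip partition links such as wwn-0x...-part1.
        case "$name" in *-part[0-9]*) continue ;; esac
        # Resolve the symlink to the device node it points at.
        target=$(readlink -f "$link")
        printf '%s\t%s\n' "$(basename "$target")" "$name"
    done | sort
}
```

Running `list_wwn_names` with no argument and comparing the result with the "was /dev/sdX1" hints in zpool status shows which physical disk each old kernel name refers to now.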

Answer 1

What's in the kernel ring buffer? Can you post the relevant snippets of dmesg -T?

Try zpool clear to clear transient errors.

Are these all SAS disks? Or is there SATA mixed into your environment?
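
One way to answer the SAS/SATA question is to read the identification section that smartctl prints. The sketch below assumes the two header lines smartctl normally emits: "Transport protocol:" for SCSI/SAS devices (as in the output posted in the question) and "SATA Version is:" for ATA devices. The function name `classify_transport` is hypothetical.

```shell
#!/usr/bin/env bash
# Read `smartctl -i` output on stdin and print the transport it reports.
# classify_transport is a hypothetical helper name used for illustration.
classify_transport() {
    awk '
        # SCSI/SAS devices report e.g. "Transport protocol:   SAS (SPL-3)".
        /^Transport protocol:/ {
            sub(/^Transport protocol:[ \t]*/, "")
            print; found = 1; exit
        }
        # ATA devices report a "SATA Version is:" line instead.
        /^SATA Version is:/ { print "SATA"; found = 1; exit }
        # Neither line seen: do not guess.
        END { if (!found) print "unknown" }
    '
}
```

For example, `sudo smartctl -i /dev/sdX | classify_transport`, where sdX stands for each disk in turn.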


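Edit: for the SATA drive, try raising the device timeout, in case that is the cause:

echo 180 > /sys/block/sdX/device/timeout

where sdX is the device.

Then run zpool clear and see whether everything recovers properly.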
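
The timeout step can be wrapped in a small function that refuses to write if the sysfs file is missing and echoes the value back afterwards. This is a sketch only: `set_disk_timeout` and the `SYSFS_ROOT` override are hypothetical names introduced here, and the pool name in the usage comment is the one from the question.

```shell
#!/usr/bin/env bash
# Raise the SCSI command timeout (in seconds) for one disk and print the
# value read back from sysfs. SYSFS_ROOT is a hypothetical override that
# only exists so the function can be tried against a scratch directory;
# it defaults to /sys.
set_disk_timeout() {
    local dev="$1" seconds="${2:-180}" root="${SYSFS_ROOT:-/sys}"
    local knob="$root/block/$dev/device/timeout"
    if [ ! -w "$knob" ]; then
        echo "cannot write $knob (wrong device name, or not root?)" >&2
        return 1
    fi
    echo "$seconds" > "$knob"
    # Read the value back so a silently ignored write is noticed.
    cat "$knob"
}

# Example, run as root, with sdX replaced by the real device:
#   set_disk_timeout sdX 180 && zpool clear darkpool && zpool status -v darkpool
```

Values written to sysfs do not survive a reboot, so if the longer timeout helps it would need to be reapplied at boot.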
