zpool status reports faulted drives, but they actually look fine. Is it possible to re-add them?
$ dev/disk# zpool status -v
pool: darkpool
state: DEGRADED
status: One or more devices could not be used because the label is missing or
invalid. Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: http://zfsonlinux.org/msg/ZFS-8000-4J
scan: scrub in progress since Fri Nov 8 04:52:09 2019
1004G scanned out of 47.5T at 81.4M/s, 166h22m to go
0B repaired, 2.06% done
config:
NAME STATE READ WRITE CKSUM
darkpool DEGRADED 0 0 0
raidz3-0 DEGRADED 0 0 0
wwn-0x5000c5008581aafb ONLINE 0 0 0
wwn-0x5000c5008581b61b ONLINE 0 0 0
783034318520267027 FAULTED 0 0 0 was /dev/sdm1
7369503050985789936 FAULTED 0 0 0 was /dev/sdj1
wwn-0x5000c5008581b953 ONLINE 0 0 0
wwn-0x5000c5008581bdf7 ONLINE 0 0 0
wwn-0x5000c50085825ec7 ONLINE 0 0 0
11744243917579175290 FAULTED 0 0 0 was /dev/sdg1
wwn-0x5000c5008581e423 ONLINE 0 0 0
wwn-0x5000c5008581fd3f ONLINE 0 0 0
wwn-0x5000c50085820b93 ONLINE 0 0 0
wwn-0x5000c500858211b3 ONLINE 0 0 0
wwn-0x5000cca267ab0de4 ONLINE 0 0 0
spare-13 DEGRADED 0 0 0
11992420879588183985 FAULTED 0 0 0 was /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:10:0-part1
wwn-0x5000c500858252ef ONLINE 0 0 0
spares
wwn-0x5000c500858252ef INUSE currently in use
The faulted drives look fine:
$ sudo smartctl --all /dev/sdm1
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.0-66-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Vendor: SEAGATE
Product: ST8000NM0075
Revision: PS24
Compliance: SPC-4
User Capacity: 8,001,563,222,016 bytes [8.00 TB]
Logical block size: 512 bytes
Physical block size: 4096 bytes
Formatted with type 2 protection
LU is fully provisioned
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x5000c50085820b93
Serial number: ZA12CVG1
Device type: disk
Transport protocol: SAS (SPL-3)
Local Time is: Fri Nov 8 10:26:20 2019 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Disabled or Not Supported
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Current Drive Temperature: 58 C
Drive Trip Temperature: 60 C
Manufactured in week 23 of year 2016
Specified cycle count over device lifetime: 10000
Accumulated start-stop cycles: 148
Specified load-unload count over device lifetime: 300000
Accumulated load-unload cycles: 1344
Elements in grown defect list: 0
Vendor (Seagate) cache information
Blocks sent to initiator = 2633993520
Blocks received from initiator = 313335416
Blocks read from cache and sent to initiator = 3189766298
Number of read and write commands whose size <= segment size = 373006550
Number of read and write commands whose size > segment size = 142985
Vendor (Seagate/Hitachi) factory information
number of hours powered up = 28987.73
number of minutes until next internal SMART test = 48
Error counter log:
Errors Corrected by Total Correction Gigabytes Total
ECC rereads/ errors algorithm processed uncorrected
fast | delayed rewrites corrected invocations [10^9 bytes] errors
read: 574211145 105 0 574211250 105 242574.514 0
write: 0 0 17 17 17 18073.098 0
verify: 252916 0 0 252916 0 0.526 0
Non-medium error count: 1269
SMART Self-test log
Num Test Status segment LifeTime LBA_first_err [SK ASC ASQ]
Description number (hours)
# 1 Background short Completed 96 4 - [- - -]
# 2 Reserved(7) Completed 64 4 - [- - -]
Long (extended) Self Test duration: 47220 seconds [787.0 minutes]
$ sudo smartctl --all /dev/sdj1
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.0-66-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Vendor: SEAGATE
Product: ST8000NM0075
Revision: PS24
Compliance: SPC-4
User Capacity: 8,001,563,222,016 bytes [8.00 TB]
Logical block size: 512 bytes
Physical block size: 4096 bytes
Formatted with type 2 protection
LU is fully provisioned
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x5000c50085823d2b
Serial number: ZA12BNXA
Device type: disk
Transport protocol: SAS (SPL-3)
Local Time is: Fri Nov 8 10:26:24 2019 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Disabled or Not Supported
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Current Drive Temperature: 47 C
Drive Trip Temperature: 60 C
Manufactured in week 23 of year 2016
Specified cycle count over device lifetime: 10000
Accumulated start-stop cycles: 148
Specified load-unload count over device lifetime: 300000
Accumulated load-unload cycles: 1364
Elements in grown defect list: 0
Vendor (Seagate) cache information
Blocks sent to initiator = 4179446744
Blocks received from initiator = 2703674280
Blocks read from cache and sent to initiator = 2799660441
Number of read and write commands whose size <= segment size = 334518430
Number of read and write commands whose size > segment size = 131599
Vendor (Seagate/Hitachi) factory information
number of hours powered up = 28987.73
number of minutes until next internal SMART test = 43
Error counter log:
Errors Corrected by Total Correction Gigabytes Total
ECC rereads/ errors algorithm processed uncorrected
fast | delayed rewrites corrected invocations [10^9 bytes] errors
read: 4216128253 9 0 4216128262 9 214344.135 0
write: 0 0 4 4 4 17073.614 0
verify: 269974 0 0 269974 0 0.562 0
Non-medium error count: 570
SMART Self-test log
Num Test Status segment LifeTime LBA_first_err [SK ASC ASQ]
Description number (hours)
# 1 Background short Completed 96 4 - [- - -]
# 2 Reserved(7) Completed 64 4 - [- - -]
Long (extended) Self Test duration: 47220 seconds [787.0 minutes]
$ sudo smartctl --all /dev/sdg1
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.0-66-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Vendor: SEAGATE
Product: ST8000NM0075
Revision: PS24
Compliance: SPC-4
User Capacity: 8,001,563,222,016 bytes [8.00 TB]
Logical block size: 512 bytes
Physical block size: 4096 bytes
Formatted with type 2 protection
LU is fully provisioned
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x5000c5008581aafb
Serial number: ZA12CXW2
Device type: disk
Transport protocol: SAS (SPL-3)
Local Time is: Fri Nov 8 10:26:28 2019 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Disabled or Not Supported
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Current Drive Temperature: 59 C
Drive Trip Temperature: 60 C
Manufactured in week 23 of year 2016
Specified cycle count over device lifetime: 10000
Accumulated start-stop cycles: 148
Specified load-unload count over device lifetime: 300000
Accumulated load-unload cycles: 1334
Elements in grown defect list: 0
Vendor (Seagate) cache information
Blocks sent to initiator = 2845390680
Blocks received from initiator = 1453787448
Blocks read from cache and sent to initiator = 3178782010
Number of read and write commands whose size <= segment size = 376760133
Number of read and write commands whose size > segment size = 148599
Vendor (Seagate/Hitachi) factory information
number of hours powered up = 28987.77
number of minutes until next internal SMART test = 39
Error counter log:
Errors Corrected by Total Correction Gigabytes Total
ECC rereads/ errors algorithm processed uncorrected
fast | delayed rewrites corrected invocations [10^9 bytes] errors
read: 704945336 2 0 704945338 2 244917.683 0
write: 0 0 73 73 73 18665.495 0
verify: 320880 0 0 320880 0 0.667 0
Non-medium error count: 1242
SMART Self-test log
Num Test Status segment LifeTime LBA_first_err [SK ASC ASQ]
Description number (hours)
# 1 Background short Completed 96 4 - [- - -]
# 2 Reserved(7) Completed 64 4 - [- - -]
Long (extended) Self Test duration: 47220 seconds [787.0 minutes]
They are all here.
Current:
sda wwn-0x5000c500858211b3
sdb wwn-0x5000c5008581b953
sdc wwn-0x5000c50085825ec7
sdd wwn-0x5000c5008581e423
sdf wwn-0x5000c5008581b61b
sdg wwn-0x5000c5008581aafb *
sdh wwn-0x5000c5008581cc03 *
sdi wwn-0x5000cca267ab0de4
sdk wwn-0x5000c5008581b933 *
sdl wwn-0x5000c5008581bdf7 *
sdm wwn-0x5000c50085820b93 *
sdn wwn-0x5000c5008581b79f *
sdo wwn-0x5000c500858252ef *
sdp wwn-0x5000c5008581fd3f
sdq wwn-0x61866da05f3bc2001f1c1a0d117e72cf
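For anyone who wants to reproduce a listing like the one above, here is a minimal sketch that reads the wwn symlinks udev normally places under /dev/disk/by-id. The function name map_wwn and its optional directory argument are my own, added only so it can be tried against a scratch directory. It assumes whole-disk links are named wwn-* and partition links end in -partN.

```shell
# Print "sdX wwn-0x..." for each whole-disk wwn symlink in a by-id directory.
# map_wwn and its directory argument are hypothetical helpers, not part of ZFS or udev.
map_wwn() {
    local dir="${1:-/dev/disk/by-id}"
    local link target
    for link in "$dir"/wwn-*; do
        [ -L "$link" ] || continue                        # glob matched nothing, or not a symlink
        case "$link" in *-part[0-9]*) continue ;; esac    # skip partition links
        target=$(readlink "$link")                        # e.g. ../../sdm
        printf '%s %s\n' "${target##*/}" "${link##*/}"    # kernel name, then wwn id
    done | sort
}
```

Running map_wwn with no argument reads /dev/disk/by-id. Comparing its output with the ids in zpool status shows which kernel name each pool member currently has.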
Answer 1
What's in the kernel ring buffer? Can you post the relevant snippets from dmesg -T?
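One way to cut the dmesg -T output down to the relevant snippets is to keep only the lines that mention the affected disks. This is a rough sketch. filter_disk_events is a name I made up, and it assumes the kernel messages reference disks by their short names (sdm, sdj, sdg, and so on).

```shell
# Keep only the log lines on stdin that mention one of the given device names
# as a whole word. filter_disk_events is a hypothetical helper name.
filter_disk_events() {
    if [ "$#" -eq 0 ]; then
        echo "usage: filter_disk_events DEV [DEV...]" >&2
        return 1
    fi
    local pattern
    pattern=$(IFS='|'; printf '%s' "$*")   # join the names into "sdm|sdj|sdg"
    grep -wE "$pattern"
}

# Example: dmesg -T | filter_disk_events sdm sdj sdg
```

Because the match is on whole words, sdm will not also pull in lines that only mention sdm1. Add the partition names explicitly if you want those too.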
Try zpool clear to clear any transient errors.
Are these all SAS disks? Or is there SATA mixed into your environment?
Edit: For the SATA drive, try raising the device timeout:
echo 180 > /sys/block/sdX/device/timeout
where sdX is the device.
Then run zpool clear and see whether everything comes back properly.
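If more than one disk needs the longer timeout, the echo can be wrapped in a small loop. This is only a sketch. set_disk_timeout and the SYSFS_BLOCK override are names I invented, the override exists purely so the function can be tried against a scratch directory, and the real write needs root. The pool name darkpool comes from the zpool status output in the question.

```shell
# Write a new command timeout (in seconds) for each named disk.
# set_disk_timeout and SYSFS_BLOCK are hypothetical; the real path is /sys/block/sdX/device/timeout.
set_disk_timeout() {
    local seconds="$1"; shift
    local root="${SYSFS_BLOCK:-/sys/block}"
    local dev file
    for dev in "$@"; do
        file="$root/$dev/device/timeout"
        if [ -w "$file" ]; then
            echo "$seconds" > "$file"
            echo "$dev: timeout now $(cat "$file")s"
        else
            echo "$dev: $file not writable, skipped" >&2
        fi
    done
}

# Example (as root): set_disk_timeout 180 sdX && zpool clear darkpool
```

A value written this way does not survive a reboot, so it would need a udev rule or boot script if it turns out to help.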