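Today my home server had a kernel panic caused by a problem with the system drive. I replaced the drive and restored the server, and now I'm trying to figure out what went wrong with the old drive. It is quite old, so I suspect a hardware failure, but I'd still like to learn something about recovery techniques (and find out why SMART didn't warn me). The old drive now shows up as /dev/sdb and LVM is detected on it, so I renamed its ubuntu-vg to ubuntu-vg-old and activated it.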
root@calcium:~# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
ubuntu-lv ubuntu-vg -wi-ao---- <29.06g
backups ubuntu-vg-old -wi-a----- 1.29t
ubuntu-lv ubuntu-vg-old -wi-a----- 200.00g
Unfortunately, mounting it doesn't work. The command fails after a long timeout and leaves the drive inaccessible:
root@calcium:~# mount /dev/ubuntu-vg-old/ubuntu-lv /mnt -o ro,user
mount: /mnt: can't read superblock on /dev/mapper/ubuntu--vg--old-ubuntu--lv.
root@calcium:~# pvscan
Error reading device /dev/sdb at 0 length 512.
Error reading device /dev/sdb at 0 length 4096.
Error reading device /dev/sdb1 at 0 length 4096.
Error reading device /dev/sdb2 at 0 length 4096.
Error reading device /dev/sdb3 at 0 length 4096.
PV /dev/sda3 VG ubuntu-vg lvm2 [58.12 GiB / 29.06 GiB free]
Total: 1 [58.12 GiB] / in use: 1 [58.12 GiB] / in no VG: 0 [0 ]
After a reboot (I couldn't find any other way to make it accessible again), the drive was back. I tried to repair it:
root@calcium:~# fsck /dev/mapper/ubuntu--vg--old-ubuntu--lv
fsck from util-linux 2.36.1
e2fsck 1.46.3 (27-Jul-2021)
/dev/mapper/ubuntu--vg--old-ubuntu--lv: recovering journal
fsck.ext4: Input/output error while trying to re-open /dev/mapper/ubuntu--vg--old-ubuntu--lv
/dev/mapper/ubuntu--vg--old-ubuntu--lv: ********** WARNING: Filesystem still has errors **********
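But this ends exactly like the mount attempt: a long timeout, and then the drive is removed from the system. I ran a SMART offline surface test overnight (smartctl -t offline /dev/sdb); it found no problems and didn't change any of the offline SMART attributes. A badblocks read test also ran fine, with no errors: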
root@calcium:~# badblocks -b 4096 -c 1024 -s -o bb.out /dev/sdb
Checking for bad blocks (read-only test): done
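So I tried a non-destructive read-write test with badblocks (badblocks -b 4096 -c 1024 -s -n -v /dev/sdb), and after about half an hour the drive dropped off the system again. I have already replaced the SATA cable and connected the drive to a different port. Apparently the problem only occurs when writing to specific sectors.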
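One possible way to narrow this down (not something I ran above) would be to restrict the write test to the area around the single LBA that the SMART self-test log further down reports as a read failure, 2541336840. The following is only a sketch, assuming 512-byte logical sectors and the same 4096-byte block size passed to badblocks. `lba_to_block` and `window_cmd` are hypothetical helper names, and the margin of 1000 blocks is arbitrary. It prints the command instead of running it, since a write test may knock the drive offline again.

```shell
#!/bin/sh
# Convert a 512-byte-sector LBA (as printed by smartctl) into a block
# number for a given badblocks block size.
# Assumption: the drive uses 512-byte logical sectors.
lba_to_block() {
    lba=$1
    block_size=${2:-4096}
    echo $(( lba * 512 / block_size ))
}

# Print (do not run) a non-destructive badblocks command that covers a
# window of blocks around the given LBA.
# badblocks takes its range as: device last-block first-block
window_cmd() {
    dev=$1
    lba=$2
    margin=${3:-1000}
    center=$(lba_to_block "$lba" 4096)
    first=$(( center - margin ))
    [ "$first" -lt 0 ] && first=0
    last=$(( center + margin ))
    echo "badblocks -b 4096 -s -n -v $dev $last $first"
}

window_cmd /dev/sdb 2541336840
```

With the LBA from the log, block 2541336840 * 512 / 4096 = 317667105, so the printed range runs from block 317666105 to 317668105. Whether the write failures are anywhere near that LBA is a guess on my part.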
Is there anything else I can try before doing a full format (which I guess will most likely fail as well)?
SMART data:
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 100 100 051 Pre-fail Always - 414
2 Throughput_Performance 0x0026 055 051 000 Old_age Always - 18840
3 Spin_Up_Time 0x0023 077 066 025 Pre-fail Always - 7179
4 Start_Stop_Count 0x0032 094 094 000 Old_age Always - 6274
5 Reallocated_Sector_Ct 0x0033 252 252 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 252 252 051 Old_age Always - 0
8 Seek_Time_Performance 0x0024 252 252 015 Old_age Offline - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 31668
10 Spin_Retry_Count 0x0032 252 252 051 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 100 000 Old_age Always - 2
12 Power_Cycle_Count 0x0032 098 098 000 Old_age Always - 2286
181 Program_Fail_Cnt_Total 0x0022 100 100 000 Old_age Always - 19262840
191 G-Sense_Error_Rate 0x0022 099 099 000 Old_age Always - 11132
192 Power-Off_Retract_Count 0x0022 252 252 000 Old_age Always - 0
194 Temperature_Celsius 0x0002 064 044 000 Old_age Always - 35 (Min/Max 14/56)
195 Hardware_ECC_Recovered 0x003a 100 100 000 Old_age Always - 0
196 Reallocated_Event_Count 0x0032 252 252 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 087 083 000 Old_age Always - 1617
198 Offline_Uncorrectable 0x0030 252 084 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0036 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x002a 100 100 000 Old_age Always - 235
223 Load_Retry_Count 0x0032 100 100 000 Old_age Always - 2
225 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 6320
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 31656 -
# 2 Short offline Completed without error 00% 31632 -
# 3 Short offline Completed: read failure 10% 31608 2541336840
# 4 Extended offline Completed without error 00% 31587 -
# 5 Short offline Completed without error 00% 31560 -
# 6 Short offline Completed without error 00% 31536 -
# 7 Short offline Completed without error 00% 31512 -
# 8 Short offline Completed without error 00% 31488 -
# 9 Short offline Completed without error 00% 31464 -
#10 Short offline Completed without error 00% 31440 -
#11 Extended offline Completed without error 00% 31419 -
#12 Short offline Completed without error 00% 31392 -
#13 Short offline Completed without error 00% 31368 -
#14 Short offline Completed without error 00% 31344 -
#15 Short offline Completed without error 00% 31320 -
#16 Short offline Completed without error 00% 31296 -
#17 Short offline Completed without error 00% 31272 -
#18 Extended offline Completed without error 00% 31251 -
#19 Short offline Completed without error 00% 31224 -
#20 Short offline Completed without error 00% 31200 -
#21 Short offline Completed without error 00% 31176 -
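On the question of why SMART never warned me: as far as I understand, the overall PASSED verdict depends on normalized values crossing their thresholds, and attribute 197 (Current_Pending_Sector) has a threshold of 000, so its raw count of 1617 would never trip it. A minimal sketch of a check that looks at raw values instead, reading `smartctl -A` style text on stdin. The helper name `flag_sector_attrs` and the choice of attribute IDs (5, 196, 197, 198) are my own assumptions, not a smartctl feature.

```shell
#!/bin/sh
# Read `smartctl -A` style attribute lines on stdin and print the
# sector-related attributes (IDs 5, 196, 197, 198) whose RAW_VALUE
# (10th column) is greater than zero.
flag_sector_attrs() {
    awk '$1 ~ /^(5|196|197|198)$/ && $10 + 0 > 0 {
        printf "%s %s raw=%s\n", $1, $2, $10
    }'
}

# Example using two lines copied from the attribute table above.
flag_sector_attrs <<'EOF'
  5 Reallocated_Sector_Ct 0x0033 252 252 010 Pre-fail Always - 0
197 Current_Pending_Sector 0x0032 087 083 000 Old_age Always - 1617
EOF
```

For the two sample lines this prints only the Current_Pending_Sector line. Against a live drive it could be fed from `smartctl -A /dev/sdb | flag_sector_attrs`, for example from a cron job.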