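I was asked to replace a failing hard disk that was used as the recording device in a TV setup.
(A 2.5-inch hard disk with a simple USB interface, in a two-part plastic enclosure.)
Since it has a USB Type-A cable, it was simply a matter of plugging it into an Ubuntu laptop.
Additional note: the USB connection seems to work fine, and the device appears to have storage media problems only. So, here is what can be seen...
The largest files appear to be 200 MB chunks of encrypted stream data. The remaining files are most likely assorted metadata; I won't even try to decrypt any of it. The recordings are a random set of TV shows that take up 7.5% of the space.
"Disks" says:
- Model: TOSHIBA MQ01ABD050V -63 (AX0N1Q)
- Partitioning: 500 GB, Master Boot Record, 17 MB free space, then a 500 GB ext4 (version 1.0) partition.
- Assessment: Disk is OK, 16376 bad sectors (29° C / 84° F)
Is there any explanation beyond "read errors that have escalated many times over"?
I suspect the "driver" behind the damage is the small (even tiny), fully sealed enclosure with no vents, which causes heat dissipation problems.
It may also have been subjected to shocks, since the device sat next to the TV for two years. Someone dusts around it and, oops, it falls to the floor.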
$ sudo smartctl -a /dev/sdb
[sudo] password for hannu:
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.13.0-37-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: TOSHIBA MQ01ABD050V -63
Serial Number: 885YC2J1TF6G
LU WWN Device Id: 5 000039 8b43822ba
Firmware Version: AX0N1Q
User Capacity: 500 107 862 016 bytes [500 GB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Form Factor: 2.5 inches
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ATA8-ACS (minor revision not indicated)
SATA Version is: SATA 2.6, 3.0 Gb/s (current: 1.5 Gb/s)
Local Time is: Wed Mar 30 19:53:04 2022 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 120) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 115) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 084 050 Pre-fail Always - 0
2 Throughput_Performance 0x0005 100 100 050 Pre-fail Offline - 0
3 Spin_Up_Time 0x0027 100 100 001 Pre-fail Always - 1125
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 200
5 Reallocated_Sector_Ct 0x0033 100 100 050 Pre-fail Always - 10288
7 Seek_Error_Rate 0x000b 100 100 050 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 100 100 050 Pre-fail Offline - 0
9 Power_On_Hours 0x0032 033 033 000 Old_age Always - 26898
10 Spin_Retry_Count 0x0033 103 100 030 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 200
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 3
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 185
193 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 200
194 Temperature_Celsius 0x0022 100 100 000 Old_age Always - 27 (Min/Max 22/58)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 854
197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 6088
198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
220 Disk_Shift 0x0002 100 100 000 Old_age Always - 0
222 Loaded_Hours 0x0032 033 033 000 Old_age Always - 26898
223 Load_Retry_Count 0x0032 100 100 000 Old_age Always - 0
224 Load_Friction 0x0022 100 100 000 Old_age Always - 0
226 Load-in_Time 0x0026 100 100 000 Old_age Always - 178
240 Head_Flying_Hours 0x0001 100 100 001 Pre-fail Offline - 0
SMART Error Log Version: 1
ATA Error Count: 467 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 467 occurred at disk power-on lifetime: 26805 hours (1116 days + 21 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 b8 f0 73 13 4d Error: UNC 184 sectors at LBA = 0x0d1373f0 = 219378672
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 d5 08 a0 73 13 40 00 06:40:51.442 READ DMA EXT
25 d5 c0 e8 72 13 40 00 06:40:51.333 READ DMA EXT
25 d5 98 58 71 13 40 00 06:40:51.137 READ DMA EXT
25 d5 88 d8 6f 13 40 00 06:40:50.928 READ DMA EXT
25 d5 d0 10 6e 13 40 00 06:40:50.728 READ DMA EXT
Error 466 occurred at disk power-on lifetime: 26805 hours (1116 days + 21 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 18 e0 74 13 4d Error: UNC 24 sectors at LBA = 0x0d1374e0 = 219378912
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 d5 18 e0 74 13 40 00 06:38:34.673 READ DMA EXT
25 d5 48 a0 73 13 40 00 06:38:31.303 READ DMA EXT
25 d5 c0 e8 72 13 40 00 06:38:31.292 READ DMA EXT
25 d5 40 b0 71 13 40 00 06:38:31.083 READ DMA EXT
25 d5 30 88 6f 13 40 00 06:38:30.890 READ DMA EXT
Error 465 occurred at disk power-on lifetime: 26805 hours (1116 days + 21 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 f8 f0 73 13 4d Error: UNC 248 sectors at LBA = 0x0d1373f0 = 219378672
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 d5 48 a0 73 13 40 00 06:38:31.303 READ DMA EXT
25 d5 c0 e8 72 13 40 00 06:38:31.292 READ DMA EXT
25 d5 40 b0 71 13 40 00 06:38:31.083 READ DMA EXT
25 d5 30 88 6f 13 40 00 06:38:30.890 READ DMA EXT
25 d5 b8 d8 6d 13 40 00 06:38:30.688 READ DMA EXT
Error 464 occurred at disk power-on lifetime: 26798 hours (1116 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 06 c2 76 06 40 Error: UNC 6 sectors at LBA = 0x000676c2 = 423618
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 06 c2 76 06 40 00 00:00:20.982 READ DMA EXT
25 00 01 c1 76 06 40 00 00:00:17.605 READ DMA EXT
25 00 01 c0 76 06 40 00 00:00:14.221 READ DMA EXT
25 00 20 c0 76 06 40 00 00:00:10.840 READ DMA EXT
25 00 08 b8 76 06 40 00 00:00:10.839 READ DMA EXT
Error 463 occurred at disk power-on lifetime: 26798 hours (1116 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 01 c1 76 06 40 Error: UNC 1 sectors at LBA = 0x000676c1 = 423617
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 01 c1 76 06 40 00 00:00:17.605 READ DMA EXT
25 00 01 c0 76 06 40 00 00:00:14.221 READ DMA EXT
25 00 20 c0 76 06 40 00 00:00:10.840 READ DMA EXT
25 00 08 b8 76 06 40 00 00:00:10.839 READ DMA EXT
25 00 20 90 76 06 40 00 00:00:10.838 READ DMA EXT
SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
$ smartctl -P showall /dev/sdb1
No presets are defined for this drive. Its identity strings:
MODEL: /dev/sdb1
FIRMWARE: (any)
do not match any of the known regular expressions.
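Answer 1
Hannu, don't trust those silly one-dimensional assessments (red, yellow, green) or one-phrase conclusions such as
Assessment: Disk is OK, 16376 bad sectors (29° C / 84° F)
A disk with 16376 bad sectors is not OK, because that points to a sharply reduced life expectancy.
Likewise, 6088 pending sectors that cannot be read are not OK either.
Your temperature may be 29°C now, but it has reached 58°C, and we don't know for how long. You have 6088 unreadable sectors, in addition to 10288 sectors that have already been reallocated. I would replace a drive as soon as the first unreadable sector appears.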
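The number that Disks reports lines up with two raw values from the smartctl output, which is quick to check. This is only an arithmetic sketch: that Disks adds the reallocated and pending counts is my assumption, based on the totals matching, and the byte estimate assumes the 512-byte logical sector size shown in the report.

```python
# Raw values copied from the smartctl report in the question.
reallocated = 10288          # attribute 5, Reallocated_Sector_Ct
pending = 6088               # attribute 197, Current_Pending_Sector
logical_sector_bytes = 512   # "Sector Sizes: 512 bytes logical"

# Assumption: the "bad sectors" figure in Disks is the sum of both counts.
bad_total = reallocated + pending
affected_bytes = bad_total * logical_sector_bytes

print(bad_total)        # 16376
print(affected_bytes)   # 8384512
```

In absolute terms that is only about 8 MB of a 500 GB disk, so the concern is the fact that sectors keep failing rather than the capacity lost.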
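The G-sense attribute (G-Sense_Error_Rate) may indicate that the drive was dropped or jolted 3 times. Unfortunately, I have no experience with this particular attribute.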
Here is the relevant part of the report that documents the damage:
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
5 Reallocated_Sector_Ct 0x0033 100 100 050 Pre-fail Always - 10288
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 3
194 Temperature_Celsius 0x0022 100 100 000 Old_age Always - 27 (Min/Max 22/58)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 854
197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 6088
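If you want to pull these rows out of a saved report instead of reading them by eye, something like the following would do. It is a minimal sketch: the watch list of attribute IDs is my own choice and not something smartctl defines, and the parser assumes the ten-column layout shown above.

```python
import re

# Hypothetical watch list: attributes whose raw value counts damaged or suspect sectors.
WATCH = {
    5: "Reallocated_Sector_Ct",
    196: "Reallocated_Event_Count",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}

REPORT = """\
5 Reallocated_Sector_Ct 0x0033 100 100 050 Pre-fail Always - 10288
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 3
194 Temperature_Celsius 0x0022 100 100 000 Old_age Always - 27 (Min/Max 22/58)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 854
197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 6088
"""


def raw_values(text):
    """Map attribute ID to the leading integer of its RAW_VALUE column."""
    values = {}
    for line in text.splitlines():
        fields = line.split()
        # An attribute row starts with a numeric ID and has at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            match = re.match(r"\d+", fields[9])
            if match:
                values[int(fields[0])] = int(match.group())
    return values


def sector_warnings(text):
    """Return the watched attributes that have a non-zero raw value."""
    raw = raw_values(text)
    return {WATCH[i]: raw[i] for i in WATCH if raw.get(i, 0) > 0}


print(sector_warnings(REPORT))
```

Any non-empty result here is a reason to copy the data off the disk, whatever the overall health line says.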
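Conclusion:
Clone your drive with ddrescue, or send it to a professional recovery lab!
P.S. If you do clone the drive with ddrescue, could you link the log file (mapfile)? Then harrymc can reconsider his claim. Thanks.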
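For reference, a common way to run ddrescue is a fast first pass followed by a retry pass over the bad areas, both sharing one mapfile. The sketch below only builds and prints the two command lines, it does not run them. The image and mapfile names are hypothetical, /dev/sdb is taken from the question, and the destination is assumed to be on a different, healthy disk with enough free space.

```python
import shlex


def ddrescue_commands(source, image, mapfile, retries=3):
    """Build two ddrescue invocations that share one mapfile."""
    # First pass: -n skips the slow scraping phase and grabs the easy areas.
    first_pass = ["ddrescue", "-n", source, image, mapfile]
    # Second pass: -d uses direct disc access, -rN retries the bad areas N times.
    retry_pass = ["ddrescue", "-d", f"-r{retries}", source, image, mapfile]
    return [first_pass, retry_pass]


# Hypothetical output file names; adjust them to your own setup.
for cmd in ddrescue_commands("/dev/sdb", "tv-disk.img", "tv-disk.map"):
    print("sudo " + shlex.join(cmd))
```

The mapfile is what lets the second pass resume where the first one stopped, and it is also the file I am asking for above.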
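Answer 2
The disk's SMART indicators show no errors, no bad sectors, nothing. As far as they are concerned, the disk is in good condition.
For the downvoters who don't understand SMART, here is a quote from the NTFS.com article on SMART attributes:
Attribute values range from 1 to 253 (1 being the worst case and 253 the best). Depending on the manufacturer, a value of 100 or 200 is usually chosen as the "normal" value.
For most attributes, values above this threshold are good and mean there are no errors.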
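The comparison described in the quote can be sketched in a few lines. This is an illustration and not how smartctl is implemented: the `Attr` tuple and `failed` helper are hypothetical names, the numbers are copied from the report in the question, and the rule used is that an attribute fails when its normalized value is at or below its threshold.

```python
from typing import NamedTuple


class Attr(NamedTuple):
    name: str
    value: int    # normalized VALUE column
    thresh: int   # THRESH column
    raw: int      # RAW_VALUE column, not used by the check


def failed(attr: Attr) -> bool:
    """An attribute fails when its normalized value is at or below the threshold."""
    return attr.value <= attr.thresh


ATTRS = [
    Attr("Reallocated_Sector_Ct", 100, 50, 10288),
    Attr("Current_Pending_Sector", 100, 0, 6088),
    Attr("Power_On_Hours", 33, 0, 26898),
]

for attr in ATTRS:
    print(attr.name, "failed" if failed(attr) else "ok", "raw:", attr.raw)
```

Note that the check looks only at the normalized column; the raw counts are carried along for reference and play no part in the result.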
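It is worth noting that you do have 467 ATA errors, logged against READ DMA EXT commands.
According to the article "ATA errors increasing on disk in ReadyNAS":
ATA errors occur when the ReadyNAS's SATA controller is unable to communicate with the hard disk.
The SATA controller of the ReadyNAS sends commands to the hard disk. When the controller cannot communicate with the disk, this may be due to an internal hardware error in the disk itself, and the disk may need to be replaced.
This basically means there is a problem with the connection between the motherboard and the disk.
Such errors accumulate over the lifetime of the disk, and the timestamps contain no date, so there is no way to tell when the errors occurred.
This could be caused by a damaged SATA cable or by a problem with the disk. Try a new cable and run a SMART test using smartctl. That can determine whether the disk is really failing.
Keep a close eye on the ATA error count to see whether it is still increasing.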
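One way to keep an eye on it is to save the count and compare it with a later reading. The sketch below parses the "ATA Error Count:" line as it appears in the output above; the helper names are hypothetical, and the sample text stands in for real output that you would capture yourself, for example from `sudo smartctl -l error /dev/sdb`.

```python
import re
from typing import Optional


def ata_error_count(smartctl_output: str) -> Optional[int]:
    """Return the number on the 'ATA Error Count:' line, or None if it is absent."""
    match = re.search(r"^ATA Error Count:\s*(\d+)", smartctl_output, re.MULTILINE)
    return int(match.group(1)) if match else None


def has_increased(previous: int, current_output: str) -> bool:
    """True when the count in current_output is higher than a saved reading."""
    current = ata_error_count(current_output)
    return current is not None and current > previous


# Sample text copied from the report in the question.
sample = (
    "SMART Error Log Version: 1\n"
    "ATA Error Count: 467 (device log contains only the most recent five errors)\n"
)

print(ata_error_count(sample))     # 467
print(has_increased(467, sample))  # False
```

If the count goes up after swapping the cable, the cable was probably not the cause.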