Is my hard drive failing? / Need help with smartctl -a output


I have an old Seagate 4TB internal hard drive from a retired computer, and I plan to repurpose it as a secondary drive for games.

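To be safe, I wanted to run some smartctl scans on it first, so I ran smartctl -t short /dev/sdb and got the results. They looked fine to me, because I didn't see anything listed in the "WHEN_FAILED" column (initially I was mainly concerned about temperature-related errors). But then I saw an article from 2018 saying that "Current_Pending_Sector" is very serious... and mine is non-zero... and I do have some errors... Since I can't really tell whether I should care about them, I figured I'd try SE.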

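My best guess so far is that I shouldn't put anything critical on it, but that it might be fine for games if I symlink the save folders so they live elsewhere (on a drive with better SMART results) and I don't mind re-downloading the installed games should the drive fail. I'm also not sure whether the "READ DMA EXT" errors indicate impending failure, or whether they could be down to a cable or some other one-off event (I can only see errors 35-39, all of which occurred at "16936 hours"... not sure if there is a way to see all the errors, or if only the last 5 are stored, as it says). OTOH, I had no problems mounting it or copying data off it (it belonged to a relative who no longer wanted it; there were just some pictures/videos on it).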

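If there is at least a decent chance that the drive has some life left in it, I don't mind using it for less important things. But if it is likely to fail in the near future, I'd rather not waste any time on it, beyond getting a new magnet out of it :-) Any advice/suggestions?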

Anyway, I then ran smartctl -t long /dev/sdb and waited until the next day to run smartctl -a /dev/sdb. Here are the results:

I_AM_ROOT@fedora35:~
# smartctl -a /dev/sdb
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.7-200.fc35.x86_64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Desktop HDD.15
Device Model:     ST4000DM000-1F2168
Serial Number:    <Redacted>
LU WWN Device Id: <Redacted>
Firmware Version: CC54
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5900 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Fri Dec 17 11:44:49 2021 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 118) The previous self-test completed having
                    the read element of the test failed.
Total time to complete Offline 
data collection:        (  168) seconds.
Offline data collection
capabilities:            (0x73) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:    (   1) minutes.
Extended self-test routine
recommended polling time:    ( 528) minutes.
Conveyance self-test routine
recommended polling time:    (   2) minutes.
SCT capabilities:          (0x1085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   119   099   006    Pre-fail  Always       -       233492808
  3 Spin_Up_Time            0x0003   092   091   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   099   099   020    Old_age   Always       -       1890
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   044   039   030    Pre-fail  Always       -       678608011490
  9 Power_On_Hours          0x0032   065   065   000    Old_age   Always       -       30836
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   020    Old_age   Always       -       1206
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   061   061   000    Old_age   Always       -       39
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   071   058   045    Old_age   Always       -       29 (Min/Max 27/32)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       304
193 Load_Cycle_Count        0x0032   084   084   000    Old_age   Always       -       32204
194 Temperature_Celsius     0x0022   029   042   000    Old_age   Always       -       29 (0 12 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       16
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       16
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       23293h+16m+41.533s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       19236444339
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       27220280383

SMART Error Log Version: 1
ATA Error Count: 39 (device log contains only the most recent five errors)
    CR = Command Register [HEX]
    FR = Features Register [HEX]
    SC = Sector Count Register [HEX]
    SN = Sector Number Register [HEX]
    CL = Cylinder Low Register [HEX]
    CH = Cylinder High Register [HEX]
    DH = Device/Head Register [HEX]
    DC = Device Command Register [HEX]
    ER = Error register [HEX]
    ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 39 occurred at disk power-on lifetime: 16936 hours (705 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 08 ff ff ff ef 00      04:59:22.764  READ DMA EXT
  25 00 40 ff ff ff ef 00      04:59:22.762  READ DMA EXT
  25 00 00 ff ff ff ef 00      04:59:22.736  READ DMA EXT
  25 00 08 ff ff ff ef 00      04:59:22.735  READ DMA EXT
  ef 10 02 00 00 00 a0 00      04:59:22.735  SET FEATURES [Enable SATA feature]

Error 38 occurred at disk power-on lifetime: 16936 hours (705 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 08 ff ff ff ef 00      04:59:18.709  READ DMA EXT
  25 00 00 ff ff ff ef 00      04:59:18.696  READ DMA EXT
  25 00 00 ff ff ff ef 00      04:59:18.693  READ DMA EXT
  25 00 00 ff ff ff ef 00      04:59:18.631  READ DMA EXT
  25 00 08 ff ff ff ef 00      04:59:18.631  READ DMA EXT

Error 37 occurred at disk power-on lifetime: 16936 hours (705 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 08 ff ff ff ef 00      04:57:53.914  READ DMA EXT
  25 00 08 ff ff ff ef 00      04:57:53.914  READ DMA EXT
  25 00 00 ff ff ff ef 00      04:57:53.882  READ DMA EXT
  ef 10 02 00 00 00 a0 00      04:57:53.881  SET FEATURES [Enable SATA feature]
  27 00 00 00 00 00 e0 00      04:57:53.881  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]

Error 36 occurred at disk power-on lifetime: 16936 hours (705 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 08 ff ff ff ef 00      04:57:49.903  READ DMA EXT
  25 00 08 ff ff ff ef 00      04:57:49.903  READ DMA EXT
  25 00 08 ff ff ff ef 00      04:57:49.903  READ DMA EXT
  25 00 08 ff ff ff ef 00      04:57:49.903  READ DMA EXT
  25 00 08 ff ff ff ef 00      04:57:49.903  READ DMA EXT

Error 35 occurred at disk power-on lifetime: 16936 hours (705 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 00 ff ff ff ef 00      04:57:45.210  READ DMA EXT
  25 00 00 ff ff ff ef 00      04:57:45.181  READ DMA EXT
  25 00 00 ff ff ff ef 00      04:57:45.179  READ DMA EXT
  25 00 00 ff ff ff ef 00      04:57:45.178  READ DMA EXT
  25 00 58 ff ff ff ef 00      04:57:45.149  READ DMA EXT

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       60%     30817         3723785408
# 2  Short offline       Completed without error       00%     30812         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Answer 1

The SMART data isn't bad, but your HDD's firmware was unable to fix some errors.
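
To make that concrete, here is a minimal Python sketch that picks the error-related counters out of the attribute table. The rows are copied from the smartctl -a output in the question; the watch list and the function names are my own assumptions for illustration, not part of smartmontools.

```python
# Attribute rows copied verbatim from the smartctl -a output in the question.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   061   061   000    Old_age   Always       -       39
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       16
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       16
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
"""

# Hypothetical watch list: attribute IDs whose raw value is a plain counter.
WATCHED = {5, 187, 197, 198, 199}


def parse_attributes(text):
    """Return {id: (name, raw_value_string)} for rows of the attribute table."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # An attribute row has at least 10 columns, starts with a numeric ID
        # and carries a hexadecimal FLAG in the third column.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        if not fields[2].startswith("0x"):
            continue
        rows[int(fields[0])] = (fields[1], fields[9])
    return rows


def nonzero_watched(text):
    """Return {name: raw} for watched attributes whose raw value is non-zero."""
    result = {}
    for attr_id, (name, raw) in parse_attributes(text).items():
        if attr_id in WATCHED and int(raw) != 0:
            result[name] = int(raw)
    return result


print(nonzero_watched(SAMPLE))
```

On these rows the sketch reports Reported_Uncorrect, Current_Pending_Sector and Offline_Uncorrectable as non-zero, while Reallocated_Sector_Ct and UDMA_CRC_Error_Count are still 0.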

Given that you now know exactly which sector is bad, you can try to get it reallocated using dd:

https://www.smartmontools.org/wiki/BadBlockHowto
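
The arithmetic behind that suggestion can be sketched in Python. The LBA, the sector sizes and the device path below are taken from the smartctl output in the question; the function names are hypothetical, and the dd options (bs, count, skip, seek, iflag=direct, oflag=direct) are assumed to be the GNU coreutils ones. The script only prints the commands, it does not run them.

```python
# Values taken from the smartctl -a output in the question.
LOGICAL_SECTOR = 512           # "Sector Sizes: 512 bytes logical"
PHYSICAL_SECTOR = 4096         # "... 4096 bytes physical"
FIRST_ERROR_LBA = 3723785408   # LBA_of_first_error in the self-test log
DEVICE = "/dev/sdb"            # whole disk, because the LBA is disk-relative


def locate(lba, logical=LOGICAL_SECTOR, physical=PHYSICAL_SECTOR):
    """Translate a logical LBA into a byte offset and a physical block index."""
    per_physical = physical // logical  # 8 logical sectors per physical one
    return {
        "byte_offset": lba * logical,
        "physical_block": lba // per_physical,
        "offset_in_block": lba % per_physical,
    }


def read_check_command(lba, device=DEVICE):
    """Non-destructive: try to read the physical block that holds the LBA."""
    block = locate(lba)["physical_block"]
    return (f"dd if={device} of=/dev/null bs={PHYSICAL_SECTOR} "
            f"count=1 skip={block} iflag=direct")


def overwrite_command(lba, device=DEVICE):
    """DESTRUCTIVE: zero the block so the firmware can remap or clear it."""
    block = locate(lba)["physical_block"]
    return (f"dd if=/dev/zero of={device} bs={PHYSICAL_SECTOR} "
            f"count=1 seek={block} oflag=direct")


print(locate(FIRST_ERROR_LBA))
print(read_check_command(FIRST_ERROR_LBA))
print("# DESTRUCTIVE:", overwrite_command(FIRST_ERROR_LBA))
```

If the read command fails with an I/O error, the block is still unreadable. Writing zeros over it destroys whatever data was stored there, so do that only on a drive whose contents you can afford to lose. The self-test log lists only the first failing LBA, so the long test may need to be re-run afterwards to find any others.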

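Even so, the drive is not safe to use as it stands, so even if you manage to repair it, consider using it only for non-essential data that you can afford to lose.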
