SMART test results show "Aborted by host"

I ran an extended (long) SMART self-test on a local disk on a Linux system, using the following command:

sudo smartctl -t long -d sat /dev/sdc

Once I was sure the test had finished, I requested the results report with the following command:

sudo smartctl -a -d sat /dev/sdc

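The full report is shown below.

I need help interpreting the results, in particular the repeated instances of the phrase "Aborted by host" that appear near the end of the report. More generally, what important conclusions can I draw from the report?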

smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.13.0-40-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     HGST HUS726T4TALA6L4
Serial Number:    V1G9RS3C
LU WWN Device Id: 5 000cca 0bcc46cb9
Firmware Version: VLGNW460
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon May 16 13:13:42 2022 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART Status not supported: Incomplete response, ATA output registers missing
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                    was completed without error.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      (  25) The self-test routine was aborted by
                    the host.
Total time to complete Offline 
data collection:        (   87) seconds.
Offline data collection
capabilities:            (0x5b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:    (   2) minutes.
Extended self-test routine
recommended polling time:    ( 503) minutes.
SCT capabilities:          (0x003d) SCT Status supported.
                    SCT Error Recovery Control supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   130   130   054    Pre-fail  Offline      -       100
  3 Spin_Up_Time            0x0007   133   133   024    Pre-fail  Always       -       317 (Average 317)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       41
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       213
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   128   128   020    Pre-fail  Offline      -       18
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       550
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       6
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       51
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       51
194 Temperature_Celsius     0x0002   253   253   000    Old_age   Always       -       21 (Min/Max 17/35)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       213
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       108
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
ATA Error Count: 256 (device log contains only the most recent five errors)
    CR = Command Register [HEX]
    FR = Features Register [HEX]
    SC = Sector Count Register [HEX]
    SN = Sector Number Register [HEX]
    CL = Cylinder Low Register [HEX]
    CH = Cylinder High Register [HEX]
    DH = Device/Head Register [HEX]
    DC = Device Command Register [HEX]
    ER = Error register [HEX]
    ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 256 occurred at disk power-on lifetime: 195 hours (8 days + 3 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 41 00 00 00 00 00  Error: UNC at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 30 90 ce 42 40 08   6d+07:12:00.580  READ FPDMA QUEUED
  60 08 28 80 cf 42 40 08   6d+07:11:58.132  READ FPDMA QUEUED
  60 08 20 78 cf 42 40 08   6d+07:11:58.131  READ FPDMA QUEUED
  60 08 18 70 cf 42 40 08   6d+07:11:58.131  READ FPDMA QUEUED
  60 08 10 68 cf 42 40 08   6d+07:11:58.131  READ FPDMA QUEUED

Error 255 occurred at disk power-on lifetime: 195 hours (8 days + 3 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 41 00 00 00 00 00  Error: UNC at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 40 88 90 ce 42 40 08   6d+07:11:56.959  READ FPDMA QUEUED
  60 30 70 30 5f 43 40 08   6d+07:11:56.959  READ FPDMA QUEUED
  60 e0 40 50 5c 43 40 08   6d+07:11:54.495  READ FPDMA QUEUED
  60 f8 38 58 59 43 40 08   6d+07:11:54.491  READ FPDMA QUEUED
  60 40 30 18 54 43 40 08   6d+07:11:54.491  READ FPDMA QUEUED

Error 254 occurred at disk power-on lifetime: 195 hours (8 days + 3 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 41 00 00 00 00 00  Error: UNC at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 40 c8 30 8b 38 40 08   6d+07:11:45.813  READ FPDMA QUEUED
  60 40 d8 b0 95 38 40 08   6d+07:11:43.319  READ FPDMA QUEUED
  60 40 d0 70 90 38 40 08   6d+07:11:43.319  READ FPDMA QUEUED
  60 40 c0 f0 85 38 40 08   6d+07:11:43.319  READ FPDMA QUEUED
  60 40 b8 b0 80 38 40 08   6d+07:11:43.319  READ FPDMA QUEUED

Error 253 occurred at disk power-on lifetime: 168 hours (7 days + 0 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 41 00 00 00 00 00  Error: UNC at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 60 e0 4c 44 40 08   5d+04:30:26.393  READ FPDMA QUEUED
  61 10 d0 28 55 22 40 08   5d+04:30:23.408  WRITE FPDMA QUEUED
  60 08 c8 e8 5d b1 40 08   5d+04:30:23.408  READ FPDMA QUEUED
  60 28 c0 78 c6 44 40 08   5d+04:30:23.408  READ FPDMA QUEUED
  60 20 b8 58 c1 44 40 08   5d+04:30:23.408  READ FPDMA QUEUED

Error 252 occurred at disk power-on lifetime: 168 hours (7 days + 0 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 41 00 00 00 00 00  Error: UNC at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 40 58 a8 47 44 40 08   5d+04:30:21.872  READ FPDMA QUEUED
  61 08 f0 d0 4d 43 40 08   5d+04:30:17.710  WRITE FPDMA QUEUED
  61 08 e8 a0 27 43 40 08   5d+04:30:17.667  WRITE FPDMA QUEUED
  60 08 e0 e0 5f 90 40 08   5d+04:30:17.667  READ FPDMA QUEUED
  60 08 d8 e0 df 90 40 08   5d+04:30:17.667  READ FPDMA QUEUED

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Aborted by host               90%       526         -
# 2  Extended offline    Aborted by host               90%       442         -
# 3  Extended offline    Aborted by host               90%       284         -
# 4  Extended offline    Aborted by host               90%       254         -
# 5  Extended offline    Aborted by host               90%       254         -
# 6  Extended offline    Aborted by host               90%       245         -
# 7  Extended offline    Aborted by host               90%       243         -
# 8  Extended offline    Aborted by host               90%       241         -
# 9  Extended offline    Aborted by host               90%       241         -
#10  Short offline       Completed without error       00%        52         -
#11  Short offline       Completed without error       00%        24         -
#12  Short offline       Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

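Answer 1

The following attributes indicate that your drive is nearing the end of its life, and you should replace it. You currently have 108 sectors that cannot be read (possibly more), and 213 sectors have already been replaced with spare sectors.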

Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       213
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       213
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       108
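
As a rough sketch of how these three counters could be pulled out of a report automatically, the shell function below filters smartctl attribute output with awk. The function name smart_sector_counts is hypothetical, and it assumes the ten-column attribute table layout shown in the report above, where the raw value is the tenth field.

```shell
#!/bin/sh
# Hypothetical helper: read smartctl attribute output on stdin and print
# the raw values of attributes 5, 196 and 197 as "ID NAME RAW".
# Assumption: the 10-column table layout shown in the report above.
smart_sector_counts() {
    # NF >= 10 skips unrelated lines that merely start with the same number,
    # such as span 5 in the selective self-test log.
    awk 'NF >= 10 && ($1 == 5 || $1 == 196 || $1 == 197) { print $1, $2, $10 }'
}

# Example (not run here):
#   sudo smartctl -A -d sat /dev/sdc | smart_sector_counts
```

Any non-zero raw value for attribute 5 or 197 is worth watching; in this report both are far above zero.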

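If you do not have a copy of the drive, you should consider copying it with a fault-tolerant tool such as ddrescue on Linux. Live Linux distributions such as GParted, System Rescue CD, and Knoppix ship with ddrescue built in.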
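
A minimal sketch of what such a copy might look like with GNU ddrescue follows. The function name rescue_disk, the DRY_RUN switch, and the example paths are assumptions made for illustration; the image and map file should live on a different, healthy disk with enough free space.

```shell
#!/bin/sh
# Hypothetical wrapper around GNU ddrescue; set DRY_RUN=1 to only print the commands.
# $1 = failing source device, $2 = image file on a healthy disk, $3 = map file.
rescue_disk() {
    # Pass 1: copy everything that reads cleanly. -d uses direct disc access
    # for the source, -n skips the slow scraping phase for bad areas.
    ${DRY_RUN:+echo} ddrescue -d -n "$1" "$2" "$3" || return 1
    # Pass 2: return to the bad areas and retry them up to 3 times.
    # The map file lets ddrescue continue where pass 1 stopped.
    ${DRY_RUN:+echo} ddrescue -d -r3 "$1" "$2" "$3"
}

# Example (not run here, needs root; paths are placeholders):
#   rescue_disk /dev/sdc /mnt/backup/sdc.img /mnt/backup/sdc.map
```

Splitting the copy into two passes gets the easily readable data off the drive first, before the retries put extra load on the damaged areas.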

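Someone or something aborted your extended offline tests. You have a very new hard drive (only 550 power-on hours) that is showing serious errors, and I wonder why. Did you drop it? Did you power it off while it was still writing?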
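
For what it is worth, the value 25 on the "Self-test execution status" line of the report encodes the same outcome as the self-test log entries. In the ATA SMART data, the upper four bits of that byte hold the outcome code and the lower four bits hold the remaining work in tenths. The sketch below decodes it; the function name decode_selftest_status is hypothetical, and only the outcome codes relevant here are spelled out.

```shell
#!/bin/sh
# Hypothetical helper: decode the SMART self-test execution status byte.
# $1 = decimal value as printed by smartctl, e.g. 25.
decode_selftest_status() {
    status=$(( $1 / 16 ))            # upper nibble: outcome code
    remaining=$(( ($1 % 16) * 10 ))  # lower nibble: tenths of the test remaining
    case "$status" in
        0)  text="completed without error or never run" ;;
        1)  text="aborted by host" ;;
        15) text="in progress" ;;
        *)  text="other outcome code $status" ;;
    esac
    echo "$text, ${remaining}% remaining"
}

decode_selftest_status 25   # prints: aborted by host, 90% remaining
```

This matches the log, where every extended test stopped with 90% remaining.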

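Running the offline test may let the firmware discover some additional pending (unreadable) sectors, but you really should have started with a full report of the SMART attributes. That report would probably already have shown that the drive is about to fail, prompting you to copy the drive instead of running multiple instances of the extended offline test.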

Creating the copy as early as possible reduces the stress on an already damaged drive.

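Answer 2

I had a similar problem. I rebooted, closed all programs, and disconnected from the internet. After that I no longer got any "Aborted by host" messages. If the errors continue, you can use the fsck utility to check for errors and even repair some of them; use df -h to help identify the drive.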

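  1. Make sure the drive is not mounted (sudo umount /dev/sdX)
  2. A dry run is a good idea (sudo fsck -N /dev/sdX)
  3. Check for errors (sudo fsck /dev/sdX)
  4. Reboot, unmount, and run it again with automatic repair (sudo fsck -y /dev/sdX), or only check for errors without repairing (sudo fsck -n /dev/sdX)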
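
The first two steps above can be sketched as a small guard that refuses to check a file system that is still mounted. The function names is_mounted and dry_run_fsck are hypothetical, and the check assumes the device appears in /proc/mounts under the same path you pass in.

```shell
#!/bin/sh
# Hypothetical helper: succeed if device $1 appears in the mount table $2
# (defaults to /proc/mounts on Linux).
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Hypothetical helper: only show what fsck would do (-N), and only if unmounted.
dry_run_fsck() {
    if is_mounted "$1"; then
        echo "$1 is mounted; unmount it first" >&2
        return 1
    fi
    sudo fsck -N "$1"
}

# Example (not run here; /dev/sdX is a placeholder):
#   dry_run_fsck /dev/sdX
```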

Please let us know if this helps.
