Is my SSD failing, or is my operating system the problem? Can you help me interpret the smartctl output?

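I sometimes run into the problem that Ubuntu does not shut down properly. Every time this happens, I have to run a manual fsck before I can boot again.

The smartctl short test completes without any errors, but the extended test fails with a "fatal or unknown error". My SSD is about 1.5 years old.

My main question is whether this is a problem with the SSD or possibly with the operating system. In this case, do I have to replace the SSD?

Apart from that, I would still like to know what the errors in the smartctl check mean :)

I have seen similar questions, but none with a final solution. An automatic fsck is nice, but it doesn't solve my problem :)

Thanks a lot :)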

Output of the failed shutdown:

smartctl -x output:

smartctl 6.6 2016-05-31 r4324 [x86_64-linux-5.4.0-51-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     SanDisk SSD PLUS 1000GB
Serial Number:    190532804063
LU WWN Device Id: 5 001b44 8b922495a
Firmware Version: UH5100RL
User Capacity:    1.000.207.286.272 bytes [1,00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2 T13/2015-D revision 3
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon Oct 19 00:55:01 2020 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM level is:     254 (maximum performance)
Rd look-ahead is: Enabled
Write cache is:   Enabled
ATA Security is:  Disabled, frozen [SEC2]
Wt Cache Reorder: Unavailable

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (  57) A fatal error or unknown test error
                    occurred while the device was executing
                    its self-test routine and the device 
                    was unable to complete the self-test 
                    routine.
Total time to complete Offline 
data collection:        (  120) seconds.
Offline data collection
capabilities:            (0x15) SMART execute Offline immediate.
                    No Auto Offline data collection support.
                    Abort Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:    (   2) minutes.
Extended self-test routine
recommended polling time:    ( 182) minutes.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  5 Reallocated_Sector_Ct   -O--CK   100   100   000    -    0
  9 Power_On_Hours          -O--CK   100   100   000    -    5064
 12 Power_Cycle_Count       -O--CK   100   100   000    -    477
165 Unknown_Attribute       -O--CK   100   100   000    -    231
166 Unknown_Attribute       -O--CK   100   100   ---    -    1
167 Unknown_Attribute       -O--CK   100   100   ---    -    0
168 Unknown_Attribute       -O--CK   100   100   ---    -    3
169 Unknown_Attribute       -O--CK   100   100   ---    -    1809
170 Unknown_Attribute       -O--CK   100   100   ---    -    0
171 Unknown_Attribute       -O--CK   100   100   000    -    0
172 Unknown_Attribute       -O--CK   100   100   000    -    0
173 Unknown_Attribute       -O--CK   100   100   000    -    1
174 Unknown_Attribute       -O--CK   100   100   000    -    125
184 End-to-End_Error        -O--CK   100   100   ---    -    0
187 Reported_Uncorrect      -O--CK   100   100   000    -    41
188 Command_Timeout         -O--CK   100   100   ---    -    0
194 Temperature_Celsius     -O---K   065   059   000    -    35 (Min/Max 13/59)
199 UDMA_CRC_Error_Count    -O--CK   100   100   ---    -    0
230 Unknown_SSD_Attribute   -O--CK   100   100   000    -    283469152322
232 Available_Reservd_Space PO--CK   100   100   005    -    100
233 Media_Wearout_Indicator -O--CK   100   100   ---    -    790
234 Unknown_Attribute       -O--CK   100   100   000    -    3614
241 Total_LBAs_Written      ----CK   100   100   000    -    1492
242 Total_LBAs_Read         ----CK   100   100   000    -    636
244 Unknown_Attribute       -O--CK   000   100   ---    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      1  Comprehensive SMART error log
0x03       GPL     R/O     16  Ext. Comprehensive SMART error log
0x04       GPL,SL  R/O      8  Device Statistics log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x10       GPL     R/O      1  SATA NCQ Queued Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x30       GPL,SL  R/O      9  IDENTIFY DEVICE data log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xa1       GPL,SL  VS       1  Device vendor specific log
0xa2       GPL,SL  VS       2  Device vendor specific log
0xa3-0xa4  GPL,SL  VS       1  Device vendor specific log
0xa7       GPL,SL  VS       1  Device vendor specific log
0xa9       GPL,SL  VS       3  Device vendor specific log

SMART Extended Comprehensive Error Log Version: 1 (16 sectors)
Device Error Count: 43
    CR     = Command Register
    FEATR  = Features Register
    COUNT  = Count (was: Sector Count) Register
    LBA_48 = Upper bytes of LBA High/Mid/Low Registers ]  ATA-8
    LH     = LBA High (was: Cylinder High) Register    ]   LBA
    LM     = LBA Mid (was: Cylinder Low) Register      ] Register
    LL     = LBA Low (was: Sector Number) Register     ]
    DV     = Device (was: Device/Head) Register
    DC     = Device Control Register
    ER     = Error register
    ST     = Status register
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 43 [2] occurred at disk power-on lifetime: 5061 hours (210 days + 21 hours)
  When the command that caused the error occurred, the device was doing SMART Offline or Self-test.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 00 01 5d cc 80 a0 00

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  ea 00 00 00 00 00 00 00 00 00 00 a0 00     05:22:04.063  FLUSH CACHE EXT
  ea 00 00 00 00 00 00 00 00 00 00 a0 00     00:00:52.801  FLUSH CACHE EXT
  2f 00 00 00 01 00 00 00 00 08 30 a0 00     00:00:46.940  READ LOG EXT
  ea 00 00 00 00 00 00 00 00 00 00 a0 00     00:00:46.849  FLUSH CACHE EXT
  ea 00 00 00 00 00 00 00 00 00 00 a0 00     00:00:38.770  FLUSH CACHE EXT

Error 42 [1] occurred at disk power-on lifetime: 5055 hours (210 days + 15 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 04 00 00 01 5d cc 84 a0 00

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  ea 00 00 00 00 00 00 00 00 00 00 a0 00     00:00:52.801  FLUSH CACHE EXT
  2f 00 00 00 01 00 00 00 00 08 30 a0 00     00:00:46.940  READ LOG EXT
  ea 00 00 00 00 00 00 00 00 00 00 a0 00     00:00:46.849  FLUSH CACHE EXT
  ea 00 00 00 00 00 00 00 00 00 00 a0 00     00:00:38.770  FLUSH CACHE EXT
  2f 00 00 00 01 00 00 00 00 08 30 a0 00     00:00:58.502  READ LOG EXT

Error 41 [0] occurred at disk power-on lifetime: 5055 hours (210 days + 15 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 00 01 5d cc 80 a0 00  Error: UNC

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  2f 00 00 00 01 00 00 00 00 08 30 a0 00     00:00:46.940  READ LOG EXT
  ea 00 00 00 00 00 00 00 00 00 00 a0 00     00:00:46.849  FLUSH CACHE EXT
  ea 00 00 00 00 00 00 00 00 00 00 a0 00     00:00:38.770  FLUSH CACHE EXT
  2f 00 00 00 01 00 00 00 00 08 30 a0 00     00:00:58.502  READ LOG EXT
  ea 00 00 00 00 00 00 00 00 00 00 a0 00     00:00:58.422  FLUSH CACHE EXT

Warning! SMART Extended Comprehensive Error Log Structure error: invalid SMART checksum.
Error 40 [63] occurred at disk power-on lifetime: 61166 hours (2548 days + 14 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  00 -- 00 00 00 00 00 00 00 00 00 00 00

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  00 00 00 00 00 00 00 00 00 00 00 00 45 16d+13:40:55.765  NOP [Abort queued commands]
  00 00 00 00 00 00 00 00 00 00 00 00 44 13d+06:08:44.612  NOP [Abort queued commands]
  00 00 00 00 00 00 00 00 00 00 00 00 43  9d+22:36:33.459  NOP [Abort queued commands]
  00 00 00 00 00 00 00 00 00 00 00 00 42  6d+15:04:22.306  NOP [Abort queued commands]
  00 00 00 00 00 00 00 00 00 00 00 00 41  3d+07:32:11.153  NOP [Abort queued commands]

Error 39 [62] occurred at disk power-on lifetime: 61166 hours (2548 days + 14 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  00 -- 00 00 00 00 00 00 00 00 00 00 00

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  00 00 00 00 00 00 00 00 00 00 00 00 35 16d+13:40:55.765  NOP [Abort queued commands]
  00 00 00 00 00 00 00 00 00 00 00 00 34 13d+06:08:44.612  NOP [Abort queued commands]
  00 00 00 00 00 00 00 00 00 00 00 00 33  9d+22:36:33.459  NOP [Abort queued commands]
  00 00 00 00 00 00 00 00 00 00 00 00 32  6d+15:04:22.306  NOP [Abort queued commands]
  00 00 00 00 00 00 00 00 00 00 00 00 31  3d+07:32:11.153  NOP [Abort queued commands]

Error 38 [61] occurred at disk power-on lifetime: 61166 hours (2548 days + 14 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  00 -- 00 00 00 00 00 00 00 00 00 00 00

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  00 00 00 00 00 00 00 00 00 00 00 00 25 16d+13:40:55.765  NOP [Abort queued commands]
  00 00 00 00 00 00 00 00 00 00 00 00 24 13d+06:08:44.612  NOP [Abort queued commands]
  00 00 00 00 00 00 00 00 00 00 00 00 23  9d+22:36:33.459  NOP [Abort queued commands]
  00 00 00 00 00 00 00 00 00 00 00 00 22  6d+15:04:22.306  NOP [Abort queued commands]
  00 00 00 00 00 00 00 00 00 00 00 00 21  3d+07:32:11.153  NOP [Abort queued commands]

Error 37 [60] occurred at disk power-on lifetime: 43981 hours (1832 days + 13 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  00 -- 00 00 00 00 00 00 00 00 00 00 00

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  00 00 00 00 00 00 00 00 00 00 00 00 15 16d+13:40:55.765  NOP [Abort queued commands]
  00 00 00 00 00 00 00 00 00 00 00 00 14 13d+06:08:44.612  NOP [Abort queued commands]
  00 00 00 00 00 00 00 00 00 00 00 00 13  9d+22:36:33.459  NOP [Abort queued commands]
  00 00 00 00 00 00 00 00 00 00 00 00 12  6d+15:04:22.306  NOP [Abort queued commands]
  00 00 00 00 00 00 00 00 00 00 00 00 11  3d+07:32:11.153  NOP [Abort queued commands]

Warning! SMART Extended Comprehensive Error Log Structure error: invalid SMART checksum.
Error 36 [59] occurred at disk power-on lifetime: 61166 hours (2548 days + 14 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  00 -- 00 00 00 00 00 00 00 00 00 00 00

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  00 00 00 00 00 00 00 00 00 00 00 00 45 16d+13:40:55.765  NOP [Abort queued commands]
  00 00 00 00 00 00 00 00 00 00 00 00 44 13d+06:08:44.612  NOP [Abort queued commands]
  00 00 00 00 00 00 00 00 00 00 00 00 43  9d+22:36:33.459  NOP [Abort queued commands]
  00 00 00 00 00 00 00 00 00 00 00 00 42  6d+15:04:22.306  NOP [Abort queued commands]
  00 00 00 00 00 00 00 00 00 00 00 00 41  3d+07:32:11.153  NOP [Abort queued commands]

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Fatal or unknown error        90%      5061         0
# 2  Short offline       Completed without error       00%      5060         -
# 3  Extended offline    Fatal or unknown error        90%      4923         0
# 4  Short offline       Completed without error       00%      4859         -
# 5  Short offline       Self-test routine in progress 70%      4859         -

Selective Self-tests/Logging not supported

SCT Commands not supported

Device Statistics (GP Log 0x04)
Page  Offset Size        Value Flags Description
0x01  =====  =               =  ===  == General Statistics (rev 1) ==
0x01  0x008  4             477  ---  Lifetime Power-On Resets
0x01  0x010  4            5064  ---  Power-on Hours
0x01  0x018  6      3129152158  ---  Logical Sectors Written
0x01  0x028  6      1335439411  ---  Logical Sectors Read
0x01  0x038  6            5065  ---  Date and Time TimeStamp
0x05  =====  =               =  ===  == Temperature Statistics (rev 1) ==
0x05  0x008  1              35  ---  Current Temperature
0x05  0x010  1               -  ---  Average Short Term Temperature
0x05  0x018  1               -  ---  Average Long Term Temperature
0x05  0x020  1              58  ---  Highest Temperature
0x05  0x028  1              25  ---  Lowest Temperature
0x05  0x030  1              43  ---  Highest Average Short Term Temperature
0x05  0x038  1              43  ---  Lowest Average Short Term Temperature
0x05  0x040  1               -  ---  Highest Average Long Term Temperature
0x05  0x048  1               -  ---  Lowest Average Long Term Temperature
0x05  0x050  4               0  ---  Time in Over-Temperature
0x05  0x058  1              95  ---  Specified Maximum Operating Temperature
0x05  0x060  4               0  ---  Time in Under-Temperature
0x05  0x068  1               0  ---  Specified Minimum Operating Temperature
0x07  =====  =               =  ===  == Solid State Device Statistics (rev 1) ==
0x07  0x008  1               0  N--  Percentage Used Endurance Indicator
                                |||_ C monitored condition met
                                ||__ D supports DSN
                                |___ N normalized value

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0009  2            3  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2            4  Device-to-host register FISes sent due to a COMRESET
0x000f  2            0  R_ERR response for host-to-device data FIS, CRC
0x0012  2            0  R_ERR response for host-to-device non-data FIS, CRC
0x0001  2            0  Command failed due to ICRC error

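Answer 1

My main question is whether this is a problem with the SSD or possibly with the operating system. In this case, do I have to replace the SSD?

This is a problem with the SSD. The smartctl output points that way: both extended self-tests aborted with "Fatal or unknown error" (at 4923 and 5061 power-on hours), Reported_Uncorrect has a raw value of 41, and the error log records an uncorrectable error (UNC). The best course of action is to back up your valuable data immediately and decide how to replace the SSD.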
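For anyone who wants to pull the relevant lines out of a long `smartctl -x` dump, here is a minimal Python sketch. It is only an illustration, not part of smartmontools: the function names, the regular expressions, and the choice of "watched" attributes are my own assumptions, and the sample text is copied from the output in the question. Since this drive is reported as "Not in smartctl database", the attribute names themselves may not match what the vendor actually counts, so treat the result as a hint rather than a verdict. The LBA is obtained by reading the LBA_48, LH, LM and LL columns of the register line as one hexadecimal number.

```python
import re

# Excerpt copied from the smartctl -x output in the question.
SAMPLE = """\
  5 Reallocated_Sector_Ct   -O--CK   100   100   000    -    0
187 Reported_Uncorrect      -O--CK   100   100   000    -    41
199 UDMA_CRC_Error_Count    -O--CK   100   100   ---    -    0
# 1  Extended offline    Fatal or unknown error        90%      5061         0
# 2  Short offline       Completed without error       00%      5060         -
  40 -- 51 00 00 00 00 01 5d cc 80 a0 00  Error: UNC
"""

# Assumption: these are the attributes worth flagging when their raw value is non-zero.
WATCHED = {"Reallocated_Sector_Ct", "Reported_Uncorrect", "UDMA_CRC_Error_Count"}

# ID, name, flags, VALUE, WORST, THRESH, FAIL, RAW_VALUE
ATTR_RE = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+[-A-Z]{6}\s+\d+\s+\d+\s+(?:\d+|---)\s+\S+\s+(\d+)"
)
# Self-test log row: number, description, status, remaining %, lifetime hours, LBA
TEST_RE = re.compile(
    r"^#\s*(\d+)\s+(Extended offline|Short offline)\s+(.+?)\s+(\d+)%\s+(\d+)\s+(\S+)"
)
# Register row: ER, "--", then ST, COUNT (2 bytes), LBA_48 (3), LH, LM, LL, DV, DC
REG_RE = re.compile(r"^\s*([0-9a-f]{2}) -- ((?:[0-9a-f]{2} ){10}[0-9a-f]{2})")


def decode_error_registers(line):
    """Decode one 'registers were' row of the extended error log."""
    m = REG_RE.match(line)
    if not m:
        return None
    er = int(m.group(1), 16)
    rest = m.group(2).split()
    st = int(rest[0], 16)
    return {
        "unc": bool(er & 0x40),  # ATA error register bit 6: uncorrectable data
        "err": bool(st & 0x01),  # ATA status register bit 0: error
        "count": int("".join(rest[1:3]), 16),
        "lba": int("".join(rest[3:9]), 16),  # LBA_48 + LH + LM + LL, high byte first
    }


def summarize(text):
    """Return a list of human-readable findings from smartctl -x text."""
    findings = []
    for line in text.splitlines():
        a = ATTR_RE.match(line)
        if a:
            name, raw = a.group(2), int(a.group(3))
            if name in WATCHED and raw > 0:
                findings.append(f"{name} raw value is {raw}")
            continue
        t = TEST_RE.match(line)
        if t:
            status = t.group(3).strip()
            if status != "Completed without error" and "in progress" not in status:
                findings.append(
                    f"self-test #{t.group(1)} ({t.group(2)}) ended with: "
                    f"{status} at {t.group(5)} power-on hours"
                )
            continue
        r = decode_error_registers(line)
        if r and r["unc"]:
            findings.append(f"uncorrectable data error (UNC) at LBA {r['lba']}")
    return findings


if __name__ == "__main__":
    for finding in summarize(SAMPLE):
        print(finding)
```

To run it against a real drive, you could save the output with `sudo smartctl -x /dev/sdX > smart.txt` (replace `/dev/sdX` with your device) and pass the file contents to `summarize`. An empty result does not prove the drive is healthy; it only means none of these three checks fired.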