Hard disk problem: interpreting smartctl output

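I need help interpreting the output of smartctl. I ran a smartctl check because, after a power outage this week, my Ubuntu system first showed some errors in fsck during startup and then gave a blue screen. The smartctl output is: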

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (  139) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (  99) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x1081) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED   WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate     0x000f   114   099   006    Pre-fail  Always       -       59491560
3 Spin_Up_Time            0x0003   099   099   000    Pre-fail  Always       -       0
4 Start_Stop_Count        0x0032   099   099   020    Old_age   Always       -       1921
5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
7 Seek_Error_Rate         0x000f   067   060   030    Pre-fail  Always       -       6246697
9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       2405
10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
12 Power_Cycle_Count       0x0032   099   099   020    Old_age   Always       -       1319
184 End-to-End_Error        0x0032   099   099   099    Old_age   Always   FAILING_NOW 1
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   099   000    Old_age   Always       -       4
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   060   054   045    Old_age   Always       -       40 (Min/Max 26/46)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       128
193 Load_Cycle_Count        0x0032   098   098   000    Old_age   Always       -       4619
194 Temperature_Celsius     0x0022   040   046   000    Old_age   Always       -       40 (0 16 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
254 Free_Fall_Sensor        0x0032   100   100   000    Old_age   Always       -       0

SMART Error Log Version: 1
ATA Error Count: 1
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 1 occurred at disk power-on lifetime: 2383 hours (99 days + 7 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 d8 fd 07 00  Error: UNC at LBA = 0x0007fdd8 = 523736

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
-- -- -- -- -- -- -- --  ----------------  --------------------
60 00 00 88 fe 07 40 00      00:00:18.576  READ FPDMA QUEUED
60 00 00 88 fd 07 40 00      00:00:18.575  READ FPDMA QUEUED
60 00 00 88 fc 07 40 00      00:00:18.574  READ FPDMA QUEUED
60 00 00 88 fb 07 40 00      00:00:18.571  READ FPDMA QUEUED
60 00 00 88 fa 07 40 00      00:00:18.569  READ FPDMA QUEUED

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      2405         -
# 2  Extended offline    Aborted by host               90%      2403         -
# 3  Extended offline    Completed without error       00%      2403         -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
 1        0        0  Not_testing
 2        0        0  Not_testing
 3        0        0  Not_testing
 4        0        0  Not_testing
 5        0        0  Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Answer 1

These are the parts that would concern me:

SMART overall-health self-assessment test result: PASSED  

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED   WHEN_FAILED RAW_VALUE  
1 Raw_Read_Error_Rate     0x000f   114   099   006    Pre-fail  Always       -       59491560  

That looks like a large value.

4 Start_Stop_Count        0x0032   099   099   020    Old_age   Always       -       1921  

Lots of power-ups and power-downs?

7 Seek_Error_Rate         0x000f   067   060   030    Pre-fail  Always       -       6246697  

Again, that looks large.

9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       2405  

Keep 2405 in mind when looking at the ATA error below.

184 End-to-End_Error        0x0032   099   099   099    Old_age   Always   FAILING_NOW 1  

That is scary, isn't it?
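
As a rough cross-check on the rows quoted above: RAW_VALUE encodings are vendor-specific, so a large raw number is hard to judge on its own, while smartctl reports FAILING_NOW when the normalized VALUE is at or below THRESH. The sketch below applies that comparison to three rows copied from the question. The function name `at_or_below_threshold` is made up for this illustration, and treating a THRESH of 000 as "no threshold" is an assumption of the sketch.

```python
# Attribute rows copied from the smartctl output in the question.
ROWS = """\
1 Raw_Read_Error_Rate     0x000f   114   099   006    Pre-fail  Always       -       59491560
7 Seek_Error_Rate         0x000f   067   060   030    Pre-fail  Always       -       6246697
184 End-to-End_Error        0x0032   099   099   099    Old_age   Always   FAILING_NOW 1
"""


def at_or_below_threshold(rows):
    """Return the names of attributes whose normalized VALUE is <= THRESH."""
    flagged = []
    for line in rows.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue  # not a complete attribute row
        name = fields[1]
        value = int(fields[3])   # VALUE column
        thresh = int(fields[5])  # THRESH column
        # Assumption of this sketch: THRESH 000 is treated as "no threshold".
        if thresh > 0 and value <= thresh:
            flagged.append(name)
    return flagged


print(at_or_below_threshold(ROWS))
```

With these rows, only End-to-End_Error is flagged (099 against a threshold of 099); the two attributes with large raw values are still above their thresholds.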

SMART Error Log Version: 1  
ATA Error Count: 1  

Error 1 occurred at disk power-on lifetime: 2383 hours (99 days + 7 hours)  

2405 - 2383 = 22 hours have passed since ATA Error 1.
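
The same subtraction can be written as a small sketch that pulls the hour count out of the error log line. The helper name `hours_since_error` is hypothetical; the two inputs are taken from the output in the question.

```python
import re

# Line from the SMART error log, and the raw value of attribute 9.
ERROR_LINE = "Error 1 occurred at disk power-on lifetime: 2383 hours (99 days + 7 hours)"
POWER_ON_HOURS = 2405


def hours_since_error(error_line, power_on_hours):
    """Powered-on hours elapsed between the logged error and now."""
    match = re.search(r"power-on lifetime: (\d+) hours", error_line)
    if match is None:
        raise ValueError("no power-on lifetime found in line")
    return power_on_hours - int(match.group(1))


print(hours_since_error(ERROR_LINE, POWER_ON_HOURS))  # 22
print(divmod(2383, 24))  # (99, 7), matching "99 days + 7 hours"
```

These are powered-on hours, not wall-clock hours, so the error could be older than 22 hours on the calendar.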

Error: UNC at LBA = 0x0007fdd8 = 523736  

It tried to read disk block 523736 and got a UNC (uncorrectable) error.
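
For reference, the block number can be recomputed from the register bytes in the error log (SN = d8, CL = fd, CH = 07, DH = 00). This is a minimal sketch that assumes the 28-bit LBA layout, where the low nibble of DH carries bits 24 to 27; the function name `lba28_from_registers` and the 512-byte sector size are assumptions, not something the output states.

```python
def lba28_from_registers(sn, cl, ch, dh):
    """Combine ATA register bytes into a 28-bit LBA (assumed layout)."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Register values from the "After command completion" line of Error 1.
lba = lba28_from_registers(sn=0xD8, cl=0xFD, ch=0x07, dh=0x00)
print(lba, hex(lba))  # 523736 0x7fdd8

# Assumption: 512-byte logical sectors. The offset is counted from the
# start of the whole disk, not from the start of a partition.
SECTOR_SIZE = 512
print(lba * SECTOR_SIZE)
```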

Suggestion (YMMV, no responsibility accepted, ...): boot from a Live USB/CD and run fsck again.

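Start planning to replace the disk, and make sure you have a current backup that you can restore from. Pending the fsck results, I would be worried.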
