修复坏块

修复坏块

得到后

WARNING: Your hard drive is failing
Device: /dev/sdb [SAT], 1 Offline uncorrectable sectors

我跑

$ sudo smartctl -a /dev/sdb
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.10.0-514.26.2.el7.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     KingDian S200 60GB
Serial Number:    2017022100551
LU WWN Device Id: 0 000000 000000000
Firmware Version: P0707F1
User Capacity:    60,022,480,896 bytes [60.0 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2 T13/2015-D revision 3
SATA Version is:  SATA >3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Tue Oct  3 10:56:08 2017 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:        (  120) seconds.
Offline data collection
capabilities:            (0x11) SMART execute Offline immediate.
                    No Auto Offline data collection support.
                    Suspend Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    No Selective Self-test supported.
SMART capabilities:            (0x0002) Does not save SMART data before
                    entering power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:    (   2) minutes.
Extended self-test routine
recommended polling time:    (  10) minutes.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0032   100   100   050    Old_age   Always       -       0
  5 Reallocated_Sector_Ct   0x0032   100   100   050    Old_age   Always       -       3
  9 Power_On_Hours          0x0032   100   100   050    Old_age   Always       -       4486
 12 Power_Cycle_Count       0x0032   100   100   050    Old_age   Always       -       13
160 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       1
161 Unknown_Attribute       0x0033   100   100   050    Pre-fail  Always       -       98
163 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       0
164 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       9724
165 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       9
166 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       1
167 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       5
168 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       1500
169 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       100
175 Program_Fail_Count_Chip 0x0032   100   100   050    Old_age   Always       -       0
176 Erase_Fail_Count_Chip   0x0032   100   100   050    Old_age   Always       -       0
177 Wear_Leveling_Count     0x0032   100   100   050    Old_age   Always       -       9602
178 Used_Rsvd_Blk_Cnt_Chip  0x0032   100   100   050    Old_age   Always       -       3
181 Program_Fail_Cnt_Total  0x0032   100   100   050    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   050    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   050    Old_age   Always       -       13
194 Temperature_Celsius     0x0022   100   100   050    Old_age   Always       -       28
195 Hardware_ECC_Recovered  0x0032   100   100   050    Old_age   Always       -       3994818
196 Reallocated_Event_Count 0x0032   100   100   050    Old_age   Always       -       2414
197 Current_Pending_Sector  0x0032   100   100   050    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0032   100   100   050    Old_age   Always       -       1
199 UDMA_CRC_Error_Count    0x0032   100   100   050    Old_age   Always       -       0
232 Available_Reservd_Space 0x0032   100   100   050    Old_age   Always       -       98
241 Total_LBAs_Written      0x0030   100   100   050    Old_age   Offline      -       36124
242 Total_LBAs_Read         0x0030   100   100   050    Old_age   Offline      -       10259
245 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       9799

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      4486         -

Selective Self-tests/Logging not supported

详细smartctl输出显示:

$ sudo smartctl -x /dev/sdb
smartctl 6.2 2017-02-27 r4394 [x86_64-linux-3.10.0-514.26.2.el7.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     KingDian S200 60GB
Serial Number:    2017022100551
LU WWN Device Id: 0 000000 000000000
Firmware Version: P0707F1
User Capacity:    60,022,480,896 bytes [60.0 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2 T13/2015-D revision 3
SATA Version is:  SATA >3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Tue Oct  3 15:49:27 2017 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM level is:     128 (minimum power consumption without standby)
Rd look-ahead is: Enabled
Write cache is:   Enabled
ATA Security is:  Disabled, frozen [SEC2]
Wt Cache Reorder: Unavailable

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:        (  120) seconds.
Offline data collection
capabilities:            (0x11) SMART execute Offline immediate.
                    No Auto Offline data collection support.
                    Suspend Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    No Selective Self-test supported.
SMART capabilities:            (0x0002) Does not save SMART data before
                    entering power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:    (   2) minutes.
Extended self-test routine
recommended polling time:    (  10) minutes.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     -O--CK   100   100   050    -    0
  5 Reallocated_Sector_Ct   -O--CK   100   100   050    -    3
  9 Power_On_Hours          -O--CK   100   100   050    -    4491
 12 Power_Cycle_Count       -O--CK   100   100   050    -    13
160 Unknown_Attribute       -O--CK   100   100   050    -    1
161 Unknown_Attribute       PO--CK   100   100   050    -    98
163 Unknown_Attribute       -O--CK   100   100   050    -    0
164 Unknown_Attribute       -O--CK   100   100   050    -    10068
165 Unknown_Attribute       -O--CK   100   100   050    -    9
166 Unknown_Attribute       -O--CK   100   100   050    -    1
167 Unknown_Attribute       -O--CK   100   100   050    -    5
168 Unknown_Attribute       -O--CK   100   100   050    -    1500
169 Unknown_Attribute       -O--CK   100   100   050    -    100
175 Program_Fail_Count_Chip -O--CK   100   100   050    -    0
176 Erase_Fail_Count_Chip   -O--CK   100   100   050    -    0
177 Wear_Leveling_Count     -O--CK   100   100   050    -    9687
178 Used_Rsvd_Blk_Cnt_Chip  -O--CK   100   100   050    -    3
181 Program_Fail_Cnt_Total  -O--CK   100   100   050    -    0
182 Erase_Fail_Count_Total  -O--CK   100   100   050    -    0
192 Power-Off_Retract_Count -O--CK   100   100   050    -    13
194 Temperature_Celsius     -O---K   100   100   050    -    28
195 Hardware_ECC_Recovered  -O--CK   100   100   050    -    4314392
196 Reallocated_Event_Count -O--CK   100   100   050    -    2667
197 Current_Pending_Sector  -O--CK   100   100   050    -    3
198 Offline_Uncorrectable   -O--CK   100   100   050    -    1
199 UDMA_CRC_Error_Count    -O--CK   100   100   050    -    0
232 Available_Reservd_Space -O--CK   100   100   050    -    98
241 Total_LBAs_Written      ----CK   100   100   050    -    36474
242 Total_LBAs_Read         ----CK   100   100   050    -    10529
245 Unknown_Attribute       -O--CK   100   100   050    -    10146
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      1  Comprehensive SMART error log
0x03       GPL     R/O      1  Ext. Comprehensive SMART error log
0x04       GPL,SL  R/O      8  Device Statistics log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters
0x30       GPL,SL  R/O      9  IDENTIFY DEVICE data log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xde       GPL     VS       8  Device vendor specific log

SMART Extended Comprehensive Error Log Version: 1 (1 sectors)
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      4488         -
# 2  Extended offline    Completed without error       00%      4487         -
# 3  Extended offline    Completed without error       00%      4486         -

Selective Self-tests/Logging not supported

SCT Commands not supported

Device Statistics (GP Log 0x04)
Page Offset Size         Value  Description
  1  =====  =                =  == General Statistics (rev 1) ==
  1  0x008  4               13  Lifetime Power-On Resets
  1  0x010  4             4491  Power-on Hours
  1  0x018  6       2390408669  Logical Sectors Written
  1  0x020  6         69617191  Number of Write Commands
  1  0x028  6        690041929  Logical Sectors Read
  1  0x030  6          6959725  Number of Read Commands
  7  =====  =                =  == Solid State Device Statistics (rev 1) ==
  7  0x008  1                0  Percentage Used Endurance Indicator

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  4            0  Command failed due to ICRC error
0x0002  4            0  R_ERR response for data FIS
0x0005  4            1  R_ERR response for non-data FIS
0x000a  4           17  Device-to-host register FISes sent due to a COMRESET

答案1

我过去也遇到过这个问题。 IIRC,“脱机不可纠正扇区”是指磁盘控制器(磁盘内部的控制器,而不是 PC 中的 SATA/SCSI 控制器)对某个扇区重复读取失败,并确定该扇区绝对无法使用。

那么,我必须声明该扇区对于使用它的文件系统来说是坏扇区吗?

不会。幸运的是,今天的磁盘会自动用备用扇区池中的好扇区替换坏扇区。因此,您不必向文件系统声明这些坏扇区,以便文件系统不再使用它们。当然,该池的大小是有限的(Available_Reservd_Space sectors我猜),一旦使用了所有备用扇区,坏扇区将仍然无法使用,您必须向您的文件系统声明它们。

那么,一切都很好,这是一条无害的消息?

并不真地。您的驱动器已多次尝试读取坏扇区,但每次都失败;因此它已排队等待更换,但驱动器无法自行完成此操作(它一直希望最终能够读取它)。在该扇区被新数据覆盖之前,它将保持“不可纠正”状态;一旦它被覆盖,或者如果驱动器以某种方式设法读取它,它将被重新映射并替换为备用扇区(在输出中smartctlOffline_Uncorrectable将减 1 并Reallocated_Sector_Ct增加 1)。

我能做些什么?

在这种情况下,我通常会强制 RAID 1 阵列重新同步(正常磁盘 -> 有故障的磁盘),以便新扇区具有正确的内容。无论如何fsck,如果您有该分区的备份(您应该),请将备份与实际内容进行比较。

答案2

运行一个长时间的smartctl测试,如果返回阳性错误,则说明有问题,如果没有,则使用硬盘应该没问题。

smartctl -t long /dev/sdb

注意:请记住在测试之前备份硬盘中的数据,根据驱动器的状况,测试带来的压力可能会进一步损坏您的磁盘。

使用磁盘扫描uncorrectable sectors如果您的测试有错误,请尝试修复smartctl

相关内容