Ubuntu Disks reports that the disk is likely to fail soon, but Hard Disk Sentinel shows 85% health and only 8 bad sectors: what is going on?

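I get the cryptic message "Disk is likely to fail soon" from the SMART Data & Self-Tests window in Ubuntu Disks. However, when I check the self-test results in that window, I see that only 8 sectors have been reallocated (8*512 bytes on a 120 GB SSD), and no values are listed in red.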
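To put that figure in proportion, here is a minimal sketch of the arithmetic. It assumes the raw count of 8 really is in 512-byte logical sectors (an SSD may count NAND blocks instead) and uses the capacity of 120,034,123,776 bytes that smartctl reports further down; the variable names are only illustrative.

```shell
#!/bin/sh
# Assumption: the raw count of 8 is in 512-byte logical sectors.
sectors=8
sector_size=512
capacity=120034123776   # "User Capacity" in bytes, from the smartctl output

bytes=$((sectors * sector_size))
echo "$bytes bytes reallocated"   # prints "4096 bytes reallocated"

# Share of the whole drive as a percentage (awk handles the floating point).
awk -v b="$bytes" -v c="$capacity" \
    'BEGIN { printf "%.7f%% of capacity\n", b / c * 100 }'
```

Under those assumptions the reallocated area is a vanishingly small share of the drive, which is why the warning surprised me.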

[Screenshots: Ubuntu Disks SSD SMART data after the extended self-test, Part I and Part II]

Also, when I run Hard Disk Sentinel (HDSentinel) on Ubuntu, it reports a reasonable disk health of 85%:

HDD Device  0: /dev/sda             
HDD Model ID : Hypertec SSD
HDD Serial No: HY22021100011
HDD Revision : U0202A0
HDD Size     : 114473 MB
Interface    : S-ATA Gen3, 6 Gbps
Temperature  : 40 °C
Highest Temp.: 40 °C
Health       : 85 %
Performance  : 100 %
Power on time: 95 days, 18 hours
Est. lifetime: more than 1000 days
Total written: 1.13 TB
  There are 8 bad sectors on the disk surface. The contents of these sectors were moved to the spare area.
  At this point, warranty replacement of the disk is not yet possible, only if the health drops further.
  It is recommended to examine the log of the disk regularly. All new problems found will be logged there.
    No actions needed.

After installing smartmontools, I ran the following (taken from a reference on checking SSDs):

  root@stephen-All-Series:~# sudo smartctl -a -d ata /dev/sda

I then got the following output:

root@stephen-All-Series:~# sudo smartctl -a -d ata /dev/sda

smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.19.0-46-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     Hypertec SSD
Serial Number:    HY22021100011
Firmware Version: U0202A0
User Capacity:    120,034,123,776 bytes [120 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
TRIM Command:     Available
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-3 T13/2161-D revision 4
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon Jul 17 11:56:01 2023 IDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
Drive failure expected in less than 24 hours. SAVE ALL DATA.
No failed Attributes found.

General SMART Values:
Offline data collection status:  (0x02) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (  120) seconds.
Offline data collection
capabilities:                    (0x11) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0002) Does not save SMART data before
                                        entering power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  10) minutes.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0032   100   100   050    Old_age   Always       -       0
  5 Reallocated_Sector_Ct   0x0032   100   100   050    Old_age   Always       -       8
  9 Power_On_Hours          0x0032   100   100   050    Old_age   Always       -       2298
 12 Power_Cycle_Count       0x0032   100   100   050    Old_age   Always       -       134
160 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       4
161 Unknown_Attribute       0x0033   100   100   050    Pre-fail  Always       -       82
163 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       13
164 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       12768
165 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       70
166 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       1
167 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       20
168 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       5050
169 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       100
175 Program_Fail_Count_Chip 0x0032   100   100   050    Old_age   Always       -       0
176 Erase_Fail_Count_Chip   0x0032   100   100   050    Old_age   Always       -       0
177 Wear_Leveling_Count     0x0032   100   100   050    Old_age   Always       -       0
178 Used_Rsvd_Blk_Cnt_Chip  0x0032   100   100   050    Old_age   Always       -       8
181 Program_Fail_Cnt_Total  0x0032   100   100   050    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   050    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   050    Old_age   Always       -       6
194 Temperature_Celsius     0x0022   100   100   050    Old_age   Always       -       40
195 Hardware_ECC_Recovered  0x0032   100   100   050    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   100   100   050    Old_age   Always       -       4
197 Current_Pending_Sector  0x0032   100   100   050    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0032   100   100   050    Old_age   Always       -       4
199 UDMA_CRC_Error_Count    0x0032   100   100   050    Old_age   Always       -       0
232 Available_Reservd_Space 0x0032   100   100   050    Old_age   Always       -       82
241 Total_LBAs_Written      0x0030   100   100   050    Old_age   Offline      -       37018
242 Total_LBAs_Read         0x0030   100   100   050    Old_age   Offline      -       54166
245 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       24414

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

Selective Self-tests/Logging not supported

The following line appears to be flagged:

161 Unknown_Attribute       0x0033   100   100   050    Pre-fail  Always 

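What is this Unknown_Attribute 161 that is being flagged? I do not see attribute 161 listed on the Hypertec website, but another document describes an attribute 161 in detail on page 31. Is that the same attribute, or a different one?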

Attribute 161 - Valid Spare Block Count
Contains the remaining spare block percentage
available on a solid state device. The percentage
starts at 100% and will typically decrease to 0% during
use. If this attribute reaches 0%, the solid state
device becomes read-only. The raw value of this
attribute may contain the actual number of spare
blocks.

The SSD is a Hypertec SSD2S120FS-L. The drive is very new and, I am fairly sure, still under warranty. (Ubuntu Disks shows that the disk is only 3 months and 4 days old.)

Does the flagging of attribute 161 mean that the disk can now be returned to the supplier for warranty support? Or what exactly does it mean?

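My question is slightly different from the other questions and answers I have seen. I need specifics about the error being reported, so that the SSD can be returned (and replaced) if the problem is serious enough to qualify for warranty support.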

I have not found anything in web searches that addresses what I am specifically looking for.

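Question update

I looked at the GNOME website for a better description of the tests being run. It seemed to me that the gnome-disks software might run some additional tests, beyond the statistics that the SSD or hard drive reports through its own firmware.

That does not appear to be the case, though, because smartmontools reaches the same result. How can that be? Which tests are run with the extended option? Do the tests come only from the firmware on the hard drive or SSD?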

SMART TOOLS reports:

smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.19.0-46-generic] (local build)

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!

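Ubuntu gnome-disks, in turn, gives the same answer:

[Screenshot: gnome-disks reporting that the SMART overall-health self-assessment test failed]

I tried to format the drive using Ubuntu Disks, but the drive did not respond. After I opened the "Format Disk" menu and chose "Format Disk", the disk still had its original partitions:

[Screenshot: location of the format menu in the Ubuntu Disks application]

As root, I then issued the following command to erase disk sda manually: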

root@stephen-All-Series:~# dd if=/dev/zero of=/dev/sda bs=64K count=100000 status=progress
6088359936 bytes (6.1 GB, 5.7 GiB) copied, 10 s, 609 MB/s
100000+0 records in
100000+0 records out
6553600000 bytes (6.6 GB, 6.1 GiB) copied, 13.4162 s, 488 MB/s
root@stephen-All-Series:~# 

However, this had no effect on the drive's partitions: they are exactly the same as before the attempt to erase the drive with the dd command above.
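One way I could double-check whether those zeros actually landed on the device is to read the start of the disk back and compare it with /dev/zero. The following is a minimal sketch, not a definitive diagnostic: `starts_with_zeros` is a helper name I made up, it assumes GNU `cmp` (for the `-n` byte limit) and GNU `dd` as shipped on Ubuntu, and the demo runs on a scratch file rather than on `/dev/sda`.

```shell
#!/bin/sh
# Succeed if the first N bytes of a file or block device are all zero.
# Usage: starts_with_zeros PATH N
# Assumption: GNU cmp, whose -n option limits the number of bytes compared.
starts_with_zeros() {
    cmp -s -n "$2" "$1" /dev/zero
}

# Demo on a scratch file, so nothing here touches a real disk.
tmp=$(mktemp)
dd if=/dev/zero of="$tmp" bs=64K count=16 status=none

if starts_with_zeros "$tmp" 4096; then
    echo "first 4096 bytes are zero"
fi

# Overwrite one byte at offset 100 and check again.
printf 'x' | dd of="$tmp" bs=1 seek=100 conv=notrunc status=none

if ! starts_with_zeros "$tmp" 4096; then
    echo "first 4096 bytes are no longer all zero"
fi

rm -f "$tmp"
```

On the real drive this would be run as root, for example `starts_with_zeros /dev/sda 1048576`. The read may be served from the page cache, so the result is more convincing after a reboot. If the start of the disk does read back as zeros, the old partitions shown in Disks might only be a stale view, and asking the kernel to re-read the partition table (for example with `blockdev --rereadpt /dev/sda`) could be worth trying. If it does not, the drive is presumably discarding the writes.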

Because of the HDSentinel text, I am still not sure whether this disk qualifies for warranty support:

Total written: 1.14 TB
  There are 8 bad sectors on the disk surface. The contents of these sectors were moved to the spare area.
  At this point, warranty replacement of the disk is not yet possible, only if the health drops further.
  It is recommended to examine the log of the disk regularly. All new problems found will be logged there.
    No actions needed.

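However, I am convinced that the disk is useless to an ordinary end customer.

If anyone has more information about what is required for warranty support (including Hypertec support), I would be grateful. I did not find a link on the Hypertec website for checking a drive's warranty status or for submitting a ticket. In fairness, Hypertec does appear to have a contact form.

Still, I think customers would be best served by a simple page with a field to fill in to determine warranty status, as Dell support has.

That would probably be friendlier for end users.

At present, although the disk appears to be unusable, I do not know whether warranty support applies. This is because Hard Disk Sentinel reports that warranty replacement is not yet possible, even though the disk is faulty and cannot be written to with the dd command from an Ubuntu shell: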

Health       : 85 %
Performance  : 100 %
Power on time: 95 days, 18 hours
Est. lifetime: more than 1000 days
Total written: 1.13 TB
  There are 8 bad sectors on the disk surface. The contents of these sectors were moved to the spare area.
  At this point, warranty replacement of the disk is not yet possible, only if the health drops further.
  It is recommended to examine the log of the disk regularly. All new problems found will be logged there.
    No actions needed.

As for the SMART status, this reference states that "Pre-fail and Old_age are categories of error, but do not indicate that either is imminent."
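To check that reading against my own attribute table, here is a minimal sketch that lists only the attributes that have actually crossed their threshold. It follows the rule documented for smartctl, that an attribute counts as failed when its normalized VALUE is less than or equal to THRESH, and it also reports any row whose WHEN_FAILED column is not "-". The function name `failing_attrs` is my own invention, and the two sample rows are copied from the output above.

```shell
#!/bin/sh
# Print "ID NAME" for every attribute row that looks failed:
#   normalized VALUE <= THRESH, or WHEN_FAILED is something other than "-".
# Input: text in the attribute table layout printed by smartctl.
failing_attrs() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 10 {
        value  = $4 + 0      # VALUE column
        thresh = $6 + 0      # THRESH column ("050" becomes 50)
        if (value <= thresh || $9 != "-") print $1, $2
    }'
}

# Sample rows copied from my smartctl output.
sample='  5 Reallocated_Sector_Ct   0x0032   100   100   050    Old_age   Always       -       8
161 Unknown_Attribute       0x0033   100   100   050    Pre-fail  Always       -       82'

# Prints nothing: neither row has reached its threshold.
printf '%s\n' "$sample" | failing_attrs
```

On a live system the input would come from `sudo smartctl -A /dev/sda | failing_attrs`. For my table it prints nothing, which matches the "No failed Attributes found" line, so the FAILED verdict apparently comes from the drive's own overall status rather than from any listed attribute, including the Pre-fail attribute 161.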

The same reference also cites the Wikipedia article "Self-Monitoring, Analysis and Reporting Technology" [14], which states:

Count of reallocated sectors. 
The raw value represents a count of the 
bad sectors that have been found and remapped.[32] 
Thus, the higher the attribute value, the more sectors
the drive has had to reallocate. 
This value is primarily used as a metric 
of the life expectancy of the drive; 
a drive which has had any reallocations at all 
is significantly more likely to fail 
in the immediate months.

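So, according to these references, this attribute may indicate that the drive could fail within the next few months. But I agree that the failure is more than an attribute that is merely typed "Pre-fail". The disk has already failed, because the dd command cannot write data to it! The open issue is warranty coverage, because HDSentinel seems to indicate that the disk does not yet qualify for warranty replacement, even though I can no longer write data to it.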

[14]: Wikipedia contributors. Self-Monitoring, Analysis and Reporting Technology. Wikipedia, The Free Encyclopedia. June 30, 2023, 17:24 UTC. Available at: https://en.wikipedia.org/w/index.php?title=Self-Monitoring,_Analysis_and_Reporting_Technology&oldid=1162701953. Accessed July 19, 2023.
