Preamble: file system setup. Ubuntu uses two partitions:
- / on my SSD (partition /dev/sda5)
- /home/ on my HDD (partition /dev/sdb2)
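Problem (in chronological order):
1. Read-only /home/: while using Ubuntu as usual, my /home/ (on the HDD) became read-only.
2. Rescue mode: after a reboot, Ubuntu went straight into rescue mode.
3. File system check: from a live USB, I repaired the HDD partition that holds /home/ with fsck.ext4. There was nothing to repair on the SSD partition.
4. GUI, then read-only: Ubuntu booted into the GUI again, but after a few minutes /home/ became read-only again.
5. Loop over 2-4: I have repeated steps 2-4 several times. Each time, fsck repairs the partition and I can run Ubuntu with the GUI, but after a random amount of time the same problem comes back. It happens even when I do nothing except touch a file in /home/ every few minutes.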
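The "touch a file every few minutes" check from step 5 can be scripted so that the moment of failure is timestamped and can be lined up with the kernel log. This is only a minimal sketch: the function names, the probe-file prefix and the 300-second interval are illustrative assumptions, not something from the original setup.

```python
import errno
import os
import tempfile
import time


def probe_writable(directory):
    """Try to create and remove a small file in `directory`.

    Returns (True, None) on success, or (False, errno_name) on failure.
    """
    try:
        fd, path = tempfile.mkstemp(prefix=".rw_probe_", dir=directory)
        os.close(fd)
        os.remove(path)
        return True, None
    except OSError as exc:
        return False, errno.errorcode.get(exc.errno, str(exc.errno))


def watch(directory, interval_s=300, max_checks=None):
    """Probe `directory` repeatedly and stop at the first failed write.

    Returns the errno name of the failure, or None if `max_checks`
    probes all succeeded.
    """
    checks = 0
    while max_checks is None or checks < max_checks:
        ok, err = probe_writable(directory)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(f"{stamp} {directory} writable={ok} error={err}")
        if not ok:
            # A read-only remount is expected to surface as EROFS.
            return err
        checks += 1
        if max_checks is None or checks < max_checks:
            time.sleep(interval_s)
    return None
```

Calling `watch(os.path.expanduser("~"))` in a terminal would log one line per probe and return once a write fails, which gives a timestamp to search for in the system logs.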
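Possible causes:
- Hard reboot (I did one 3 hours before the problem appeared)
- Package update (I ran one a few tens of minutes before the problem appeared)
The package update was the following: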
Start-Date: 2021-04-21 16:18:16
Commandline: apt-get dist-upgrade
Requested-By: xavier (1000)
Upgrade: libseccomp2:amd64 (2.4.3-1ubuntu3.18.04.3, 2.5.1-1ubuntu1~18.04.1), ruby2.5:amd64 (2.5.1-1ubuntu1.8, 2.5.1-1ubuntu1.9), libsystemd0:amd64 (237-3ubuntu10.45, 237-3ubuntu10.46), libsystemd0:i386 (237-3ubuntu10.45, 237-3ubuntu10.46), google-chrome-stable:amd64 (89.0.4389.128-1, 90.0.4430.85-1), skypeforlinux:amd64 (8.69.0.77, 8.71.0.36), udev:amd64 (237-3ubuntu10.45, 237-3ubuntu10.46), libudev1:amd64 (237-3ubuntu10.45, 237-3ubuntu10.46), libudev1:i386 (237-3ubuntu10.45, 237-3ubuntu10.46), libruby2.5:amd64 (2.5.1-1ubuntu1.8, 2.5.1-1ubuntu1.9), libcaca0:amd64 (0.99.beta19-2ubuntu0.18.04.1, 0.99.beta19-2ubuntu0.18.04.2), chromium-browser:amd64 (89.0.4389.90-0ubuntu0.18.04.2, 90.0.4430.72-0ubuntu0.18.04.1), systemd-sysv:amd64 (237-3ubuntu10.45, 237-3ubuntu10.46), chromium-codecs-ffmpeg-extra:amd64 (89.0.4389.90-0ubuntu0.18.04.2, 90.0.4430.72-0ubuntu0.18.04.1), zotero:amd64 (5.0.96, 5.0.96.2-1), libpam-systemd:amd64 (237-3ubuntu10.45, 237-3ubuntu10.46), systemd:amd64 (237-3ubuntu10.45, 237-3ubuntu10.46), libnss-systemd:amd64 (237-3ubuntu10.45, 237-3ubuntu10.46), chromium-browser-l10n:amd64 (89.0.4389.90-0ubuntu0.18.04.2, 90.0.4430.72-0ubuntu0.18.04.1)
End-Date: 2021-04-21 16:19:14
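No linux-headers package was updated in that run. I did do a kernel update (logged here), but that was three days earlier, and I had been using my laptop heavily since then without any problem.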
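To make the "no kernel package in this run" check less error-prone than reading the long Upgrade line by eye, the entry can be split into one tuple per package. This is a sketch that assumes the `name:arch (old, new)` layout seen in the log above; the list of kernel package prefixes is my own assumption and may need extending.

```python
import re

# Matches entries such as "udev:amd64 (237-3ubuntu10.45, 237-3ubuntu10.46)".
ENTRY_RE = re.compile(r"([^\s:,]+):(\S+) \(([^,]+), ([^)]+)\)")

# Assumed prefixes for kernel-related packages (illustrative, not exhaustive).
KERNEL_PREFIXES = ("linux-image", "linux-headers", "linux-modules")


def parse_upgrade_line(line):
    """Return a list of (package, arch, old_version, new_version) tuples."""
    body = line.split(":", 1)[1] if line.startswith("Upgrade:") else line
    return ENTRY_RE.findall(body)


def kernel_related(entries):
    """Keep only the entries whose package name looks kernel-related."""
    return [e for e in entries if e[0].startswith(KERNEL_PREFIXES)]


if __name__ == "__main__":
    sample = (
        "Upgrade: libseccomp2:amd64 (2.4.3-1ubuntu3.18.04.3, 2.5.1-1ubuntu1~18.04.1), "
        "udev:amd64 (237-3ubuntu10.45, 237-3ubuntu10.46), "
        "libudev1:i386 (237-3ubuntu10.45, 237-3ubuntu10.46)"
    )
    entries = parse_upgrade_line(sample)
    for name, arch, old, new in entries:
        print(f"{name} [{arch}]: {old} -> {new}")
    print("kernel-related:", kernel_related(entries))
```

The sample string is a shortened copy of the Upgrade line above; pasting the full line in its place lists every upgraded package in the same way.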
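Unlikely cause:
- Hardware problem (?): the same HDD holds an NTFS partition alongside the Linux partition that keeps getting corrupted. I use this NTFS partition from Windows to store some large files. I checked it with Windows' own tools and no problem was detected. This may not completely rule out a hardware problem, but my feeling is that the HDD is not damaged.
My guess: I do suspect that one of the packages I updated is at fault. However, I could not say what most of them are for, so I have no specific suspect. I may also be wrong about the update altogether, but I do not know how to diagnose the problem further. If anyone recognizes this problem or knows how to investigate it further, that would be very helpful.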
Additional logs:
General setup:
- Computer: ASUS ROG laptop
- Linux: Ubuntu 18.04.5 LTS, MATE version 1.20.1, kernel 4.15.0-142-generic x86-64
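Update (2021-04-23)
Following @guiverc's comment, I ran a long disk self-test with smartctl from a live USB drive. The full smartctl log is below.
As far as I understand (mostly from the package authors' explanations), the overall conclusion is that I should be worried about my HDD.
On the positive side, the log reports SMART overall-health self-assessment test result: PASSED, which is encouraging, as the SMART attribute values are all well above their thresholds.
On the less positive side, 60 errors occurred, (I believe) during the smartctl run or before it. Such a high number seems worrying. In addition, the self-test log shows the status Completed: read failure, which is also a concern.
Note that I did not run an fsck repair before running smartctl; I do not know whether that matters.
My reading of this log is that caution is called for: back up my HDD (most of it already is backed up) and replace it with a new one.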
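The PASSED verdict and the worrying details are not in contradiction: the verdict compares normalized values with thresholds, while the raw counters for pending or uncorrectable sectors are reported separately. As a rough sketch, the attribute table in the log can be screened for both. The regular expression assumes the ten-column layout printed by smartctl 7.1 in this log, and the set of "watched" attribute names is my own choice.

```python
import re

# ID, name, flag, VALUE, WORST, THRESH, type, updated, when_failed, RAW_VALUE
ATTR_RE = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]{4}\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+\S+\s+\S+\s+\S+\s+(\d+)"
)

# Assumed shortlist of counters where any non-zero raw value deserves a look.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Reallocated_Event_Count",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
}


def parse_attributes(text):
    """Extract SMART attribute rows; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        m = ATTR_RE.match(line)
        if m:
            rows.append({
                "id": int(m.group(1)),
                "name": m.group(2),
                "value": int(m.group(3)),
                "worst": int(m.group(4)),
                "thresh": int(m.group(5)),
                "raw": int(m.group(6)),
            })
    return rows


def findings(rows):
    """Return (attributes at or below threshold, watched non-zero raw counters)."""
    below = [r["name"] for r in rows if r["thresh"] > 0 and r["value"] <= r["thresh"]]
    nonzero = [(r["name"], r["raw"]) for r in rows
               if r["name"] in WATCHED and r["raw"] > 0]
    return below, nonzero
```

Fed the attribute table from the log in this post, `findings` should report no attribute below its threshold but non-zero raw counts for Current_Pending_Sector (8) and Offline_Uncorrectable (1), which fits the read failures seen in the self-test.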
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-42-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Toshiba 2.5" HDD MQ01ABD...
Device Model: TOSHIBA MQ01ABD100
Serial Number: 17POPDQKT
LU WWN Device Id: 5 000039 782d0abaf
Firmware Version: AX0R5J
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Form Factor: 2.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS (minor revision not indicated)
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Thu Apr 22 11:53:17 2021 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 112) The previous self-test completed having
the read element of the test failed.
Total time to complete Offline
data collection: ( 120) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 236) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 050 Pre-fail Always - 0
2 Throughput_Performance 0x0005 100 100 050 Pre-fail Offline - 0
3 Spin_Up_Time 0x0027 100 100 001 Pre-fail Always - 1660
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 3639
5 Reallocated_Sector_Ct 0x0033 100 100 050 Pre-fail Always - 0
7 Seek_Error_Rate 0x000b 100 100 050 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 100 100 050 Pre-fail Offline - 0
9 Power_On_Hours 0x0032 084 084 000 Old_age Always - 6487
10 Spin_Retry_Count 0x0033 172 100 030 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 2859
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 630
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 85
193 Load_Cycle_Count 0x0032 098 098 000 Old_age Always - 27933
194 Temperature_Celsius 0x0022 100 100 000 Old_age Always - 32 (Min/Max 13/51)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 8
198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 1
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
220 Disk_Shift 0x0002 100 100 000 Old_age Always - 0
222 Loaded_Hours 0x0032 088 088 000 Old_age Always - 5055
223 Load_Retry_Count 0x0032 100 100 000 Old_age Always - 0
224 Load_Friction 0x0022 100 100 000 Old_age Always - 0
226 Load-in_Time 0x0026 100 100 000 Old_age Always - 273
240 Head_Flying_Hours 0x0001 100 100 001 Pre-fail Offline - 0
SMART Error Log Version: 1
ATA Error Count: 60 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 60 occurred at disk power-on lifetime: 6480 hours (270 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 41 10 78 dd dd 40 Error: UNC at LBA = 0x00dddd78 = 14540152
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 08 10 78 dd dd 40 00 00:29:14.662 READ FPDMA QUEUED
ef 10 02 00 00 00 a0 00 00:29:14.661 SET FEATURES [Enable SATA feature]
27 00 00 00 00 00 e0 00 00:29:14.661 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 00 00:29:14.660 IDENTIFY DEVICE
ef 03 45 00 00 00 a0 00 00:29:14.660 SET FEATURES [Set transfer mode]
Error 59 occurred at disk power-on lifetime: 6480 hours (270 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 41 d0 78 dd dd 40 Error: UNC at LBA = 0x00dddd78 = 14540152
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 08 d0 78 dd dd 40 00 00:29:10.778 READ FPDMA QUEUED
60 08 c8 70 dd dd 40 00 00:29:10.777 READ FPDMA QUEUED
60 08 c0 68 dd dd 40 00 00:29:10.765 READ FPDMA QUEUED
60 08 b8 60 dd dd 40 00 00:29:10.764 READ FPDMA QUEUED
60 08 b0 58 dd dd 40 00 00:29:10.764 READ FPDMA QUEUED
Error 58 occurred at disk power-on lifetime: 6480 hours (270 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 41 40 78 dd dd 40 Error: UNC at LBA = 0x00dddd78 = 14540152
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 b8 48 00 e6 dd 40 00 00:29:10.543 READ FPDMA QUEUED
60 00 40 00 dc dd 40 00 00:29:06.765 READ FPDMA QUEUED
60 00 38 00 d2 dd 40 00 00:29:06.754 READ FPDMA QUEUED
60 00 30 00 c8 dd 40 00 00:29:06.728 READ FPDMA QUEUED
60 e8 28 a0 49 29 40 00 00:29:06.716 READ FPDMA QUEUED
Error 57 occurred at disk power-on lifetime: 6480 hours (270 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 41 b8 78 dd dd 40 Error: UNC at LBA = 0x00dddd78 = 14540152
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 08 b8 78 dd dd 40 00 00:25:20.185 READ FPDMA QUEUED
61 08 b0 a8 6b ad 40 00 00:25:20.184 WRITE FPDMA QUEUED
ea 00 00 00 00 00 a0 00 00:25:20.184 FLUSH CACHE EXT
ef 10 02 00 00 00 a0 00 00:25:20.184 SET FEATURES [Enable SATA feature]
27 00 00 00 00 00 e0 00 00:25:20.184 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
Error 56 occurred at disk power-on lifetime: 6480 hours (270 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 41 80 78 dd dd 40 Error: WP at LBA = 0x00dddd78 = 14540152
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
61 10 90 00 c8 27 40 00 00:25:19.960 WRITE FPDMA QUEUED
61 00 88 a8 6a ad 40 00 00:25:19.960 WRITE FPDMA QUEUED
60 08 80 78 dd dd 40 00 00:25:16.323 READ FPDMA QUEUED
60 d8 78 38 bc 16 40 00 00:25:16.322 READ FPDMA QUEUED
60 00 70 38 b2 16 40 00 00:25:16.310 READ FPDMA QUEUED
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 00% 6485 450747768
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
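The LBA that smartctl prints for each error can be reproduced from the SN, CL, CH and DH register columns, and turned into a byte offset using the 512-byte logical sector size reported in the information section. The sketch below assumes the usual 28-bit LBA register layout. Mapping an LBA to a partition would additionally need the partition start sectors, which are not shown in this post.

```python
LOGICAL_SECTOR = 512    # bytes, from "Sector Sizes" in the log
PHYSICAL_SECTOR = 4096  # bytes, from "Sector Sizes" in the log


def lba28_from_registers(sn, cl, ch, dh):
    """Combine ATA registers into a 28-bit LBA (low 4 bits of DH are the top bits)."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


def describe(lba):
    """Translate a logical block address into byte and physical-sector positions."""
    byte_offset = lba * LOGICAL_SECTOR
    return {
        "lba": lba,
        "byte_offset": byte_offset,
        "physical_sector": byte_offset // PHYSICAL_SECTOR,
        "offset_gib": byte_offset / 2**30,
    }


if __name__ == "__main__":
    # Registers of errors 56-60: SN=78 CL=dd CH=dd DH=40
    error_lba = lba28_from_registers(0x78, 0xDD, 0xDD, 0x40)
    print(error_lba)  # 14540152, the value printed in the error log
    print(describe(error_lba))
    # LBA_of_first_error from the extended self-test
    print(describe(450747768))
```

Note that the five logged errors all point at the same LBA (14540152), while the extended self-test stopped at a different one (450747768), so at least two distinct locations on the disk appear to be affected.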
At @paladin's request, here is my /etc/fstab.
GNU nano 4.8 fstab.txt
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point> <type> <options> <dump> <pass>
# / was on /dev/sda5 during installation
UUID=09c3311b-f37d-48ca-b0bf-1001574bf539 / ext4 errors=remount-ro 0 1
# /boot/efi was on /dev/sda1 during installation
UUID=6896-40E1 /boot/efi vfat umask=0077 0 1
/swapfile none swap sw 0 0
UUID=40d7f02e-01ff-4c43-80d9-4fcd8b0139a5 /home ext4 nodev,nosuid 0 2
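One detail visible in this fstab is that / is mounted with an explicit errors=remount-ro, while /home has no errors= option, so for /home the ext4 error behaviour falls back to the default stored in the filesystem superblock. A small sketch that extracts this per mount point follows; the function names are hypothetical, and it expects only the fstab contents, without the nano title line.

```python
def parse_fstab(text):
    """Parse fstab text into a list of dicts, skipping comments and blank lines."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete fstab entry
        entries.append({
            "spec": fields[0],
            "mount_point": fields[1],
            "type": fields[2],
            "options": fields[3].split(","),
        })
    return entries


def errors_policy(entry):
    """Return the value of an explicit errors= option, or None if absent."""
    for opt in entry["options"]:
        if opt.startswith("errors="):
            return opt.split("=", 1)[1]
    return None


if __name__ == "__main__":
    sample = """\
UUID=09c3311b-f37d-48ca-b0bf-1001574bf539 / ext4 errors=remount-ro 0 1
UUID=6896-40E1 /boot/efi vfat umask=0077 0 1
/swapfile none swap sw 0 0
UUID=40d7f02e-01ff-4c43-80d9-4fcd8b0139a5 /home ext4 nodev,nosuid 0 2
"""
    for e in parse_fstab(sample):
        print(e["mount_point"], e["type"], errors_policy(e))
```

For the /home entry this prints no explicit policy, which means the remount to read-only is not something this fstab asks for directly.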