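I am still fairly new to Ubuntu. I have a Lenovo L430 with an i5-3210M CPU @ 2.50GHz × 4 and 4 GB RAM, dual-booting with Windows 7. The hard disk is a 500 GB Toshiba MK5061GSY.
Every 10 to 15 seconds the HDD light comes on for 2-3 seconds and everything freezes, including keyboard input, window switching and so on. [I had the same problem under Windows 7; in fact, that is what made me try Ubuntu, to see whether it was a Windows problem. Apparently it is not.]
This coincides with the 99.99% I/O values I observe here: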
$ sudo iotop -qtoqq
11:45:03 282 be/3 root 0.00 B/s 27.44 K/s 0.00 % 99.99 % [jbd2/sda5-8]
11:45:03 2175 be/4 simone 0.00 B/s 3.92 K/s 0.00 % 0.12 % firefox
11:45:04 2234 be/4 simone 0.00 B/s 11.65 K/s 0.00 % 0.00 % gnome-terminal
11:45:09 282 be/3 root 0.00 B/s 0.00 B/s 0.00 % 99.99 % [jbd2/sda5-8]
11:45:11 282 be/3 root 0.00 B/s 35.01 K/s 0.00 % 99.99 % [jbd2/sda5-8]
11:45:11 2234 be/4 simone 0.00 B/s 7.78 K/s 0.00 % 99.99 % gnome-terminal
11:45:11 563 be/4 syslog 0.00 B/s 3.89 K/s 0.00 % 0.00 % rsyslogd -c5
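To make the stalls easier to spot in a longer log, here is a minimal sketch of a filter for this kind of output. The function name `high_io` and the default threshold of 50 are my own choices, and it assumes the exact column layout of `iotop -qtoqq` shown above (I/O percentage in the 11th whitespace-separated field).

```shell
#!/bin/sh
# Hypothetical helper: for every iotop line whose I/O percentage is at or
# above a threshold (default 50), print the time, the percentage and the
# first word of the command.
# Assumed columns, as in the `iotop -qtoqq` output above:
#   $1 time, $2 TID, $3 prio, $4 user, $5-$6 read, $7-$8 write,
#   $9-$10 swapin, $11-$12 I/O, $13... command
high_io() {
    awk -v limit="${1:-50}" '$11 + 0 >= limit + 0 { print $1, $11 "%", $13 }'
}
```

It reads from standard input, so it can be run on a saved log, for example `high_io 90 < iotop.log` (the file name is only an example).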
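I tried checking the disk with smartmontools. As far as I can tell, the results do not point to a likely cause of the problem, but then I started the "long" test: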
$ sudo smartctl -t long /dev/sda
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.8.0-34-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
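To my surprise, the problem went away while the test was running! In less than a minute the 99.99% values disappeared from the iotop output and the HDD light only flickered briefly. No more freezes, and I could finally work on the computer!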
$ sudo iotop -qtoqq
11:46:42 282 be/3 root 0.00 B/s 0.00 B/s 0.00 % 99.99 % [jbd2/sda5-8]
11:46:44 282 be/3 root 0.00 B/s 0.00 B/s 0.00 % 3.84 % [jbd2/sda5-8]
11:46:45 282 be/3 root 0.00 B/s 15.63 K/s 0.00 % 99.99 % [jbd2/sda5-8]
11:46:49 2175 be/4 simone 0.00 B/s 3.91 K/s 0.00 % 5.76 % firefox
11:46:49 2200 be/4 simone 0.00 B/s 109.47 K/s 0.00 % 0.00 % firefox
11:46:49 2220 be/4 simone 0.00 B/s 62.56 K/s 0.00 % 0.00 % firefox
11:46:55 282 be/3 root 0.00 B/s 0.00 B/s 0.00 % 99.99 % [jbd2/sda5-8]
11:47:00 282 be/3 root 0.00 B/s 0.00 B/s 0.00 % 99.99 % [jbd2/sda5-8]
11:47:00 2234 be/4 simone 0.00 B/s 7.79 K/s 0.00 % 0.26 % gnome-terminal
11:47:28 282 be/3 root 0.00 B/s 43.06 K/s 0.00 % 99.99 % [jbd2/sda5-8]
11:47:28 563 be/4 syslog 0.00 B/s 3.91 K/s 0.00 % 0.00 % rsyslogd -c5
11:47:28 2175 be/4 simone 0.00 B/s 7.83 K/s 0.00 % 0.00 % firefox
11:47:29 2234 be/4 simone 0.00 B/s 7.76 K/s 0.00 % 0.00 % gnome-terminal
11:47:31 282 be/3 root 0.00 B/s 7.81 K/s 0.00 % 2.68 % [jbd2/sda5-8]
11:47:31 2175 be/4 simone 0.00 B/s 3.91 K/s 0.00 % 2.18 % firefox
11:47:32 2220 be/4 simone 0.00 B/s 62.18 K/s 0.00 % 0.00 % firefox
11:47:36 924 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.77 % [flush-8:0]
11:47:37 282 be/3 root 0.00 B/s 31.24 K/s 0.00 % 4.88 % [jbd2/sda5-8]
11:47:37 2234 be/4 simone 0.00 B/s 7.81 K/s 0.00 % 0.00 % gnome-terminal
11:47:41 282 be/3 root 0.00 B/s 0.00 B/s 0.00 % 1.43 % [jbd2/sda5-8]
11:47:42 282 be/3 root 0.00 B/s 0.00 B/s 0.00 % 7.86 % [jbd2/sda5-8]
11:47:42 2234 be/4 simone 0.00 B/s 7.80 K/s 0.00 % 0.00 % gnome-terminal
11:47:46 282 be/3 root 0.00 B/s 3.90 K/s 0.00 % 4.88 % [jbd2/sda5-8]
11:47:47 2234 be/4 simone 0.00 B/s 7.77 K/s 0.00 % 0.00 % gnome-terminal
11:47:53 282 be/3 root 0.00 B/s 3.89 K/s 0.00 % 3.40 % [jbd2/sda5-8]
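Unfortunately, shortly after the test finished (it takes about 2 hours), the problem came back :(
Creating a task that starts the test at boot and then again every 2 hours is surely not the best option. Any suggestions? Thank you in advance!
Note that I also tried changing the power management behavior, but observed no change: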
sudo hdparm -B128 /dev/sda *[which was the default value]*
sudo hdparm -B1 /dev/sda
sudo hdparm -B254 /dev/sda
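To confirm that each setting was actually applied, the current value can be read back by running `hdparm -B` without a number. Below is a minimal sketch that pulls the value out of that output; the helper name `apm_level` is hypothetical, and it assumes the value is reported on a line of the form `APM_level = <value>`.

```shell
#!/bin/sh
# Hypothetical helper: extract the APM level from `hdparm -B <device>` output
# read on stdin. Assumes a line of the form "APM_level = <value>";
# prints nothing if no such line is present.
apm_level() {
    awk -F'=[ \t]*' '/APM_level/ { print $2 }'
}
```

Usage would be something like `sudo hdparm -B /dev/sda | apm_level`, run after each of the commands above.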
I also did not observe any influence of the hard disk temperature, which is usually around 47 °C.
The test results are as follows:
$ sudo smartctl -l selftest /dev/sda
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.8.0-34-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Aborted by host 90% 1859 -
# 2 Extended offline Completed without error 00% 1849 -
# 3 Extended offline Completed without error 00% 1834 -
# 4 Vendor (0x50) Aborted by host 90% 1831 -
# 5 Short offline Completed without error 00% 1831 -
# 6 Extended offline Aborted by host 10% 1830 -
# 7 Extended offline Completed without error 00% 1803 -
# 8 Extended offline Completed without error 00% 1693 -
# 9 Short offline Completed without error 00% 1690 -
#10 Short offline Completed without error 00% 1636 -
#11 Vendor (0x50) Completed without error 00% 929 -
#12 Short offline Completed without error 00% 928 -
#13 Vendor (0x50) Completed without error 00% 792 -
#14 Short offline Completed without error 00% 792 -
#15 Vendor (0x50) Completed without error 00% 791 -
#16 Short offline Completed without error 00% 791 -
#17 Short offline Aborted by host 90% 790 -
#18 Vendor (0x50) Completed without error 00% 134 -
#19 Short offline Completed without error 00% 134 -
#20 Short offline Aborted by host 80% 28 -
#21 Vendor (0x50) Completed without error 00% 0 -
The output of sudo smartctl -d ata -a /dev/sda is:
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.8.0-34-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     TOSHIBA MK5061GSY
Serial Number:    X2KCY12FF
LU WWN Device Id: 5 000039 45d302ace
Firmware Version: MC102E
User Capacity:    500,107,862,016 bytes [500 GB]
Sector Size:      512 bytes logical/physical
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Thu Jan 2 10:12:51 2014 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 249) Self-test routine in progress...
                                        90% of test remaining.
Total time to complete Offline
data collection:                 ( 120) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 121) minutes.
SCT capabilities:              (0x003f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0027   100   100   001    Pre-fail  Always       -       2379
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       1147
  5 Reallocated_Sector_Ct   0x0033   027   027   010    Pre-fail  Always       -       1497
  7 Seek_Error_Rate         0x000b   100   100   050    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   050    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       1896
 10 Spin_Retry_Count        0x0033   122   100   030    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       1041
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       5
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       23
193 Load_Cycle_Count        0x0032   098   098   000    Old_age   Always       -       23192
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       48 (Min/Max 10/57)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       304
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
220 Disk_Shift              0x0002   100   100   000    Old_age   Always       -       128
222 Loaded_Hours            0x0032   096   096   000    Old_age   Always       -       1737
223 Load_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
224 Load_Friction           0x0022   100   100   000    Old_age   Always       -       0
226 Load-in_Time            0x0026   100   100   000    Old_age   Always       -       297
240 Head_Flying_Hours       0x0001   100   100   001    Pre-fail  Offline      -       0
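
Since the attribute table is long, here is a minimal sketch that pulls out only the rows related to sector reallocation (IDs 5, 196, 197 and 198), in case those are the interesting ones. The function name `realloc_attrs` is hypothetical, and it assumes the usual ten-column row layout, with one attribute per line as smartctl prints it.

```shell
#!/bin/sh
# Hypothetical helper: from `smartctl -A` style rows on stdin, print the
# normalized value, threshold and raw value of the sector-reallocation
# attributes (IDs 5, 196, 197 and 198).
# Assumed columns:
#   ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
realloc_attrs() {
    awk '$1 ~ /^(5|196|197|198)$/ { print $2, "value=" $4, "thresh=" $6, "raw=" $10 }'
}
```

It would be run directly on the command output, for example `sudo smartctl -A /dev/sda | realloc_attrs`.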

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Aborted by host 10% 1887 -
# 2 Extended offline Aborted by host 90% 1864 -
# 3 Extended offline Interrupted (host reset) 70% 1863 -
# 4 Extended offline Aborted by host 50% 1862 -
# 5 Extended offline Aborted by host 50% 1861 -
# 6 Extended offline Aborted by host 90% 1860 -
# 7 Extended offline Aborted by host 90% 1859 -
# 8 Extended offline Completed without error 00% 1849 -
# 9 Extended offline Completed without error 00% 1834 -
#10 Vendor (0x50) Aborted by host 90% 1831 -
#11 Short offline Completed without error 00% 1831 -
#12 Extended offline Aborted by host 10% 1830 -
#13 Extended offline Completed without error 00% 1803 -
#14 Extended offline Completed without error 00% 1693 -
#15 Short offline Completed without error 00% 1690 -
#16 Short offline Completed without error 00% 1636 -
#17 Vendor (0x50) Completed without error 00% 929 -
#18 Short offline Completed without error 00% 928 -
#19 Vendor (0x50) Completed without error 00% 792 -
#20 Short offline Completed without error 00% 792 -
#21 Vendor (0x50) Completed without error 00% 791 -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Maybe these tests reveal something useful? Thanks!