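My server is currently running CentOS 5.2 with WHM 11.34.
Our load average currently ranges from 6.43 to 12, and the sites we host take a very long time to respond and resolve. top shows nothing unusual, and iftop shows no heavy traffic either.
We have a lot of resellers, and some of them are not very good at writing code. How can we track down the culprit?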
vmstat output:
vmstat
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 2 84 78684 154916 1021080 0 0 72 274 0 14 6 3 80 12 0
top output (sorted by %CPU)
top - 21:44:43 up 5 days, 10:39, 3 users, load average: 3.36, 4.18, 4.73
Tasks: 222 total, 3 running, 219 sleeping, 0 stopped, 0 zombie
Cpu(s): 5.8%us, 2.3%sy, 0.2%ni, 79.6%id, 11.8%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 2074580k total, 1863044k used, 211536k free, 174828k buffers
Swap: 2040212k total, 84k used, 2040128k free, 987604k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15930 mysql 15 0 138m 46m 4380 S 4 2.3 1:45.87 mysqld
21772 igniteth 17 0 23200 7152 3932 R 4 0.3 0:00.02 php
1586 root 10 -5 0 0 0 S 2 0.0 11:45.19 kjournald
21759 root 15 0 2416 1024 732 R 2 0.0 0:00.01 top
1 root 15 0 2156 648 560 S 0 0.0 0:26.31 init
2 root RT 0 0 0 0 S 0 0.0 0:00.35 migration/0
3 root 34 19 0 0 0 S 0 0.0 0:00.32 ksoftirqd/0
4 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/0
5 root RT 0 0 0 0 S 0 0.0 0:02.00 migration/1
6 root 34 19 0 0 0 S 0 0.0 0:00.11 ksoftirqd/1
7 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/1
8 root RT 0 0 0 0 S 0 0.0 0:01.29 migration/2
9 root 34 19 0 0 0 S 0 0.0 0:00.26 ksoftirqd/2
10 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/2
11 root RT 0 0 0 0 S 0 0.0 0:00.90 migration/3
12 root 34 19 0 0 0 R 0 0.0 0:00.20 ksoftirqd/3
13 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/3
top output (sorted by CPU time)
top - 21:46:12 up 5 days, 10:41, 3 users, load average: 2.88, 3.82, 4.55
Tasks: 217 total, 1 running, 216 sleeping, 0 stopped, 0 zombie
Cpu(s): 3.7%us, 2.0%sy, 2.0%ni, 67.2%id, 25.0%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 2074580k total, 1959516k used, 115064k free, 183116k buffers
Swap: 2040212k total, 84k used, 2040128k free, 1090308k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ TIME COMMAND
32367 root 16 0 215m 212m 1548 S 0 10.5 62:03.63 62:03 tailwatchd
1586 root 10 -5 0 0 0 S 0 0.0 11:45.27 11:45 kjournald
1576 root 10 -5 0 0 0 S 0 0.0 2:37.86 2:37 kjournald
27722 root 16 0 2556 1184 800 S 0 0.1 1:48.94 1:48 top
15930 mysql 15 0 138m 46m 4380 S 4 2.3 1:48.63 1:48 mysqld
2932 root 34 19 0 0 0 S 0 0.0 1:41.05 1:41 kipmi0
226 root 10 -5 0 0 0 S 0 0.0 1:34.33 1:34 kswapd0
2671 named 25 0 74688 7400 2116 S 0 0.4 1:23.58 1:23 named
3229 root 15 0 10300 3348 2724 S 0 0.2 0:40.85 0:40 sshd
1580 root 10 -5 0 0 0 S 0 0.0 0:30.62 0:30 kjournald
1 root 17 0 2156 648 560 S 0 0.0 0:26.32 0:26 init
2616 root 15 0 1816 576 480 S 0 0.0 0:23.50 0:23 syslogd
1584 root 10 -5 0 0 0 S 0 0.0 0:18.67 0:18 kjournald
4342 root 34 19 27692 11m 2116 S 0 0.5 0:18.23 0:18 yum-updatesd
8044 bollingp 15 0 3456 2036 740 S 1 0.1 0:15.56 0:15 imapd
26 root 10 -5 0 0 0 S 0 0.0 0:14.18 0:14 kblockd/1
7989 gmailsit 16 0 3196 1748 736 S 0 0.1 0:10.43 0:10 imapd
iostat -xtk 1 10 output
[root@server1 tmp]# iostat -xtk 1 10
Linux 2.6.18-53.el5 12/18/2012
Time: 09:51:06 PM
avg-cpu: %user %nice %system %iowait %steal %idle
5.83 0.19 2.53 11.85 0.00 79.60
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 1.37 118.83 18.70 54.27 131.47 692.72 22.59 4.90 67.19 3.10 22.59
sdb 0.35 39.33 20.33 61.43 158.79 403.22 13.75 5.23 63.93 3.77 30.80
Time: 09:51:07 PM
avg-cpu: %user %nice %system %iowait %steal %idle
1.50 0.00 0.50 24.00 0.00 74.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 25.00 2.00 2.00 128.00 108.00 118.00 0.03 7.25 4.00 1.60
sdb 0.00 16.00 41.00 145.00 200.00 668.00 9.33 107.92 272.72 5.38 100.10
Time: 09:51:08 PM
avg-cpu: %user %nice %system %iowait %steal %idle
2.00 0.00 1.50 29.50 0.00 67.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 95.00 3.00 33.00 12.00 480.00 27.33 0.07 1.72 1.31 4.70
sdb 0.00 14.00 1.00 228.00 4.00 960.00 8.42 143.49 568.01 4.37 100.10
Time: 09:51:09 PM
avg-cpu: %user %nice %system %iowait %steal %idle
13.28 0.00 2.76 21.30 0.00 62.66
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 21.00 1.00 19.00 16.00 192.00 20.80 0.06 3.55 1.30 2.60
sdb 0.00 36.00 28.00 181.00 124.00 884.00 9.65 121.16 617.31 4.79 100.10
Time: 09:51:10 PM
avg-cpu: %user %nice %system %iowait %steal %idle
4.74 0.00 1.50 25.19 0.00 68.58
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 20.00 3.00 15.00 12.00 136.00 16.44 0.17 7.11 3.11 5.60
sdb 0.00 0.00 103.00 60.00 544.00 248.00 9.72 52.35 545.23 6.14 100.10
Time: 09:51:11 PM
avg-cpu: %user %nice %system %iowait %steal %idle
1.24 0.00 1.24 25.31 0.00 72.21
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 75.00 4.00 28.00 16.00 416.00 27.00 0.08 3.72 2.03 6.50
sdb 2.00 9.00 124.00 17.00 616.00 104.00 10.21 3.73 213.73 7.10 100.10
Time: 09:51:12 PM
avg-cpu: %user %nice %system %iowait %steal %idle
1.00 0.00 0.75 24.31 0.00 73.93
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 24.00 1.00 9.00 4.00 132.00 27.20 0.01 1.20 1.10 1.10
sdb 4.00 40.00 103.00 48.00 528.00 212.00 9.80 105.21 104.32 6.64 100.20
Time: 09:51:13 PM
avg-cpu: %user %nice %system %iowait %steal %idle
2.50 0.00 1.75 23.25 0.00 72.50
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 125.74 3.96 46.53 15.84 689.11 27.92 0.20 4.06 2.41 12.18
sdb 2.97 0.00 91.09 84.16 419.80 471.29 10.17 85.85 590.78 5.66 99.11
Time: 09:51:14 PM
avg-cpu: %user %nice %system %iowait %steal %idle
0.75 0.00 0.50 24.94 0.00 73.82
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 88.00 1.00 7.00 4.00 380.00 96.00 0.04 4.38 3.00 2.40
sdb 3.00 7.00 111.00 44.00 540.00 208.00 9.65 18.58 581.79 6.46 100.10
Time: 09:51:15 PM
avg-cpu: %user %nice %system %iowait %steal %idle
11.03 0.00 3.26 26.57 0.00 59.15
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 145.00 7.00 53.00 28.00 792.00 27.33 0.15 2.50 1.55 9.30
sdb 1.00 0.00 155.00 0.00 800.00 0.00 10.32 2.85 18.63 6.46 100.10
[root@server1 tmp]#
MySQL SHOW FULL PROCESSLIST output
mysql> show full processlist;
+------+---------------+-----------+-----------------------+----------------+------+----------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+------+---------------+-----------+-----------------------+----------------+------+----------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 1 | DB_USER_ONE | localhost | DB_ONE | Query | 3 | waiting for handler insert | INSERT DELAYED INTO defers (mailtime,msgid,email,transport_method,message,host,ip,router,deliveryuser,deliverydomain) VALUES(FROM_UNIXTIME('1355879748'),'1TivwL-0003y8-8l','[email protected]','remote_smtp','SMTP error from remote mail server after initial connection: host mx1.mail.tw.yahoo.com [203.188.197.119]: 421 4.7.0 [TS01] Messages from 75.125.90.146 temporarily deferred due to user complaints - 4.16.55.1; see http://postmaster.yahoo.com/421-ts01.html','mx1.mail.tw.yahoo.com','203.188.197.119','lookuphost','','') |
| 2 | DELAYED | localhost | DB_ONE | Delayed insert | 52 | insert | |
| 3 | DELAYED | localhost | DB_ONE | Delayed insert | 68 | insert | |
| 911 | DELAYED | localhost | DB_ONE | Delayed insert | 99 | Waiting for INSERT | |
| 993 | DB_USER_TWO | localhost | DB_TWO | Sleep | 832 | | NULL |
| 994 | DB_USER_ONE | localhost | DB_ONE | Query | 185 | Locked | delete from failures where FROM_UNIXTIME(UNIX_TIMESTAMP(NOW())-1296000) > mailtime |
| 1102 | DB_USER_THREE | localhost | DB_THREE | Query | 29 | NULL | commit |
| 1249 | DB_USER_FOUR | localhost | DB_FOUR | Query | 13 | NULL | commit |
| 1263 | root | localhost | DB_FIVE | Query | 0 | NULL | show full processlist |
| 1264 | DB_USER_SIX | localhost | DB_SIX | Query | 3 | NULL | commit |
+------+---------------+-----------+-----------------------+----------------+------+----------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
10 rows in set (0.00 sec)
Answer 1
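Your disks are clearly at their limit. Normally %wa (iowait) should be very low (<1% for a typical website), and you want %util (from iostat -x) to be as low as possible, ideally close to 0.
You can use iotop to find out which process is responsible for all the disk activity.
If it turns out to be MySQL, turn on slow query logging (log-slow-queries) in my.cnf and restart MySQL. That will let you find the specific queries causing the problem.
Alternatively, I suspect your sdb is failing. Try checking the hardware.
Edit: iotop (available from EPEL) is a great tool that tells you which process is causing the iowait.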
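Where iotop is not available, a rough complement is to look for processes in uninterruptible sleep (state D), since those are usually the ones blocked on disk IO. The sketch below is only an illustration: the function names are made up for this example, it assumes a Linux /proc filesystem, and because the D state is often brief it would need to be run repeatedly to see which accounts keep showing up.

```python
import os

def parse_stat_line(line):
    """Return (pid, command, state) from one /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces
    # or parentheses, so split around the first "(" and the last ")".
    left, _, rest = line.partition("(")
    comm, _, tail = rest.rpartition(")")
    return int(left), comm, tail.split()[0]

def blocked_processes(proc_root="/proc"):
    """List (pid, command) for processes currently in state D."""
    found = []
    if not os.path.isdir(proc_root):
        return found  # not a Linux-style /proc, nothing to scan
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_line(fh.read())
        except (IOError, OSError):
            continue  # the process exited while we were scanning
        if state == "D":
            found.append((pid, comm))
    return found

if __name__ == "__main__":
    for pid, comm in blocked_processes():
        print("%d\t%s" % (pid, comm))
```

Mapping the reported PIDs back to their owners (for example with ps) would then point at the reseller account generating the IO.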
Answer 2
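Your sdb is behaving abnormally; most likely the disk drive is failing. If the traffic pattern on your sites has not changed and this is a new problem, that is enough evidence that you need to replace sdb.
Every IO in Linux passes through two queues on its way to the disk. One is the IO scheduler queue, whose depth is controlled by nr_requests, and the other is the queue inside the hardware itself. IO merging happens at the scheduler layer. So when you see a small avgqu-sz (a small average queue size) together with a large await and a small svctm, it means the storage is taking its time to service those IO requests.
In other words, the storage is essentially slow or of poor quality.
%util shows how many milliseconds out of every 1000 ms the device spent servicing IO. The higher the value, the busier the disk. That does not necessarily mean the disk is under a heavy workload; in your case the disk is simply slow, quite slow.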
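The arithmetic behind %util, await and svctm can be checked against the numbers in the question. The sketch below uses two of the one-second sdb samples from the iostat output; the function names are invented for illustration. As far as I can tell, sysstat derives svctm from %util and the request rate, so the first calculation is a consistency check rather than an independent measurement, and the queue-wait figure is only an approximation.

```python
# Two one-second samples for sdb copied from the iostat output in the question:
# (r/s, w/s, await in ms, svctm in ms, reported %util)
SAMPLES = [
    (41.0, 145.0, 272.72, 5.38, 100.10),
    (1.0, 228.0, 568.01, 4.37, 100.10),
]

def busy_percent(reads_per_s, writes_per_s, svctm_ms):
    """Share of each second the device spent servicing requests, in percent."""
    busy_ms_per_second = (reads_per_s + writes_per_s) * svctm_ms
    return busy_ms_per_second / 10.0  # 1000 ms per second -> percent

def queue_wait_ms(await_ms, svctm_ms):
    """Approximate time a request waited before the device started on it."""
    return await_ms - svctm_ms

for r, w, await_ms, svctm_ms, reported in SAMPLES:
    print("busy %.1f%% (iostat reported %.1f%%), queued about %.0f ms of %.0f ms"
          % (busy_percent(r, w, svctm_ms), reported,
             queue_wait_ms(await_ms, svctm_ms), await_ms))
```

With roughly 190 to 230 requests per second at about 5 ms each, the device is busy for the whole second, which is what a %util of 100 expresses.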