Analyzing the root cause of high load

Answer 1

A high load average usually means that many processes want the CPU, but not always. A common culprit is processes waiting on I/O, whether disk or network.
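As a rough first check, the load average can be compared with the number of CPUs. The following is a minimal sketch, not a definitive diagnostic: load_per_cpu and read_load are hypothetical helper names, and the idea that a per-CPU value well above 1.0 signals queued work is a common rule of thumb rather than something stated in the ps or top documentation.

```python
import os


def load_per_cpu(load_avg, cpu_count):
    """Divide each load average figure by the number of CPUs."""
    return tuple(value / cpu_count for value in load_avg)


def read_load():
    """Read the 1, 5 and 15 minute load averages and normalize them.

    os.getloadavg() is only available on Unix-like systems.
    os.cpu_count() can return None, so fall back to 1 in that case.
    """
    cpus = os.cpu_count() or 1
    return load_per_cpu(os.getloadavg(), cpus)
```

Calling read_load() returns three per-CPU figures. A high figure alongside mostly idle CPUs points toward processes blocked on I/O rather than CPU contention.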

Try running ps -e v and look at the process state flags.

state    The state is given by a sequence of characters, for example, "RWNA". The first character indicates the run state of the process:
D    Marks a process in disk (or other short term, uninterruptible) wait.
I    Marks a process that is idle (sleeping for longer than about 20 seconds).  
L    Marks a process that is waiting to acquire a lock.
R    Marks a runnable process.
S    Marks a process that is sleeping for less than about 20 seconds.
T    Marks a stopped process.
W    Marks an idle interrupt thread.
Z    Marks a dead process (a "zombie").

This is from the ps man page, so you can find more details there. Processes in the R and D states are likely to be of particular interest.
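The state flags above can be tallied with a short script. This is a sketch under assumptions: run_ps, parse_ps_states and summarize are hypothetical helper names, and the state output keyword is assumed to be available in your ps (it is in Linux procps and the BSD-derived versions, but it is not one of the POSIX-mandated keywords).

```python
import subprocess
from collections import Counter


def run_ps():
    """Return one line per process: state, PID, command name.

    Each column gets its own -o option with an empty header ("name="),
    which suppresses the header row.
    """
    cmd = ["ps", "-A", "-o", "state=", "-o", "pid=", "-o", "comm="]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def parse_ps_states(ps_output):
    """Parse ps output into (pid, state, command) tuples."""
    rows = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)  # command names may contain spaces
        if len(parts) < 3:
            continue
        state, pid, comm = parts
        rows.append((int(pid), state, comm.strip()))
    return rows


def summarize(rows):
    """Count processes by run state and pick out the R and D ones."""
    # Only the first character of the state field is the run state.
    counts = Counter(state[0] for _, state, _ in rows)
    interesting = [row for row in rows if row[1][0] in ("R", "D")]
    return counts, interesting
```

Calling summarize(parse_ps_states(run_ps())) gives a per-state count plus the list of runnable and uninterruptible processes. If the D list stays large across repeated runs, I/O wait is a likely contributor to the load.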

Answer 2

top is the best tool I know of for finding out which processes are using system resources.

On my machine:

$ top
top - 14:14:00 up 1 day,  2:00,  4 users,  load average: 0.24, 0.23, 0.24
Tasks: 235 total,   3 running, 232 sleeping,   0 stopped,   0 zombie
%Cpu(s):  3.2 us,  0.5 sy,  0.0 ni, 96.2 id,  0.1 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   7870416 total,  7097428 used,   772988 free,   346524 buffers
KiB Swap:  8081404 total,        0 used,  8081404 free.  3621000 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                
 4669 postgres  20   0 2037632  63924  37168 S  19.3  0.8   0:29.23 postgres                                                               
 4671 postgres  20   0 2037592  64476  38532 S  12.8  0.8   0:25.71 postgres                                                               
 4672 postgres  20   0 2037452  62992  38004 S  12.8  0.8   0:25.36 postgres                                                               
 1324 root      20   0  766268 212364 173952 S   6.4  2.7  22:34.20 Xorg                                                                   
 3804 ybounya   20   0  656468  23560  13244 S   6.4  0.3   1:44.77 gnome-terminal

So you can see that PID 4669 (postgres) is currently using 19.3% of the CPU, 4671 is using 12.8%, and so on. You can also see memory and other key resources.
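For a non-interactive snapshot of the same ranking, ps output can be sorted by CPU share. This is a minimal sketch, not a replacement for top: run_ps_cpu and top_cpu are hypothetical helper names. Note that the pcpu figure from ps is computed differently from top's %CPU column (on Linux procps it is CPU time divided by the process's running time), so the numbers will not match top exactly.

```python
import subprocess


def run_ps_cpu():
    """Return one line per process: PID, CPU percentage, command name."""
    cmd = ["ps", "-A", "-o", "pid=", "-o", "pcpu=", "-o", "comm="]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def top_cpu(ps_output, n=5):
    """Return the n busiest processes as (pcpu, pid, command) tuples."""
    rows = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)  # command names may contain spaces
        if len(parts) < 3:
            continue
        pid, pcpu, comm = parts
        # Some locales print a decimal comma; normalize before converting.
        rows.append((float(pcpu.replace(",", ".")), int(pid), comm.strip()))
    # Highest CPU first; ties are broken by ascending PID.
    rows.sort(key=lambda row: (-row[0], row[1]))
    return rows[:n]
```

Calling top_cpu(run_ps_cpu()) returns the five heaviest CPU consumers at the moment of the call, which is convenient for logging from cron or a monitoring script.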
