How to find the cause of sudden maximum CPU usage and a server crash

2024-6-7

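Scenario: I have a server with 4 CPU cores and 8 GB of RAM running Ubuntu 20.04, hosted on Linode.com. For the last few days, my Linode dashboard has shown CPU usage at 400%, and I have been receiving warning emails that the threshold was exceeded.

Many Linode users seem to have run into this high CPU usage problem, and there are plenty of posts about it on the Linode forum and other sites, but I could not work out the underlying cause from their suggestions and answers.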

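Following the forum answers, I ran htop, top, and similar commands and got the results below. Since I am not very familiar with Ubuntu, I cannot figure out what the "ssd: root@pts/1" command is or why it consumes so much CPU.

htop:

I also cannot figure out what the top 5 processes are or why they cause such high CPU usage.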

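Can anyone tell me which processes these commands are running?

Info:

Since this is a test server, it never has significant traffic (1-2 users at most), and the problem persists even with no traffic at all.
The Apache access log and error log do not show any problems.

The who command returns: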

abin     pts/0        2021-02-09 20:26 (xxx.xxx.xxx.xx)
abin     pts/1        2021-02-09 18:37 (xxx.xxx.xxx.xx)
abin     pts/2        2021-02-09 20:10 (xxx.xxx.xxx.xx)

/etc/crontab:

SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
17 *    * * *   root    cd / && run-parts --report /etc/cron.hourly
25 6    * * *   root    test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily )
47 6    * * 7   root    test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.weekly )
52 6    1 * *   root    test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.monthly )

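top:

I tried the forum suggestion of lowering swappiness, but that did not help either.

Please let me know if any other data is needed.

Any help would be greatly appreciated. Thanks in advance.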

Edit: ps -eo user=|sort|uniq -c:

7 abin
1 daemon
1 messagebus
1 mysql
114 root
1 sshd
1 syslog
1 systemd-network
1 systemd-resolve
1 systemd-timesync
355 www-data

Many processes are accumulating under "www-data".
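The per-user count above shows that www-data owns most of the processes, but not which commands they are. A minimal sketch that breaks one user's processes down by command name follows; the helper name count_by_command is hypothetical, and it assumes input in the "user command" form that ps -eo user=,comm= prints (only the first word of the command name is used).

```shell
# Group the processes owned by one user by command name, most frequent first.
# $1     = user name to filter on (for example www-data)
# stdin  = "user command" pairs, one per line, as printed by `ps -eo user=,comm=`
count_by_command() {
    awk -v u="$1" '
        $1 == u { n[$2]++ }                     # tally each command for that user
        END     { for (c in n) print n[c], c }  # emit "count command"
    ' | sort -k1,1nr -k2                        # highest count first, then by name
}
```

On the server it could be run as: ps -eo user=,comm= | count_by_command www-data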

Additional info:

Too many login attempts are recorded in /var/log/auth.log.
/var/log/syslog: [Note that there are no log entries after Feb 18 11:41:50 until Feb 18 17:12:48]

Feb 18 11:41:50 ubuntu-xxx kernel: [758663.561513] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.s>
Feb 18 11:41:50 ubuntu-xxx kernel: [758663.561525] Out of memory: Killed process 422931 (pgrep) total-vm:755444kB, anon-rss:500260kB, file-rss:2196kB, shmem-r>
Feb 18 11:41:50 ubuntu-xxx kernel: [758663.720846] oom_reaper: reaped process 422931 (pgrep), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Feb 18 17:12:48 ubuntu-xxx kernel: [758718.889366] [ 365827]    33 365827     2191     1266    57344       36             0 bash
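To see how often the kernel's OOM killer fired and which processes it picked, the "Out of memory: Killed process" lines can be pulled out of the log. This is only a sketch under stated assumptions: the helper name list_oom_kills is hypothetical, and the pattern matches the message wording and the 15-character timestamp shown in the excerpt above; other kernel versions may word the message differently.

```shell
# Print "timestamp pid name" for every OOM kill found in syslog-format text.
# stdin = log lines such as those in /var/log/syslog
list_oom_kills() {
    # \1 = 15-character timestamp, \2 = pid, \3 = process name in parentheses
    sed -n 's/^\(.\{15\}\) .*Out of memory: Killed process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2 \3/p'
}
```

On the server it could be run as: list_oom_kills < /var/log/syslog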

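Answer 1

First, replace your real IP addresses in the question with dummy ones.

Try changing the default ssh port on the server. Create the file /etc/ssh/sshd_config.d/01-port.conf containing the following line (port 21412 is only an example; you can use any port in the range 1024-65535):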

Port 21412
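If you prefer to script this step, a minimal sketch could look like the following. The helper name write_port_dropin is hypothetical; it only refuses ports outside the 1024-65535 range mentioned above and writes the one-line file into the directory you give it.

```shell
# Write a one-line sshd drop-in that sets the listening port.
# $1 = port number, $2 = target directory (on the server: /etc/ssh/sshd_config.d)
write_port_dropin() {
    port=$1
    dir=$2
    case "$port" in
        ''|*[!0-9]*) echo "not a number: $port" >&2; return 1 ;;
    esac
    if [ "$port" -lt 1024 ] || [ "$port" -gt 65535 ]; then
        echo "port out of range: $port" >&2
        return 1
    fi
    printf 'Port %s\n' "$port" > "$dir/01-port.conf"
}
```

Run as root, for example: write_port_dropin 21412 /etc/ssh/sshd_config.d && sshd -t
Here sshd -t is OpenSSH's test mode, which only checks that the configuration is valid, so a mistake shows up before the restart rather than after it.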

Restart sshd:

systemctl restart sshd

Do not log out of the server until you have tested logging in on the new ssh port from another terminal.

To log in over ssh on the new port, use:

ssh -p 21412 user@yourserver
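To check whether the port change actually reduces the login attempts in /var/log/auth.log, you can count the failed attempts per source address before and after the change. A minimal sketch follows; the helper name count_failed_logins is hypothetical, and it assumes the usual OpenSSH "Failed password for ... from ADDRESS port ..." line format, since the question does not show the actual auth.log lines.

```shell
# Count failed ssh password attempts per source address, busiest address first.
# stdin = auth.log-style lines
count_failed_logins() {
    awk '
        /Failed password/ {
            # Scan from the end so a user literally named "from" cannot confuse us.
            for (i = NF - 1; i >= 1; i--)
                if ($i == "from") { n[$(i + 1)]++; break }
        }
        END { for (ip in n) print n[ip], ip }
    ' | sort -k1,1nr -k2
}
```

On the server it could be run as: sudo cat /var/log/auth.log | count_failed_logins | head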
