We have an AWS EC2 instance running Ubuntu 14.04 that I maintain. I noticed something odd: a /tmp/systemd process is using all of the available CPU:
top - 11:35:20 up 2:34, 1 user, load average: 1.13, 1.16, 1.15
Tasks: 114 total, 2 running, 111 sleeping, 0 stopped, 1 zombie
%Cpu(s): 32.7 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 67.3 st
KiB Mem: 2048516 total, 726492 used, 1322024 free, 38784 buffers
KiB Swap: 0 total, 0 used, 0 free. 439788 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1797 www-data 20 0 192900 4008 1140 S 95.2 0.2 107:42.82 /tmp/systemd
1145 mysql 20 0 681596 121328 7560 S 3.6 5.9 2:35.84 /usr/sbin/mysqld
3669 ubuntu 20 0 103084 1904 920 S 0.7 0.1 0:00.06 sshd: ubuntu@pts/0
40 root rt 0 0 0 0 S 0.3 0.0 0:04.96 [watchdog/0]
3692 ubuntu 20 0 23728 1652 1104 R 0.3 0.1 0:00.11 top
1 root 20 0 33556 2880 1480 S 0.0 0.1 0:03.20 /sbin/init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [kthreadd]
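/tmp/systemd seemed like a strange path for a command, so I checked the /tmp directory, and it is completely empty.
Running strace suggests that it is polling for something to complete: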
ubuntu@ip-10-0-0-157:~$ sudo strace -fvvp 1797
Process 1797 attached with 6 threads
[pid 1801] futex(0x799404, FUTEX_WAIT_PRIVATE, 4, NULL <unfinished ...>
[pid 1802] futex(0x799404, FUTEX_WAIT_PRIVATE, 4, NULL <unfinished ...>
[pid 1803] futex(0x799404, FUTEX_WAIT_PRIVATE, 4, NULL <unfinished ...>
[pid 1800] futex(0x799404, FUTEX_WAIT_PRIVATE, 4, NULL <unfinished ...>
[pid 1797] clock_gettime(CLOCK_MONOTONIC, {9448, 290556741}) = 0
[pid 1797] epoll_wait(7, <unfinished ...>
[pid 1799] sched_yield() = 0
[pid 1799] clock_gettime(CLOCK_REALTIME, {1524656311, 141773213}) = 0
[pid 1797] <... epoll_wait resumed> {}, 1024, 27) = 0
[pid 1797] clock_gettime(CLOCK_MONOTONIC, {9448, 328709201}) = 0
[pid 1797] clock_gettime(CLOCK_MONOTONIC, {9448, 328773526}) = 0
[pid 1797] epoll_wait(7, <unfinished ...>
[pid 1799] sched_yield() = 0
[pid 1799] sched_yield() = 0
[pid 1797] <... epoll_wait resumed> {}, 1024, 500) = 0
[pid 1797] clock_gettime(CLOCK_MONOTONIC, {9448, 839857928}) = 0
[pid 1797] clock_gettime(CLOCK_MONOTONIC, {9448, 839934892}) = 0
[pid 1797] epoll_wait(7, {}, 1024, 18) = 0
[pid 1797] clock_gettime(CLOCK_MONOTONIC, {9448, 860012749}) = 0
[pid 1797] clock_gettime(CLOCK_MONOTONIC, {9448, 860081346}) = 0
[pid 1797] epoll_wait(7, <unfinished ...>
[pid 1799] sched_yield() = 0
[pid 1799] sched_yield() = 0
[pid 1799] sched_yield() = 0
[pid 1799] sched_yield() = 0
[pid 1797] <... epoll_wait resumed> {}, 1024, 479) = 0
[pid 1797] clock_gettime(CLOCK_MONOTONIC, {9449, 350734613}) = 0
[pid 1797] clock_gettime(CLOCK_MONOTONIC, {9449, 350806967}) = 0
[pid 1797] epoll_wait(7, <unfinished ...>
[pid 1799] sched_yield() = 0
[pid 1799] sched_yield() = 0
[pid 1799] sched_yield() = 0
q[pid 1797] <... epoll_wait resumed> {}, 1024, 500) = 0
[pid 1797] clock_gettime(CLOCK_MONOTONIC, {9449, 905946457}) = 0
[pid 1797] clock_gettime(CLOCK_MONOTONIC, {9449, 906007520}) = 0
[pid 1797] epoll_wait(7, <unfinished ...>
[pid 1799] sched_yield() = 0
[pid 1799] sched_yield() = 0
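I have tried updating and rebooting the server, but nothing changed. I can kill the process, but it comes back after about an hour.
Can you suggest how I can get to the root of this, or how to find out what it is doing and what is starting it?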
Answer 1
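With no /tmp/systemd binary (or any other temporary binary) on disk, and given that the process runs as the non-interactive user "www-data", I think your web server has been compromised.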
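A missing file on disk does not mean the binary is gone: while the process is alive, the kernel still exposes it through /proc. The following is a minimal sketch, not a definitive tool, that reads a process's /proc entry and reports the executable link, whether it is marked as deleted, the argv it presents, and its parent PID. The function name `describe_process` and the `proc_root` parameter are illustrative choices, and reading another user's /proc entry is assumed to be done as root.

```python
import os


def describe_process(pid, proc_root="/proc"):
    """Collect basic facts about a process from its /proc entry."""
    base = os.path.join(proc_root, str(pid))

    # /proc/<pid>/exe is a symlink to the binary; the kernel appends
    # " (deleted)" when the file was unlinked after the process started.
    exe = os.readlink(os.path.join(base, "exe"))

    # cmdline holds the NUL-separated argv, which a process may rewrite,
    # so the name shown by top is not proof of where the binary lived.
    with open(os.path.join(base, "cmdline"), "rb") as fh:
        argv = [a.decode(errors="replace") for a in fh.read().split(b"\0") if a]

    # The parent PID is a hint about what launched the process.
    ppid = None
    with open(os.path.join(base, "status")) as fh:
        for line in fh:
            if line.startswith("PPid:"):
                ppid = int(line.split()[1])
                break

    return {
        "exe": exe,
        "deleted": exe.endswith(" (deleted)"),
        "argv": argv,
        "ppid": ppid,
    }
```

For the process in the question this would be called as `describe_process(1797)`. If the link is reported as deleted, the file contents can usually still be copied out of /proc/1797/exe for later analysis as long as the process has not been killed.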
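Check for suspicious outbound connections, remount /tmp with noexec, and verify that everything owned by www-data looks sane. Look for shell scripts or other executables that www-data should not own.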
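The search for executables owned by www-data can be sketched as a small directory walk. This is only one way to do it, under the assumption that regular files with any execute bit set are the ones worth reviewing; the function name `owned_executables` is illustrative.

```python
import os
import stat


def owned_executables(root, uid):
    """Yield regular files under root that are owned by uid and have
    at least one execute bit set."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or cannot be inspected; skip it
            if (stat.S_ISREG(st.st_mode)
                    and st.st_uid == uid
                    and st.st_mode & 0o111):
                yield path
```

A possible call is `list(owned_executables("/var/www", pwd.getpwnam("www-data").pw_uid))`, where `/var/www` is an assumed web root and `pwd` is the standard-library module. Each hit still needs a manual look, since a web application may legitimately ship executable scripts.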
It looks like the Tiny XMR mooner: https://xorl.wordpress.com/2017/12/21/the-tiny-xml-mooner-linux-cryptominer-malware/
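A miner typically has to reach a mining pool, so the outbound-connection check mentioned above can be narrowed to sockets owned by www-data. The sketch below parses lines in the format of /proc/net/tcp and returns the established IPv4 connections for one UID. It assumes the lines come from the same machine that runs the script (addresses are printed in host byte order) and it ignores IPv6, which lives in /proc/net/tcp6. The function names are illustrative.

```python
import socket
import struct

TCP_ESTABLISHED = "01"  # state code for ESTABLISHED in /proc/net/tcp


def _decode(addr):
    """Turn a hex 'ADDRESS:PORT' field into an (ip, port) tuple."""
    ip_hex, port_hex = addr.split(":")
    # The address is a 32-bit value printed in host byte order.
    ip = socket.inet_ntoa(struct.pack("=I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def established_for_uid(lines, uid):
    """Return (local, remote) pairs for established TCP sockets owned by uid."""
    result = []
    for line in lines:
        fields = line.split()
        # Skip the header row and anything too short to hold a uid column.
        if len(fields) < 8 or fields[0] == "sl":
            continue
        if fields[3] == TCP_ESTABLISHED and int(fields[7]) == uid:
            result.append((_decode(fields[1]), _decode(fields[2])))
    return result
```

One way to use it is to open /proc/net/tcp and pass the file object together with `pwd.getpwnam("www-data").pw_uid`. Remote addresses on unusual ports that the web application has no reason to contact are the ones to look into first.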
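The best practice in this situation is to take an image of the machine (for example with dd) and analyze it offline, in an environment with no internet access.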
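As a rough illustration of what the imaging step does, the sketch below copies a source block by block, much like dd, and also returns a SHA-256 digest so the copy can be verified later. The function name, the default block size, and any device path passed to it are assumptions made for illustration; reading a raw disk device needs root, and an image of a disk that is mounted read-write may be internally inconsistent.

```python
import hashlib


def copy_image(src_path, dst_path, block_size=4 * 1024 * 1024):
    """Copy src to dst in fixed-size blocks and return the SHA-256
    hex digest of the data that was read."""
    digest = hashlib.sha256()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break  # end of input
            dst.write(block)
            digest.update(block)
    return digest.hexdigest()
```

Recording the digest alongside the image gives a simple way to confirm that the file analyzed offline is the same one that was taken from the compromised machine.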