我有一台托管在 AWS 上的 Ubuntu 12.04 服务器,它使用 Nginx 和 php5-fpm 为十几个 WordPress 安装提供支持。
问题就是在某一天开始的。我的服务器宕机了。太恐慌了!每当我再次打开它时,几分钟后它总是会再次死机。Postfix 占用了 100% 的 CPU,直到服务器无法继续运行。我很震惊地发现有什么东西以不人道的速度发送电子邮件。
这是我的 postfix 配置(/etc/postfix/main.cf
):
smtpd_banner = $myhostname ESMTP $mail_name (Ubuntu)
biff = no
# appending .domain is the MUA's job.
append_dot_mydomain = no
# Uncomment the next line to generate "delayed mail" warnings
#delay_warning_time = 4h
readme_directory = no
# TLS parameters
smtpd_tls_cert_file=/etc/ssl/certs/ssl-cert-snakeoil.pem
smtpd_tls_key_file=/etc/ssl/private/ssl-cert-snakeoil.key
smtpd_use_tls=yes
smtpd_tls_session_cache_database = btree:${data_directory}/smtpd_scache
smtp_tls_session_cache_database = btree:${data_directory}/smtp_scache
# See /usr/share/doc/postfix/TLS_README.gz in the postfix-doc package for
# information on enabling SSL in the smtp client.
myhostname = ip-172-31-3-94.eu-west-1.compute.internal alias_maps = hash:/etc/aliases alias_database = hash:/etc/aliases myorigin = /etc/mailname mydestination = creatroopers.com., ip-172-31-3-94.eu-west-1.compute.internal, localhost.eu-west-1.compute.internal, localhost relayhost = mynetworks
= 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128 mailbox_size_limit = 0 recipient_delimiter = + inet_interfaces = all
这是一个一秒来自以下日志的数据/var/log/mail.log
(creatroopers.com 是我的服务器):
Mar 17 11:15:47 ip-172-31-3-94 postfix/cleanup[14534]: 0092B47C77: message-id= <20140317111547.0092B47C77@ip-172-31-3-94.eu-west-1.compute.internal>
Mar 17 11:15:47 ip-172-31-3-94 postfix/qmgr[12501]: 0092B47C77: from=<[email protected]>, size=636, nrcpt=1 (queue active)
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12531]: connect to mailin-02.mx.aol.com[64.12.88.164]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12534]: connect to mx.hover.com.cust.hostedemail.com[216.40.42.4]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12535]: connect to usamail.com[209.15.13.134]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12534]: A9E6947A48: to=<[email protected]>, relay=none, delay=151, delays=121/0.51/30/0, dsn=4.4.1, status=deferred (connect to mx.hover.com.cust.hostedemail.com[216.40.42.4]:25: Connection timed out)
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12535]: F223B47A8D: to=<[email protected]>, relay=none, delay=149, delays=119/0.55/30/0, dsn=4.4.1, status=deferred (connect to usamail.com[209.15.13.134]:25: Connection timed out)
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12545]: connect to sakha.net[146.185.211.229]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12545]: D4CDE47A63: to=<[email protected]>, relay=none, delay=148, delays=118/0.6/30/0, dsn=4.4.1, status=deferred (connect to sakha.net[146.185.211.229]:25: Connection timed out)
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12543]: connect to smtp-in.sto-hy.se.stejtech.net[79.136.102.100]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/pickup[12500]: 4862B48363: uid=33 from=<www-data>
Mar 17 11:15:47 ip-172-31-3-94 postfix/cleanup[15130]: 4862B48363: message-id=<20140317111547.4862B48363@ip-172-31-3-94.eu-west-1.compute.internal>
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12544]: connect to mxs.mail.ru[94.100.180.150]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12533]: connect to ebmail.com[72.52.4.90]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12546]: connect to searchmail.com[72.52.4.90]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/qmgr[12501]: 4862B48363: from=<[email protected]>, size=636, nrcpt=1 (queue active)
Mar 17 11:15:47 ip-172-31-3-94 postfix/pickup[12500]: 5CAC44829F: uid=33 from=<www-data>
Mar 17 11:15:47 ip-172-31-3-94 postfix/cleanup[14534]: 5CAC44829F: [email protected]
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12546]: 2F1FD47A5F: to=<[email protected]>, relay=none, delay=150, delays=119/0.63/30/0, dsn=4.4.1, status=deferred (connect to searchmail.com[72.52.4.90]:25: Connection timed out)
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12533]: B5E6747AE6: to=<[email protected]>, relay=none, delay=153, delays=122/0.5/30/0, dsn=4.4.1, status=deferred (connect to ebmail.com[72.52.4.90]:25: Connection timed out)
Mar 17 11:15:47 ip-172-31-3-94 postfix/pickup[12500]: 5FDC647D02: uid=33 from=<www-data>
Mar 17 11:15:47 ip-172-31-3-94 postfix/cleanup[15130]: 5FDC647D02: message-id=<20140317111547.5FDC647D02@ip-172-31-3-94.eu-west-1.compute.internal>
Mar 17 11:15:47 ip-172-31-3-94 postfix/qmgr[12501]: 5FDC647D02: from=<[email protected]>, size=636, nrcpt=1 (queue active)
Mar 17 11:15:47 ip-172-31-3-94 postfix/qmgr[12501]: 5CAC44829F: from=<[email protected]>, size=2108, nrcpt=1 (queue active)
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12548]: connect to mx01.gmx.com[74.208.5.27]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12547]: connect to mxs.mail.ru[94.100.180.150]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12549]: connect to relay.verizon.net[206.46.232.11]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12561]: connect to mxs.mail.ru[94.100.180.150]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12549]: 8F26947A53: to=<[email protected]>, relay=none, delay=149, delays=118/0.79/30/0, dsn=4.4.1, status=deferred (connect to relay.verizon.net[206.46.232.11]:25: Connection timed out)
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12534]: connect to ASPMX.L.GOOGLE.COM[2a00:1450:400c:c05::1b]:25: Network is unreachable
Mar 17 11:15:47 ip-172-31-3-94 postfix/pickup[12500]: 8206C4811D: uid=33 from=<www-data>
Mar 17 11:15:47 ip-172-31-3-94 postfix/cleanup[14534]: 8206C4811D: message-id=<20140317111547.8206C4811D@ip-172-31-3-94.eu-west-1.compute.internal>
Mar 17 11:15:47 ip-172-31-3-94 postfix/qmgr[12501]: 8206C4811D: from=<[email protected]>, size=636, nrcpt=1 (queue active)
Mar 17 11:15:47 ip-172-31-3-94 postfix/pickup[12500]: 8BAFB47E02: uid=33 from=<www-data>
Mar 17 11:15:47 ip-172-31-3-94 postfix/cleanup[15130]: 8BAFB47E02: message-id=<20140317111547.8BAFB47E02@ip-172-31-3-94.eu-west-1.compute.internal>
Mar 17 11:15:47 ip-172-31-3-94 postfix/pickup[12500]: 8E05B48185: uid=33 from=<www-data>
Mar 17 11:15:47 ip-172-31-3-94 postfix/cleanup[14534]: 8E05B48185: message-id=<20140317111547.8E05B48185@ip-172-31-3-94.eu-west-1.compute.internal>
Mar 17 11:15:47 ip-172-31-3-94 postfix/qmgr[12501]: 8E05B48185: from=<[email protected]>, size=636, nrcpt=1 (queue active)
Mar 17 11:15:47 ip-172-31-3-94 postfix/qmgr[12501]: 8BAFB47E02: from=<[email protected]>, size=636, nrcpt=1 (queue active)
Mar 17 11:15:47 ip-172-31-3-94 postfix/pickup[12500]: 905C1480E6: uid=33 from=<www-data>
看到了吗?太可怕了。如果我重新初始化服务器并运行service postfix stop
,在日志中我得到的Mar 18 09:49:10 ip-172-31-3-94 postfix/master[25358]: terminating on signal 15
服务器又恢复正常了。但不知何故,每天早上 8:55,postfix 都会自行唤醒并再次关闭服务器。
我看到启动所有这些操作的用户是“www-data”,因此运行后ps auxww|grep ^www-data
我们会得到:
www-data 870 0.0 0.1 78208 3356 ? S 09:41 0:02 nginx: worker process
www-data 871 0.0 0.1 78208 3356 ? S 09:41 0:03 nginx: worker process
www-data 872 0.0 0.1 78208 3356 ? S 09:41 0:02 nginx: worker process
www-data 875 0.0 0.1 78208 3356 ? S 09:41 0:02 nginx: worker process
www-data 8066 0.2 0.0 25096 1464 ? S 16:10 0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data 8343 1.0 0.0 25088 1408 ? D 16:10 0:00 /usr/sbin/postdrop -r
www-data 8380 1.2 0.0 25096 1464 ? R 16:10 0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data 8616 0.0 0.2 263332 4924 ? S 16:10 0:00 php-fpm: pool www
www-data 8617 0.0 0.0 25096 1464 ? S 16:10 0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data 8638 0.0 0.0 25096 1464 ? S 16:10 0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data 8640 0.0 0.0 25096 1464 ? S 16:10 0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data 8649 2.0 0.0 18816 1044 ? R 16:10 0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data 8654 2.0 0.0 20920 1320 ? R 16:10 0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data 8664 2.0 0.0 2632 156 ? R 16:10 0:00 /usr/sbin/postdrop -r
www-data 8673 2.0 0.0 25092 1404 ? R 16:10 0:00 /usr/sbin/postdrop -r
www-data 8681 2.0 0.0 18696 520 ? R 16:10 0:00 /usr/sbin/postdrop -r
www-data 8682 0.0 0.0 25096 1468 ? S 16:10 0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data 8713 1.0 0.0 18704 520 ? R 16:10 0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data 8720 0.0 0.0 25088 1408 ? S 16:10 0:00 /usr/sbin/postdrop -r
www-data 19310 0.5 1.6 271260 28300 ? S 16:05 0:01 php-fpm: pool www
www-data 22166 0.5 1.7 272196 30404 ? S 16:05 0:01 php-fpm: pool www
www-data 22240 0.6 1.1 270468 18880 ? S 16:05 0:01 php-fpm: pool www
当我运行 时postdrop -d ALL
,两分钟后它会在屏幕上打印“Killed”,退出系统并强制我重新启动服务器。
不用说,我无法通过任何网站发送电子邮件,因为即使在 postfix 运行的 10 分钟内,电子邮件队列也已满,无法添加任何内容。
我需要帮助来查明这种无休止的垃圾邮件流的原因。
谢谢你!
答案1
答案比我想象的要简单得多。有人入侵了服务器,并在 www-data 用户的 crontab 中添加了 5000 行相同的代码,执行了 sendmail。
我所要做的就是删除它们。
Pikkaachu 的回答也有所帮助。邮件停止发送,但 CPU 仍然为 100%。
答案2
听起来你有一个开放的中继。
把这行放在你的main.cf
smtpd_client_restrictions = permit_mynetworks, reject
这将停止你的 postfix 的开放中继。
smtpd_recipient_restrictions
还要检查不是 permit
希望这可以帮助。
另外,更新所有 wordpress 安装,获取一些安全插件来监控其行为,如果不保持更新,wordpress 很容易受到攻击。