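So, this started happening for no apparent reason on a RHEL5 server running Plesk 10. I woke up one morning to find every site hosted on the machine offline. I logged in over SSH and restarted httpd: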
Stopping httpd: [ OK ]
Starting httpd: (98)Address already in use: make_sock: could not bind to address [::]:80
(98)Address already in use: make_sock: could not bind to address 0.0.0.0:80
no listening sockets available, shutting down
Unable to open logs
OK, so I ran:
ps ax | grep http
kill (the pid)
service httpd start
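Before killing anything, it may help to confirm which process is actually holding port 80 when the bind fails. This is only a minimal sketch, assuming `netstat` from net-tools is installed and run as root so the PID/Program column is filled in; the helper name `port80_owner` is made up for illustration.

```shell
# Hypothetical helper: print the PID/program of whatever is listening on TCP port 80.
# It reads `netstat -tlnp` style output on stdin, so it also works on saved output.
# Columns: Proto Recv-Q Send-Q Local-Address Foreign-Address State PID/Program
port80_owner() {
    awk '$1 ~ /^tcp/ && $4 ~ /:80$/ && $6 == "LISTEN" { print $7 }'
}

# Usage (as root):
#   netstat -tlnp | port80_owner
```

If this prints an httpd PID left over from before the failed restart, that would be consistent with an old process never releasing the socket.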
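Everything came back up fine. Less than 24 hours later it happened again, at exactly the same time of day. After I fixed it, it ran for 13 days before crashing again, at the same time of day. To recap: the httpd service was (I assume) restarted and failed at these times: Apr 27 2012 04:13:52, Apr 14 2012 04:14:18, Apr 13 2012 04:12:48.
I checked my cron logs and found the following entries, which I don't understand, and Google has failed me. They occur right at the time of the crashes, and only on the days the server went down... too much of a coincidence, I think.
Suspicious / confusing cron log entries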
[me@www httpd]# cat ../cro* | grep RELOAD
Apr 13 09:40:01 www crond[5322]: (myusername2) RELOAD (cron/myusername2)
Apr 14 04:13:01 www crond[5322]: (myusername2) RELOAD (cron/myusername2)
Apr 14 14:27:01 www crond[5322]: (myusername2) RELOAD (cron/myusername2)
[me@www httpd]# cat ../cro* | grep LIST
Apr 27 04:13:14 www crontab[12973]: (root) LIST (myusername1)
Apr 13 04:12:09 www crontab[30867]: (root) LIST (myusername2)
Apr 13 09:39:57 www crontab[8274]: (root) LIST (myusername2)
Apr 14 04:13:01 www crontab[12193]: (root) LIST (myusername2)
Apr 14 14:26:09 www crontab[27898]: (root) LIST (myusername2)
[me@www httpd]# cat ../cro* | grep REPLACE
Apr 27 04:13:14 www crontab[12974]: (root) REPLACE (myusername1)
Apr 13 04:12:09 www crontab[30868]: (root) REPLACE (myusername2)
Apr 13 09:39:57 www crontab[8275]: (root) REPLACE (myusername2)
Apr 14 04:13:01 www crontab[12194]: (root) REPLACE (myusername2)
Apr 14 14:26:09 www crontab[27899]: (root) REPLACE (myusername2)
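The three greps above can be folded into a single pass so the LIST, REPLACE and RELOAD events show up in file order next to each other. A rough sketch, assuming the cron logs live at `/var/log/cron*` (the `../cro*` glob above suggests this, but it is an assumption); `crontab_events` is a made-up helper name.

```shell
# Hypothetical helper: show only crontab LIST/REPLACE/RELOAD events.
# Pass log files as arguments, or pipe log text on stdin when no files are given.
crontab_events() {
    grep -hE ' (LIST|REPLACE|RELOAD) \(' "$@"
}

# Usage:
#   crontab_events /var/log/cron*
```

Ordinary `CMD (...)` lines are filtered out, which makes it easier to see that each REPLACE is logged by root against another user's crontab.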
Cron logs for the 5 minutes before each crash
Crash on Apr 27 2012, 04:13:52
Apr 27 04:10:01 www crond[5189]: (root) CMD (/usr/share/spamassassin/sa-update.cron 2>&1 | tee -a /var/log/sa-update.log)
Apr 27 04:10:01 www crond[5192]: (root) CMD (/usr/lib/sa/sa1 1 1)
Apr 27 04:10:01 www crond[5193]: (psaadm) CMD (/usr/local/psa/admin/bin/php /opt/plesk-billing/admin/sbin/runevents.php > /dev/null 2>&1)
Apr 27 04:10:01 www crond[5195]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 27 04:10:01 www crond[5196]: (root) CMD (php /path/to/pimcore/cli/maintenance.php)
Apr 27 04:10:01 www crond[5198]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 27 04:10:01 www crond[5200]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 27 04:11:01 www crond[5711]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 27 04:12:01 www crond[6152]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 27 04:12:01 www crond[6154]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 27 04:13:14 www crontab[12973]: (root) LIST (myusername1)
Apr 27 04:13:14 www crontab[12974]: (root) REPLACE (myusername1)
Crash on Apr 14 2012, 04:14:18
Apr 14 04:10:01 www crond[4712]: (root) CMD (/usr/share/spamassassin/sa-update.cron 2>&1 | tee -a /var/log/sa-update.log)
Apr 14 04:10:01 www crond[4716]: (root) CMD (/usr/lib/sa/sa1 1 1)
Apr 14 04:10:01 www crond[4718]: (psaadm) CMD (/usr/local/psa/admin/bin/php /opt/plesk-billing/admin/sbin/runevents.php > /dev/null 2>&1)
Apr 14 04:10:01 www crond[4720]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 14 04:10:01 www crond[4721]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 14 04:10:01 www crond[4722]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 14 04:10:01 www crond[4724]: (root) CMD (php /path/to/pimcore)
Apr 14 04:11:01 www crond[5190]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 14 04:12:01 www crond[5543]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 14 04:12:01 www crond[5545]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 14 04:13:01 www crontab[12193]: (root) LIST (user)
Apr 14 04:13:01 www crontab[12194]: (root) REPLACE (myusername2)
Apr 14 04:13:01 www crond[5322]: (myusername2) RELOAD (cron/myusername2)
Apr 14 04:14:01 www crond[13896]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 14 04:14:01 www crond[13897]: (root) CMD (lynx -dump http://www.domain.com/script)
Crash on Apr 13 2012, 04:12:48
Apr 13 04:10:01 www crond[23751]: (root) CMD (/usr/share/spamassassin/sa-update.cron 2>&1 | tee -a /var/log/sa-update.log)
Apr 13 04:10:01 www crond[23754]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 13 04:10:01 www crond[23755]: (psaadm) CMD (/usr/local/psa/admin/bin/php /opt/plesk-billing/admin/sbin/runevents.php > /dev/null 2>&1)
Apr 13 04:10:01 www crond[23756]: (root) CMD (/usr/lib/sa/sa1 1 1)
Apr 13 04:10:01 www crond[23758]: (root) CMD (php /path/to/pimcore)
Apr 13 04:10:01 www crond[23760]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 13 04:10:01 www crond[23761]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 13 04:11:01 www crond[24126]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 13 04:12:01 www crond[26995]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 13 04:12:01 www crond[26996]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 13 04:12:09 www crontab[30867]: (root) LIST (myusername2)
Apr 13 04:12:09 www crontab[30868]: (root) REPLACE (myusername2)
Apr 13 04:14:01 www crond[799]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 13 04:14:01 www crond[800]: (root) CMD (lynx -dump http://www.domain.com/script)
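So, as a low-tech stopgap, I've set my alarm for 4:15 AM every morning to check whether my sites are down. Please help, I am literally losing sleep over this. Thanks.
Edit 1:
/var/spool/cron/myusername1 and /var/spool/cron/myusername2
Both are empty, if that matters. They are also trusted users (by my standards).
Edit 2:
Just noticed these in /var/log/messages*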
Apr 27 04:13:11 www named[3541]: max open files (1024) is smaller than max sockets (4096)
Apr 27 04:13:14 www named[3541]: max open files (1024) is smaller than max sockets (4096)
Apr 13 04:12:08 www named[3541]: max open files (1024) is smaller than max sockets (4096)
Apr 13 04:12:08 www named[3541]: max open files (1024) is smaller than max sockets (4096)
Apr 14 04:12:59 www named[3541]: max open files (1024) is smaller than max sockets (4096)
Apr 14 04:13:00 www named[3541]: max open files (1024) is smaller than max sockets (4096)
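I wish I were a real admin; maybe then I'd understand this better. I'm not sure this is related, but since it's the only other message I could find in the logs that closely matches the dates and times of the crashes, I'm including it.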