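I'm running a SmartMachine on Joyent. I believe it is some kind of virtual machine running Solaris. We run a web application on it with Apache, PHP, and MySQL. It handles our moderate traffic well. However, every night since we went live, the site starts returning 403 Forbidden errors until Apache is restarted. I took a quick look at Apache's error log and found the following: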
[Tue Oct 26 23:13:00 2010] [error] server reached MaxClients setting, consider raising the MaxClients setting
[Wed Oct 27 13:09:40 2010] [error] (24)Too many open files: Cannot open SSLSessionCache DBM file `/var/run/ssl_scache' for reading (fetch)
[Wed Oct 27 13:09:40 2010] [error] (24)Too many open files: Cannot open SSLSessionCache DBM file `/var/run/ssl_scache' for writing (store)
[Wed Oct 27 13:09:40 2010] [error] [client 98.25.133.36] PHP Fatal error: Unknown: Failed opening required '/home/jill/web/content/index.php' (include_path='.:/opt/local/lib/php') in Unknown on line 0, referer: https://[redacted]/presentations/present#
[Wed Oct 27 13:09:42 2010] [error] (24)Too many open files: Cannot open SSLSessionCache DBM file `/var/run/ssl_scache' for reading (fetch)
[Wed Oct 27 13:09:42 2010] [error] (24)Too many open files: Cannot open SSLSessionCache DBM file `/var/run/ssl_scache' for writing (store)
[Wed Oct 27 13:09:42 2010] [crit] [client 68.193.4.75] (24)Too many open files: /home/jill/web/content/.htaccess pcfg_openfile: unable to check htaccess file, ensure it is readable, referer: https://[redacted]/presentations/present#
[Wed Oct 27 13:09:43 2010] [error] (24)Too many open files: Cannot open SSLSessionCache DBM file `/var/run/ssl_scache' for reading (fetch)
[Wed Oct 27 13:09:43 2010] [error] (24)Too many open files: Cannot open SSLSessionCache DBM file `/var/run/ssl_scache' for writing (store)
[Wed Oct 27 13:09:43 2010] [crit] [client 72.28.224.201] (24)Too many open files: /home/jill/web/content/.htaccess pcfg_openfile: unable to check htaccess file, ensure it is readable, referer: https://[redacted]/presentations/present#
[Wed Oct 27 13:09:44 2010] [error] (24)Too many open files: Cannot open SSLSessionCache DBM file `/var/run/ssl_scache' for reading (fetch)
[Wed Oct 27 13:09:44 2010] [error] (24)Too many open files: Cannot open SSLSessionCache DBM file `/var/run/ssl_scache' for writing (store)
[Wed Oct 27 13:09:44 2010] [crit] [client 72.28.224.201] (24)Too many open files: /home/jill/web/content/.htaccess pcfg_openfile: unable to check htaccess file, ensure it is readable, referer: https://[redacted]/presentations/present#
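The last three lines repeat for every request made to the server. I really don't know how to prevent this. I tried using prctl to raise the number of files that can be open, but I must be using it incorrectly: when I tried to set it to 65.5K, prctl still reported a basic value of 1.02K. I'm not even sure this is a reasonable fix: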
prctl -i process -n process.max-file-descriptor `pgrep httpd`
process: 18284: /opt/local/sbin/httpd -k start
NAME PRIVILEGE VALUE FLAG ACTION RECIPIENT
process.max-file-descriptor
basic 1.02K - deny 18284
privileged 65.5K - deny -
system 2.15G max deny -
process: 18285: /opt/local/sbin/httpd -k start
NAME PRIVILEGE VALUE FLAG ACTION RECIPIENT
process.max-file-descriptor
basic 1.02K - deny 18285
privileged 65.5K - deny -
system 2.15G max deny -
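For context on the output above: the basic value acts as the soft limit that is actually enforced on the process, while the privileged value is the ceiling it can be raised to. Below is a minimal sketch of raising the basic value on running processes, assuming the Solaris prctl options -t, -v and -r behave as documented. NEW_LIMIT and the raise_fd_limit helper are hypothetical names introduced for illustration, and the helper only prints the command unless APPLY=1 is set.

```shell
#!/bin/sh
# Sketch only: NEW_LIMIT and raise_fd_limit are hypothetical names.
# NEW_LIMIT is an assumed target; pick a value that fits your workload.
NEW_LIMIT=${NEW_LIMIT:-8192}

raise_fd_limit() {
    # For each PID given, build the prctl command that replaces (-r)
    # the basic-level value of process.max-file-descriptor.
    for pid in "$@"; do
        cmd="prctl -n process.max-file-descriptor -t basic -v $NEW_LIMIT -r -i process $pid"
        if [ "${APPLY:-0}" = "1" ]; then
            $cmd          # really run it (needs sufficient privileges)
        else
            echo "$cmd"   # dry run: just show what would be executed
        fi
    done
}

# Example (dry run): raise_fd_limit $(pgrep httpd)
```

A change made this way applies only to the processes that are already running, so it would be lost the next time Apache is restarted.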
So what is the best way to track down and fix this kind of problem?
Update
Here is the pfiles output for the root httpd process.
[root@fe5txrad ~]# pfiles 18269
18269: /opt/local/sbin/httpd -k start
Current rlimit: 1024 file descriptors
0: S_IFCHR mode:0666 dev:304,8 ino:3020727013 uid:0 gid:3 rdev:13,2
O_RDONLY
/dev/null
1: S_IFCHR mode:0666 dev:304,8 ino:3020727013 uid:0 gid:3 rdev:13,2
O_WRONLY|O_CREAT|O_TRUNC
/dev/null
2: S_IFREG mode:0640 dev:182,65550 ino:362926 uid:0 gid:0 size:20551848
O_WRONLY|O_APPEND|O_CREAT|O_LARGEFILE
/var/log/httpd/error.log
3: S_IFDOOR mode:0444 dev:313,0 ino:38 uid:0 gid:0 size:0
O_RDONLY|O_LARGEFILE FD_CLOEXEC door to nscd[18176]
/var/run/name_service_door
4: S_IFSOCK mode:0666 dev:311,0 ino:43693 uid:0 gid:0 size:0
O_RDWR|O_NONBLOCK FD_CLOEXEC
SOCK_STREAM
SO_REUSEADDR,SO_KEEPALIVE,SO_SNDBUF(49152),SO_RCVBUF(49152)
sockname: AF_INET 0.0.0.0 port: 80
5: S_IFSOCK mode:0666 dev:311,0 ino:42512 uid:0 gid:0 size:0
O_RDWR|O_NONBLOCK FD_CLOEXEC
SOCK_STREAM
SO_REUSEADDR,SO_KEEPALIVE,SO_SNDBUF(49152),SO_RCVBUF(49152)
sockname: AF_INET 0.0.0.0 port: 443
6: S_IFIFO mode:0000 dev:301,0 ino:8763127 uid:0 gid:0 size:0
O_RDWR|O_NONBLOCK FD_CLOEXEC
7: S_IFIFO mode:0000 dev:301,0 ino:8763127 uid:0 gid:0 size:0
O_RDWR FD_CLOEXEC
8: S_IFREG mode:0640 dev:182,65550 ino:362927 uid:0 gid:0 size:1450493
O_WRONLY|O_APPEND|O_CREAT|O_LARGEFILE FD_CLOEXEC
/var/log/httpd/access.log
9: S_IFREG mode:0644 dev:182,65550 ino:369102 uid:1000 gid:1000 size:528239971
O_WRONLY|O_APPEND|O_CREAT|O_LARGEFILE FD_CLOEXEC
/home/jill/logs/access_log
10: S_IFREG mode:0644 dev:182,65550 ino:369102 uid:1000 gid:1000 size:528239971
O_WRONLY|O_APPEND|O_CREAT|O_LARGEFILE FD_CLOEXEC
/home/jill/logs/access_log
11: S_IFREG mode:0644 dev:308,39 ino:3386326219 uid:0 gid:0 size:0
O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE FD_CLOEXEC
12: S_IFREG mode:0644 dev:308,39 ino:3088492558 uid:0 gid:0 size:0
O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE FD_CLOEXEC
advisory write lock set by process 7350
13: S_IFSOCK mode:0666 dev:311,0 ino:6452 uid:0 gid:0 size:0
O_RDWR FD_CLOEXEC
SOCK_STREAM
SO_SNDBUF(16384),SO_RCVBUF(5120)
sockname: AF_UNIX
Answer 1
You didn't mention how many connections you have at peak when these errors start appearing.
You can use tools/commands such as:
ps -ef | grep apache | wc -l
lsof -p <apache pid>
netstat -anp | grep 80 | grep -i ESTABLISHED | wc -l
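The first command tells you how many apache processes are on the system. The second tells you how many open files/connections an apache process has; replace <apache pid> with the actual PID. The third tells you how many established connections there are.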
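The commands above are written in a Linux style, and lsof may not be installed on a Solaris-based machine. As a rough alternative, the sketch below counts the entries under /proc/<pid>/fd, which exists on both Solaris and Linux. The helper names count_fds and report_fds are hypothetical, and the process name httpd is taken from the question's output rather than from this answer.

```shell
#!/bin/sh
# Sketch only: count_fds and report_fds are hypothetical helper names.

count_fds() {
    # One directory entry per open descriptor lives in /proc/<pid>/fd.
    # tr strips the padding that some wc implementations add.
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

report_fds() {
    # Print "<pid> <open descriptor count>" for every process matching
    # the given name, so a leaking worker stands out.
    for pid in $(pgrep "$1"); do
        echo "$pid $(count_fds "$pid")"
    done
}

# Example: report_fds httpd
```

Running this periodically (for example from cron) and comparing the counts against the 1024 limit reported by pfiles should show whether the descriptors grow steadily or jump under load.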
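Answer 2
I've been bitten by something similar. You may also want to check that PHP isn't holding files open as well. If PHP stores session data on disk, a sudden burst of requests could also use up all the available file handles.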
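One way to check this is to count the session files on disk. The sketch below assumes PHP's default file-based session handler, which names its files sess_<id>. The count_session_files helper is a hypothetical name, and the directory you pass in is an assumption that should match session.save_path in the php.ini that Apache actually loads.

```shell
#!/bin/sh
# Sketch only: count_session_files is a hypothetical helper name.

count_session_files() {
    # Count regular files named sess_* below the given directory.
    find "$1" -type f -name 'sess_*' | wc -l | tr -d ' '
}

# Example, assuming the CLI and Apache share the same php.ini:
#   dir=$(php -r 'echo ini_get("session.save_path");')
#   count_session_files "$dir"
```

A large count here only shows that many sessions exist; whether they are held open at the same moment still has to be confirmed from the descriptor list of the httpd processes.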
Answer 3
If you have enough memory and CPU, you can raise the open-file limit with the ulimit command:
ulimit -n 32667 (it can also be set higher than this)
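Note that ulimit only affects the current shell and the processes started from it, so it has to run in the same shell (or startup script) that launches Apache. A minimal sketch, assuming a shell whose ulimit supports -S, -H and -n (bash, ksh and dash do); set_soft_nofile is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: set_soft_nofile is a hypothetical helper name.

set_soft_nofile() {
    # Set the soft open-file limit to the requested value, capped at
    # the hard limit so the call cannot fail for an unprivileged user.
    target=$1
    hard=$(ulimit -Hn)
    if [ "$hard" = "unlimited" ] || [ "$hard" -ge "$target" ]; then
        ulimit -Sn "$target"
    else
        ulimit -Sn "$hard"
    fi
    echo "soft open-file limit is now $(ulimit -Sn)"
}

# Example, run in the shell that starts Apache:
#   set_soft_nofile 32667
#   /opt/local/sbin/httpd -k start
```

Raising the limit buys headroom, but if descriptors are leaking it will only delay the failure, so it is worth pairing with the per-process counts discussed in the other answers.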