I am running Apache 2.2.21 with the prefork MPM on a dedicated server. More details:
Server Version: Apache/2.2.21 (Unix) mod_ssl/2.2.21 OpenSSL/1.0.0-fips DAV/2 SVN/1.7.0 mod_python/3.3.1 Python/2.6.5 mod_bwlimited/1.4 PHP/5.3.6
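Traffic is typically between 10 and 30 requests per second, the server has 12GB of RAM, and we have tuned MaxClients fairly conservatively (250). We do see usage spikes for various reasons (on an older server, we hit a MaxClients of 100 several times during these peaks).
Anyway, this is a new server. After it has been running for a while, our Apache status starts to look like this: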
GGG_._._RC_.G..C.G_G.C_G..C_.CG_._._G__W____..R.WCR_.W..G_......
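The G ("Gracefully finishing") processes are stuck. They only go away when I restart Apache. If this is not regularly monitored and fixed, it will certainly cause us to hit the MaxClients limit. I have read online that Apache had a bug that looks similar to this, but it occurred under different conditions. It was also apparently fixed in 2.2.14.
I have included a stack trace of a hung process for your review.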
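To track how many slots are tied up, the scoreboard string can be tallied per state. The following is a minimal sketch, assuming Python 3 and that the scoreboard string has already been copied or fetched from the status page; the function names are hypothetical, and the letter meanings are the ones mod_status prints in its scoreboard key.

```python
from collections import Counter

# Scoreboard key as printed by mod_status on its status page.
SCOREBOARD_KEY = {
    "_": "Waiting for Connection",
    "S": "Starting up",
    "R": "Reading Request",
    "W": "Sending Reply",
    "K": "Keepalive (read)",
    "D": "DNS Lookup",
    "C": "Closing connection",
    "L": "Logging",
    "G": "Gracefully finishing",
    "I": "Idle cleanup of worker",
    ".": "Open slot with no current process",
}


def summarize_scoreboard(scoreboard):
    """Return a {state description: count} dict for a scoreboard string."""
    counts = Counter(ch for ch in scoreboard if ch in SCOREBOARD_KEY)
    return {SCOREBOARD_KEY[ch]: n for ch, n in counts.items()}


def count_suspect_slots(scoreboard, suspect_states="GC"):
    """Count slots in states that may indicate a stuck child (assumed: G and C)."""
    return sum(1 for ch in scoreboard if ch in suspect_states)
```

Comparing count_suspect_slots() against MaxClients over time gives a rough early warning, although a slot in G or C is only a suspect until it has stayed there for a long time.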
#0 0x000000350c6f119e in __lll_lock_wait_private () from /lib64/libc.so.6
#1 0x000000350c67c138 in _L_lock_9164 () from /lib64/libc.so.6
#2 0x000000350c679a32 in malloc () from /lib64/libc.so.6
#3 0x000000350c66fcfb in __libc_message () from /lib64/libc.so.6
#4 0x000000350c675676 in malloc_printerr () from /lib64/libc.so.6
#5 0x000000350c675aa1 in malloc_consolidate () from /lib64/libc.so.6
#6 0x000000350c677f38 in _int_free () from /lib64/libc.so.6
#7 0x0000003906c64cbb in my_once_free () at my_once.c:117
#8 0x0000003906c5d6ff in my_end (infoflag=0) at my_init.c:170
#9 0x0000003906c5c547 in mysql_server_end () at libmysql.c:209
#10 0x00007f34ac195be8 in zm_shutdown_mysqli (type=<value optimized out>, module_number=22)
at /home/cpeasyapache/src/php-5.3.6/ext/mysqli/mysqli.c:834
#11 0x00007f34ac2b825f in module_destructor (module=0x1eafce0) at /home/cpeasyapache/src/php-5.3.6/Zend/zend_API.c:2098
#12 0x00007f34ac2be945 in zend_hash_apply_deleter (ht=0x7f34ac988aa0, p=0x1eafc80) at /home/cpeasyapache/src/php-5.3.6/Zend/zend_hash.c:614
#13 0x00007f34ac2bebd8 in zend_hash_graceful_reverse_destroy (ht=0x7f34ac988aa0) at /home/cpeasyapache/src/php-5.3.6/Zend/zend_hash.c:649
#14 0x00007f34ac2b3085 in zend_shutdown () at /home/cpeasyapache/src/php-5.3.6/Zend/zend.c:759
#15 0x00007f34ac26017a in php_module_shutdown () at /home/cpeasyapache/src/php-5.3.6/main/main.c:2146
#16 0x00007f34ac260229 in php_module_shutdown_wrapper (sapi_globals=<value optimized out>)
at /home/cpeasyapache/src/php-5.3.6/main/main.c:2118
#17 0x00007f34ac33a461 in php_apache_child_shutdown (tmp=<value optimized out>)
at /home/cpeasyapache/src/php-5.3.6/sapi/apache2handler/sapi_apache2.c:399
#18 0x00007f34ae59dea4 in run_cleanups () from /usr/local/apache/lib/libapr-1.so.0
#19 0x00007f34ae59cd72 in apr_pool_destroy () from /usr/local/apache/lib/libapr-1.so.0
#20 0x00000000004cc004 in clean_child_exit ()
#21 0x00000000004ccd00 in child_main ()
#22 0x00000000004cce62 in make_child ()
#23 0x00000000004cd107 in perform_idle_server_maintenance ()
#24 0x00000000004cd664 in ap_mpm_run ()
#25 0x000000000042e24f in main ()
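The problem appears to occur when PHP shuts down mysqli as the child process exits. The hang happens while it is trying to free memory. Has anyone with a similar configuration (Apache 2.2.21, PHP 5.3.6, MySQL/mysqli (5.1.56)) run into a similar problem?
Does anyone know what I could try in order to resolve this? Upgrade MySQL/Apache/PHP? I am happy to provide more information if it would help.
Thanks!
Update: It looks like MySQL is not the root of the problem. Here is a stack trace of another hung process that involves only PHP: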
#0 0x000000350c6f119e in __lll_lock_wait_private () from /lib64/libc.so.6
#1 0x000000350c67c138 in _L_lock_9164 () from /lib64/libc.so.6
#2 0x000000350c679a32 in malloc () from /lib64/libc.so.6
#3 0x000000350c66fcfb in __libc_message () from /lib64/libc.so.6
#4 0x000000350c675676 in malloc_printerr () from /lib64/libc.so.6
#5 0x000000350c675aa1 in malloc_consolidate () from /lib64/libc.so.6
#6 0x000000350c677f38 in _int_free () from /lib64/libc.so.6
#7 0x00007f532accb951 in zend_mm_shutdown (heap=0x2327aa0, full_shutdown=1, silent=<value optimized out>) at /home/cpeasyapache/src/php-5.3.6/Zend/zend_alloc.c:1648
#8 0x00007f532ac951af in php_module_shutdown () at /home/cpeasyapache/src/php-5.3.6/main/main.c:2159
#9 0x00007f532ac95229 in php_module_shutdown_wrapper (sapi_globals=<value optimized out>) at /home/cpeasyapache/src/php-5.3.6/main/main.c:2118
#10 0x00007f532ad6f461 in php_apache_child_shutdown (tmp=<value optimized out>) at /home/cpeasyapache/src/php-5.3.6/sapi/apache2handler/sapi_apache2.c:399
#11 0x00007f532cfd2ea4 in run_cleanups () from /usr/local/apache/lib/libapr-1.so.0
#12 0x00007f532cfd1d72 in apr_pool_destroy () from /usr/local/apache/lib/libapr-1.so.0
#13 0x00000000004cc004 in clean_child_exit ()
#14 0x00000000004ccd00 in child_main ()
#15 0x00000000004cce62 in make_child ()
#16 0x00000000004cd107 in perform_idle_server_maintenance ()
#17 0x00000000004cd664 in ap_mpm_run ()
#18 0x000000000042e24f in main ()
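Update 2: It turns out this is a known issue on some systems.
I have noticed that I also have Apache processes sitting in the "C" state for quite a long time (over 3000 seconds). I wrote a cron job to kill processes that stay in the "G" or "C" state for a long time... but that is only a stopgap. I want to fix the underlying problem.
One thing I noticed about my configuration, and changed, is that I had not set the "GracefulShutdownTimeout" directive: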
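The stopgap described above could be sketched roughly as follows. This is not the cron job actually used here, only an illustration under several assumptions: Python 3 is available, the per-worker records (PID, the "M" mode letter and the "SS" seconds column of the extended status page) have already been parsed by some other code, and the Worker type, the function names and the 600-second threshold are all hypothetical.

```python
import os
import signal
from typing import NamedTuple


class Worker(NamedTuple):
    pid: int       # "PID" column of the extended status table
    mode: str      # "M" column: scoreboard letter such as "G" or "C"
    seconds: int   # "SS" column: seconds since the most recent request began


def select_stuck(workers, modes=("G", "C"), min_seconds=600):
    """Return the PIDs of workers in one of `modes` for at least `min_seconds`."""
    return [w.pid for w in workers
            if w.mode in modes and w.seconds >= min_seconds]


def kill_stuck(workers, **kwargs):
    """Send SIGKILL to every selected worker (assumption: gentler signals
    may be ignored by a process blocked on a lock in its exit path)."""
    for pid in select_stuck(workers, **kwargs):
        try:
            os.kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the process exited between the status read and the kill
```

Keeping the selection separate from the kill makes it possible to log or dry-run the list of PIDs before anything is terminated.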
http://httpd.apache.org/docs/2.2/mod/mpm_common.html#gracefulshutdowntimeout
I added it and set it to 30 seconds. We will see whether that helps as well.
More information: here is the output of cat /proc/PROCESS_ID/status for one of the "G" processes:
Name: httpd
State: S (sleeping)
Tgid: 14867
Pid: 14867
PPid: 30017
TracerPid: 0
Uid: 99 99 99 99
Gid: 99 99 99 99
Utrace: 0
FDSize: 64
Groups: 99
VmPeak: 355752 kB
VmSize: 222996 kB
VmLck: 0 kB
VmHWM: 191120 kB
VmRSS: 77928 kB
VmData: 62300 kB
VmStk: 96 kB
VmExe: 1032 kB
VmLib: 24736 kB
VmPTE: 488 kB
VmSwap: 0 kB
Threads: 1
SigQ: 0/95107
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000001000
SigCgt: 000000018c0046eb
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: ffffffffffffffff
Cpus_allowed: ffffff
Cpus_allowed_list: 0-23
Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000003
Mems_allowed_list: 0-1
voluntary_ctxt_switches: 24708
nonvoluntary_ctxt_switches: 2651
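For reading a status dump like the one above, a small helper can turn it into a dictionary and decode the hexadecimal signal masks, where bit n-1 being set stands for signal number n. This is a minimal sketch assuming Python 3; the function names are hypothetical.

```python
def parse_proc_status(text):
    """Parse the text of /proc/<pid>/status into a {field: value} dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that do not have the "Key: value" shape
            fields[key.strip()] = value.strip()
    return fields


def decode_signal_mask(hex_mask):
    """Return the signal numbers whose bits are set in a mask such as SigIgn."""
    mask = int(hex_mask, 16)
    return [bit + 1 for bit in range(mask.bit_length()) if (mask >> bit) & 1]
```

For example, passing the SigIgn value to decode_signal_mask() lists the signals the process ignores, and the State field shows whether the process is sleeping rather than spinning.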
Modules loaded in Apache:
core_module (static)
authn_file_module (static)
authn_default_module (static)
authz_host_module (static)
authz_groupfile_module (static)
authz_user_module (static)
authz_default_module (static)
auth_basic_module (static)
include_module (static)
filter_module (static)
log_config_module (static)
logio_module (static)
mime_magic_module (static)
expires_module (static)
setenvif_module (static)
ssl_module (static)
mpm_prefork_module (static)
http_module (static)
mime_module (static)
dav_module (static)
status_module (static)
autoindex_module (static)
info_module (static)
suexec_module (static)
cgi_module (static)
dav_fs_module (static)
dav_lock_module (static)
negotiation_module (static)
dir_module (static)
actions_module (static)
userdir_module (static)
alias_module (static)
rewrite_module (static)
so_module (static)
python_module (shared)
dav_svn_module (shared)
authz_svn_module (shared)
bwlimited_module (shared)
php5_module (shared)
Modules loaded in PHP:
[PHP Modules]
bcmath
Core
ctype
curl
date
dom
eAccelerator
ereg
exif
filter
gd
gettext
hash
iconv
imap
json
libxml
mbstring
mcrypt
memcache
mysql
mysqli
openssl
pcre
PDO
pdo_mysql
pdo_sqlite
posix
Reflection
session
SimpleXML
sockets
SPL
SQLite
sqlite3
standard
tokenizer
xml
xmlreader
xmlwriter
zlib
[Zend Modules]
eAccelerator
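Answer 1
Try setting KeepAlive Off. I tend to do this automatically, especially on IO-contended systems (such as virtual machines).
Answer 2
I follow some general guidelines; apologies if they are not feasible for you: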
- Use the worker MPM
- Avoid mod_<language>
- Instead, use the fcgi interface provided by PHP, Python, etc.
- Use mod_fcgid
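If you are not modifying HTTP headers or the like in your application, you can probably switch to fcgi without changing any code.
The benefit is a cleaner separation between the various elements at play in this setup. httpd will be more resilient against any faults caused by third-party modules (performance-wise, and also security-wise).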