Processes hanging in Apache?

I'm running Apache 2.2.21 with the prefork MPM on a dedicated server. More details:

Server Version: Apache/2.2.21 (Unix) mod_ssl/2.2.21 OpenSSL/1.0.0-fips DAV/2 SVN/1.7.0 mod_python/3.3.1 Python/2.6.5 mod_bwlimited/1.4 PHP/5.3.6

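Traffic is typically between 10 and 30 requests per second, the server has 12 GB of RAM, and we have tuned MaxClients fairly conservatively (250). We do see usage spikes for various reasons (on the older server, where MaxClients was 100, we hit that limit several times during these peaks).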

Anyway, this is a new server. After it has been running for a while, our Apache status starts to look like this:

GGG_._._RC_.G..C.G_G.C_G..C_.CG_._._G__W____..R.WCR_.W..G_......

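The G ("Gracefully finishing") processes are stuck. They only go away when I restart Apache. Unless this is monitored and fixed regularly, it will definitely cause us to hit the MaxClients limit. I've read online that Apache had a similar bug, but it occurred under different conditions, and it was apparently fixed in 2.2.14.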
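
To keep an eye on how many slots are stuck, the scoreboard string shown above can be tallied with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the state letters follow the key printed on the mod_status page.

```python
from collections import Counter

# Scoreboard key as printed by mod_status.
SCOREBOARD_KEYS = {
    "_": "Waiting for connection",
    "S": "Starting up",
    "R": "Reading request",
    "W": "Sending reply",
    "K": "Keepalive (read)",
    "D": "DNS lookup",
    "C": "Closing connection",
    "L": "Logging",
    "G": "Gracefully finishing",
    "I": "Idle cleanup of worker",
    ".": "Open slot with no current process",
}


def summarize_scoreboard(scoreboard):
    """Return a dict mapping each state description to its slot count."""
    counts = Counter(scoreboard.strip())
    return {
        SCOREBOARD_KEYS.get(ch, "Unknown (%s)" % ch): n
        for ch, n in counts.items()
    }


def graceful_ratio(scoreboard):
    """Fraction of occupied slots (anything but '.') that are in state G."""
    slots = scoreboard.strip()
    in_use = [ch for ch in slots if ch != "."]
    if not in_use:
        return 0.0
    return slots.count("G") / len(in_use)
```

Feeding it the scoreboard line from the status page (for example, as fetched from the machine-readable status output, if that is enabled) gives a number that can be logged or alerted on before MaxClients is reached.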

I've included a stack trace of a hung process for you to inspect.

#0  0x000000350c6f119e in __lll_lock_wait_private () from /lib64/libc.so.6
#1  0x000000350c67c138 in _L_lock_9164 () from /lib64/libc.so.6
#2  0x000000350c679a32 in malloc () from /lib64/libc.so.6
#3  0x000000350c66fcfb in __libc_message () from /lib64/libc.so.6
#4  0x000000350c675676 in malloc_printerr () from /lib64/libc.so.6
#5  0x000000350c675aa1 in malloc_consolidate () from /lib64/libc.so.6
#6  0x000000350c677f38 in _int_free () from /lib64/libc.so.6
#7  0x0000003906c64cbb in my_once_free () at my_once.c:117
#8  0x0000003906c5d6ff in my_end (infoflag=0) at my_init.c:170
#9  0x0000003906c5c547 in mysql_server_end () at libmysql.c:209
#10 0x00007f34ac195be8 in zm_shutdown_mysqli (type=<value optimized out>, module_number=22)
    at /home/cpeasyapache/src/php-5.3.6/ext/mysqli/mysqli.c:834
#11 0x00007f34ac2b825f in module_destructor (module=0x1eafce0) at /home/cpeasyapache/src/php-5.3.6/Zend/zend_API.c:2098
#12 0x00007f34ac2be945 in zend_hash_apply_deleter (ht=0x7f34ac988aa0, p=0x1eafc80) at /home/cpeasyapache/src/php-5.3.6/Zend/zend_hash.c:614
#13 0x00007f34ac2bebd8 in zend_hash_graceful_reverse_destroy (ht=0x7f34ac988aa0) at /home/cpeasyapache/src/php-5.3.6/Zend/zend_hash.c:649
#14 0x00007f34ac2b3085 in zend_shutdown () at /home/cpeasyapache/src/php-5.3.6/Zend/zend.c:759
#15 0x00007f34ac26017a in php_module_shutdown () at /home/cpeasyapache/src/php-5.3.6/main/main.c:2146
#16 0x00007f34ac260229 in php_module_shutdown_wrapper (sapi_globals=<value optimized out>)
    at /home/cpeasyapache/src/php-5.3.6/main/main.c:2118
#17 0x00007f34ac33a461 in php_apache_child_shutdown (tmp=<value optimized out>)
    at /home/cpeasyapache/src/php-5.3.6/sapi/apache2handler/sapi_apache2.c:399
#18 0x00007f34ae59dea4 in run_cleanups () from /usr/local/apache/lib/libapr-1.so.0
#19 0x00007f34ae59cd72 in apr_pool_destroy () from /usr/local/apache/lib/libapr-1.so.0
#20 0x00000000004cc004 in clean_child_exit ()
#21 0x00000000004ccd00 in child_main ()
#22 0x00000000004cce62 in make_child ()
#23 0x00000000004cd107 in perform_idle_server_maintenance ()
#24 0x00000000004cd664 in ap_mpm_run ()
#25 0x000000000042e24f in main ()

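The problem seems to occur when our PHP script shuts down its mysqli connection. The hang happens while it is trying to free memory. Has anyone with a similar configuration (Apache 2.2.21, PHP 5.3.6, MySQL/mysqli (5.1.56)) run into a similar problem?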

Does anyone know what I could try to fix this? Upgrade MySQL/Apache/PHP? I'm happy to provide more information if it helps.

Thanks!

Update: It looks like MySQL is not the crux of the problem. Here is a stack trace from another hung process that involves only PHP:

#0  0x000000350c6f119e in __lll_lock_wait_private () from /lib64/libc.so.6
#1  0x000000350c67c138 in _L_lock_9164 () from /lib64/libc.so.6
#2  0x000000350c679a32 in malloc () from /lib64/libc.so.6
#3  0x000000350c66fcfb in __libc_message () from /lib64/libc.so.6
#4  0x000000350c675676 in malloc_printerr () from /lib64/libc.so.6
#5  0x000000350c675aa1 in malloc_consolidate () from /lib64/libc.so.6
#6  0x000000350c677f38 in _int_free () from /lib64/libc.so.6
#7  0x00007f532accb951 in zend_mm_shutdown (heap=0x2327aa0, full_shutdown=1, silent=<value optimized out>) at /home/cpeasyapache/src/php-5.3.6/Zend/zend_alloc.c:1648
#8  0x00007f532ac951af in php_module_shutdown () at /home/cpeasyapache/src/php-5.3.6/main/main.c:2159
#9  0x00007f532ac95229 in php_module_shutdown_wrapper (sapi_globals=<value optimized out>) at /home/cpeasyapache/src/php-5.3.6/main/main.c:2118
#10 0x00007f532ad6f461 in php_apache_child_shutdown (tmp=<value optimized out>) at /home/cpeasyapache/src/php-5.3.6/sapi/apache2handler/sapi_apache2.c:399
#11 0x00007f532cfd2ea4 in run_cleanups () from /usr/local/apache/lib/libapr-1.so.0
#12 0x00007f532cfd1d72 in apr_pool_destroy () from /usr/local/apache/lib/libapr-1.so.0
#13 0x00000000004cc004 in clean_child_exit ()
#14 0x00000000004ccd00 in child_main ()
#15 0x00000000004cce62 in make_child ()
#16 0x00000000004cd107 in perform_idle_server_maintenance ()
#17 0x00000000004cd664 in ap_mpm_run ()
#18 0x000000000042e24f in main ()
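
In both traces, frames #0 through #6 are identical: malloc is entered from malloc_printerr while _int_free is still on the stack, and the process then waits on a lock. A small sketch like the following (the helper names are made up for illustration) extracts function names from gdb-style backtraces and reports the leading frames that two traces share, which is less error-prone than comparing them by eye.

```python
import re

# Matches gdb frame lines such as
#   "#7  0x0000003906c64cbb in my_once_free () at my_once.c:117"
# The address part is optional because gdb omits it for some frames.
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\S+)\s*\(")


def frame_functions(backtrace):
    """Return the function names of a backtrace, innermost frame first."""
    names = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:  # continuation lines ("at /path/file.c:NN") are skipped
            names.append(match.group(2))
    return names


def common_top_frames(trace_a, trace_b):
    """Return the leading frames (from #0) that both traces have in common."""
    shared = []
    for fa, fb in zip(frame_functions(trace_a), frame_functions(trace_b)):
        if fa != fb:
            break
        shared.append(fa)
    return shared
```

Running common_top_frames on the two traces in this post would isolate the libc frames they share and show where the mysqli-specific and PHP-only paths diverge.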

Update 2: It turns out this is a known issue on some systems.

http://docs.cpanel.net/twiki/bin/view/EasyApache3/EA3KnownIssues#Bug:%20Apache%202.2%20Child%20Processes

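I've noticed that I also have Apache processes sitting in the "C" state for quite a long time (over 3000 seconds). I wrote a cron job that kills processes that stay in the "G" or "C" state for too long... but that's only a stopgap. I want to fix the underlying problem.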
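
The selection step of such a cron job could look roughly like the sketch below. It is not the actual script: the Worker record, the function name, and the default threshold are assumptions for illustration, and the rows are assumed to have been collected already (for instance from the PID, M, and SS columns of the extended mod_status page).

```python
from collections import namedtuple

# Hypothetical record for one scoreboard slot; not an Apache data structure.
Worker = namedtuple("Worker", ["pid", "mode", "seconds_in_state"])


def find_stuck_workers(workers, modes=("G", "C"), max_seconds=3000):
    """Return PIDs of workers that have been in one of `modes` too long.

    max_seconds is an illustrative default, not a recommended value.
    """
    return [
        w.pid
        for w in workers
        if w.mode in modes and w.seconds_in_state > max_seconds
    ]
```

For example, find_stuck_workers([Worker(101, "G", 4000), Worker(102, "W", 5000)]) selects only PID 101, because the second worker is busy sending a reply rather than stuck in G or C. The returned PIDs can then be signalled, e.g. with os.kill, which is exactly the stopgap part.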

One thing I noticed about my configuration, and have since changed, is that the "GracefulShutdownTimeout" directive was not set:

http://httpd.apache.org/docs/2.2/mod/mpm_common.html#gracefulshutdowntimeout

I added it and set it to 30 seconds. We'll see whether that helps as well.

More information: here is the output of cat /proc/PROCESS_ID/status for one of the "G" processes:

Name:   httpd
State:  S (sleeping)
Tgid:   14867
Pid:    14867
PPid:   30017
TracerPid:      0
Uid:    99      99      99      99
Gid:    99      99      99      99
Utrace: 0
FDSize: 64
Groups: 99 
VmPeak:   355752 kB
VmSize:   222996 kB
VmLck:         0 kB
VmHWM:    191120 kB
VmRSS:     77928 kB
VmData:    62300 kB
VmStk:        96 kB
VmExe:      1032 kB
VmLib:     24736 kB
VmPTE:       488 kB
VmSwap:        0 kB
Threads:        1
SigQ:   0/95107
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000001000
SigCgt: 000000018c0046eb
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: ffffffffffffffff
Cpus_allowed:   ffffff
Cpus_allowed_list:      0-23
Mems_allowed:   00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000003
Mems_allowed_list:      0-1
voluntary_ctxt_switches:        24708
nonvoluntary_ctxt_switches:     2651
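
When checking many hung children, it can help to pull only a few fields out of this output. The following is a minimal sketch (the function names are made up for illustration) that parses the "Key: value" lines of a /proc/PID/status dump into a dict.

```python
def parse_proc_status(text):
    """Parse the contents of /proc/<pid>/status into a {key: value} dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore lines that contain no colon
            fields[key.strip()] = value.strip()
    return fields


def describe(fields):
    """One-line summary of the fields most relevant to a hung worker."""
    return "%s pid=%s state=%s rss=%s" % (
        fields.get("Name"),
        fields.get("Pid"),
        fields.get("State"),
        fields.get("VmRSS"),
    )
```

Applied to the dump above, the State field is "S (sleeping)", which fits a process that is blocked waiting on a lock rather than spinning on the CPU.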

Modules loaded in Apache:

 core_module (static)
 authn_file_module (static)
 authn_default_module (static)
 authz_host_module (static)
 authz_groupfile_module (static)
 authz_user_module (static)
 authz_default_module (static)
 auth_basic_module (static)
 include_module (static)
 filter_module (static)
 log_config_module (static)
 logio_module (static)
 mime_magic_module (static)
 expires_module (static)
 setenvif_module (static)
 ssl_module (static)
 mpm_prefork_module (static)
 http_module (static)
 mime_module (static)
 dav_module (static)
 status_module (static)
 autoindex_module (static)
 info_module (static)
 suexec_module (static)
 cgi_module (static)
 dav_fs_module (static)
 dav_lock_module (static)
 negotiation_module (static)
 dir_module (static)
 actions_module (static)
 userdir_module (static)
 alias_module (static)
 rewrite_module (static)
 so_module (static)
 python_module (shared)
 dav_svn_module (shared)
 authz_svn_module (shared)
 bwlimited_module (shared)
 php5_module (shared)

Modules loaded in PHP:

[PHP Modules]
bcmath
Core
ctype
curl
date
dom
eAccelerator
ereg
exif
filter
gd
gettext
hash
iconv
imap
json
libxml
mbstring
mcrypt
memcache
mysql
mysqli
openssl
pcre
PDO
pdo_mysql
pdo_sqlite
posix
Reflection
session
SimpleXML
sockets
SPL
SQLite
sqlite3
standard
tokenizer
xml
xmlreader
xmlwriter
zlib

[Zend Modules]
eAccelerator

Answer 1

Try setting KeepAlive Off. I tend to do this as a matter of course, especially on systems with I/O contention (such as virtual machines).

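Answer 2

I follow a few general guidelines; apologies if they aren't feasible for you:

  • Use the worker MPM
  • Avoid mod_<language>
  • Instead, use the fcgi interface provided by PHP, Python, etc.
  • Use mod_fcgid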

If you aren't modifying HTTP headers or anything similar in your application, you can probably switch to fcgi without changing any code.

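The benefit is a cleaner separation between the various elements at play in this setup. httpd becomes more resilient to any errors caused by third-party modules, in terms of both performance and security.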
