High CPU usage by httpd processes

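I currently run a few very low-traffic websites on one server, yet CPU usage is high. One of the sites is still in development and will go live soon. However, that site is extremely slow: while browsing its pages I can see httpd CPU usage climb from 30% to 100% (see the top output below).

I have already tuned httpd, MySQL, Apache Solr, and Tomcat for high performance, and I am using APC.

I'm not sure what to do next or how to find the culprit, since there are a lot of messages in the httpd log and I have been chasing dead ends for a while. Any help would be greatly appreciated.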

Server: AuthenticAMD, Quad-Core AMD Opteron(tm) Processor 2352, 16 GB RAM

Linux 2.6.27 64-bit, CentOS 5.5

Plesk 9.5.4, MySQL 5.1.48, PHP 5.2.17

Apache/2.2.3 (CentOS) DAV/2 mod_jk/1.2.15 mod_ssl/2.2.3 OpenSSL/0.9.8e-fips-rhel5 PHP/5.2.17 mod_perl/2.0.4 Perl/v5.8.8

Tomcat6-6.0.29-1.jpp5, Tomcat-native-1.1.20-1.el5, Apache Solr

top

17595 apache    20   0 1825m 507m  10m R 100.4  3.2   0:17.50 httpd
17596 apache    20   0 1565m 247m 9936 R 83.1  1.5   0:10.86 httpd
17598 apache    20   0 1430m 110m 6472 S 54.5  0.7   0:08.66 httpd
17599 apache    20   0 1438m 124m  12m S 37.2  0.8   0:11.20 httpd
16197 mysql     20   0 13.0g 2.0g 5440 S  9.6 12.6 297:12.79 mysqld
17617 root      20   0 12748 1172  812 R  0.7  0.0   0:00.88 top
8169 tomcat    20   0 4613m 268m 6056 S  0.3  1.7   6:40.56 java

httpd error log

[debug] prefork.c(991): AcceptMutex: sysvsem (default: sysvsem)
[info] mod_fcgid: Process manager 17593 started
[debug] proxy_util.c(1854): proxy: grabbed scoreboard slot 0 in child 17594 for worker proxy:reverse
[debug] proxy_util.c(1967): proxy: initialized single connection worker 0 in child 17594 for (*)
[debug] proxy_util.c(1854): proxy: grabbed scoreboard slot 0 in child 17595 for worker proxy:reverse
[debug] proxy_util.c(1873): proxy: worker proxy:reverse already initialized

[notice] child pid 22782 exit signal Segmentation fault (11)

[error] (43)Identifier removed: apr_global_mutex_lock(jk_log_lock) failed
[debug] util_ldap.c(2021): LDAP merging Shared Cache conf: shm=0x7fd29a5478c0 rmm=0x7fd29a547918 for VHOST: example.com
[info] APR LDAP: Built with OpenLDAP LDAP SDK
[info] LDAP: SSL support available
[info] Init: Seeding PRNG with 256 bytes of entropy
[info] Init: Generating temporary RSA private keys (512/1024 bits)
[info] Init: Generating temporary DH parameters (512/1024 bits)
[debug] ssl_scache_shmcb.c(374): shmcb_init allocated 512000 bytes of shared memory
[debug] ssl_scache_shmcb.c(554): entered shmcb_init_memory()
[debug] ssl_scache_shmcb.c(576): for 512000 bytes, recommending 4265 indexes
[debug] ssl_scache_shmcb.c(619): shmcb_init_memory choices follow
[debug] ssl_scache_shmcb.c(621): division_mask = 0x1F
[debug] ssl_scache_shmcb.c(623): division_offset = 96
[debug] ssl_scache_shmcb.c(625): division_size = 15997
[debug] ssl_scache_shmcb.c(627): queue_size = 2136
[debug] ssl_scache_shmcb.c(629): index_num = 133
[debug] ssl_scache_shmcb.c(631): index_offset = 8
[debug] ssl_scache_shmcb.c(633): index_size = 16
[debug] ssl_scache_shmcb.c(635): cache_data_offset = 8
[debug] ssl_scache_shmcb.c(637): cache_data_size = 13853
[debug] ssl_scache_shmcb.c(650): leaving shmcb_init_memory()

Answer 1

Try adding %P (and %D) to your log format; you should then be able to correlate what you see in "top" with your access log.
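To make that correlation concrete, here is a minimal sketch. In Apache's LogFormat, %P is the PID of the child that served the request and %D is the time taken in microseconds. The sketch assumes, purely for illustration, a format such as LogFormat "%h %l %u %t \"%r\" %>s %b %P %D", so each line ends with the PID and the duration. The sample lines and the function name are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: every access-log line ends with "<pid> <microseconds>",
# i.e. %P and %D are the last two fields of the LogFormat.
LINE_RE = re.compile(
    r'"(?P<request>[^"]*)" \d{3} \S+ (?P<pid>\d+) (?P<usec>\d+)$'
)


def slowest_requests_by_pid(lines):
    """Group requests by the httpd child PID that served them, slowest first."""
    by_pid = defaultdict(list)
    for line in lines:
        match = LINE_RE.search(line.strip())
        if match is None:
            continue  # line does not follow the assumed format
        by_pid[int(match.group("pid"))].append(
            (int(match.group("usec")), match.group("request"))
        )
    return {pid: sorted(reqs, reverse=True) for pid, reqs in by_pid.items()}


# Hypothetical sample lines, not taken from the real server.
SAMPLE = [
    '127.0.0.1 - - [01/Jan/2011:07:10:38 +0000] "GET /node/1 HTTP/1.1" 200 5120 17595 17500000',
    '127.0.0.1 - - [01/Jan/2011:07:10:39 +0000] "GET /misc/a.css HTTP/1.1" 200 310 17595 4000',
    '127.0.0.1 - - [01/Jan/2011:07:10:40 +0000] "GET /node/2 HTTP/1.1" 200 4800 17596 10860000',
]

if __name__ == "__main__":
    for pid, requests in slowest_requests_by_pid(SAMPLE).items():
        usec, request = requests[0]
        print(pid, usec, request)
```

Once a PID shows up at the top of "top", looking it up in the result shows which requests that child spent the most time on.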

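Answer 2

[notice] child pid 22782 exit signal Segmentation fault (11)

Something is definitely wrong here. You should add ulimit -c unlimited near the top of /etc/init.d/httpd so that you get a core dump the next time it segfaults. mod_jk may be the source of the problem, since the log contains a mod_jk-related error.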

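Answer 3

I see mod_perl in the list. Is this site an application written in Perl? If so, poorly written Perl code is the source of the problem.

The same comment applies to PHP. PHP applications are not known for performance, and CMS applications are notorious for heavy resource consumption. If you are a hosting provider, it may be best to ban this CMS package or charge more to cover the extra resources.

However, if you run this CMS for your own use, then since it is open source you should post another question on StackOverflow, naming the package and asking how to track down and fix the poorly written code.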

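Answer 4

I have not seen the segmentation fault again, but I still see high CPU usage from httpd. I was able to run strace against an httpd process that was consuming CPU, and got the following results: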

    # strace -c -p 28964
    Process 28964 attached - interrupt to quit
    ^CProcess 28964 detached
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     88.94    0.006093           0     98299      4562 lstat
      3.01    0.000206           0      2740           getcwd
      2.28    0.000156           0      2158         2 read
      2.26    0.000155           0       541        37 open
      1.68    0.000115           0      1321      1321 readlink
      1.52    0.000104           0      1678       822 access
      0.32    0.000022           0       502           fstat
      0.00    0.000000           0        25           write
      0.00    0.000000           0       507           close
      0.00    0.000000           0       547       478 stat
      0.00    0.000000           0        23           poll
      0.00    0.000000           0         2           rt_sigaction
      0.00    0.000000           0         2           rt_sigprocmask
      0.00    0.000000           0         2           writev
      0.00    0.000000           0         3           setitimer
      0.00    0.000000           0         1           sendfile
 ...
    ------ ----------- ----------- --------- --------- ----------------
    100.00    0.006851                108381      7224 total
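
As a rough aid for reading a summary like the one above, the following sketch parses strace -c output into per-syscall call and error counts. The function name is hypothetical, and the embedded sample is just the first rows of the table above.

```python
def parse_strace_summary(text):
    """Return {syscall: (calls, errors)} from `strace -c` summary text."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have 5 fields (no errors column) or 6 fields (with errors).
        if len(parts) not in (5, 6):
            continue
        if not parts[0].replace(".", "", 1).isdigit():
            continue  # header or separator line
        name = parts[-1]
        if name == "total":
            continue
        calls = int(parts[3])
        errors = int(parts[4]) if len(parts) == 6 else 0
        rows[name] = (calls, errors)
    return rows


SAMPLE = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 88.94    0.006093           0     98299      4562 lstat
  3.01    0.000206           0      2740           getcwd
  2.28    0.000156           0      2158         2 read
------ ----------- ----------- --------- --------- ----------------
100.00    0.006851                108381      7224 total
"""

if __name__ == "__main__":
    for name, (calls, errors) in parse_strace_summary(SAMPLE).items():
        print(name, calls, errors, "%.1f%% failed" % (100.0 * errors / calls))
```

In this trace lstat accounts for 98299 of the 108381 calls, so the failing lstat calls are the natural place to look first.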

The 4562 lstat errors are all of the same type and show up in the trace file as follows:

# strace -f -t -o /var/log/strace.output -p 28964

strace.output

28964 07:10:38 lstat("/var", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
28964 07:10:38 lstat("/var/www", {st_mode=S_IFDIR|0755, st_size=94, ...}) = 0
28964 07:10:38 lstat("/var/www/vhosts", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
28964 07:10:38 lstat("/var/www/vhosts/example.com", {st_mode=S_IFDIR|0777, st_size=4096, ...}) = 0
28964 07:10:38 lstat("/var/www/vhosts/example.com/httpdocs", {st_mode=S_IFDIR|0777, st_size=4096, ...}) = 0
28964 07:10:38 lstat("/var/www/vhosts/example.com/httpdocs/sites", {st_mode=S_IFDIR|0755, st_size=30, ...}) = 0
28964 07:10:38 lstat("/var/www/vhosts/example.com/httpdocs/sites/all", {st_mode=S_IFDIR|0755, st_size=66, ...}) = 0
28964 07:10:38 lstat("/var/www/vhosts/example.com/httpdocs/sites/all/modules", {st_mode=S_IFDIR|0755, st_size=12288, ...}) = 0
28964 07:10:38 lstat("/var/www/vhosts/example.com/httpdocs/sites/all/modules/views", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
28964 07:10:38 lstat("/var/www/vhosts/example.com/httpdocs/sites/all/modules/views/includes", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
28964 07:10:38 lstat("/var/www/vhosts/example.com/httpdocs/sites/all/modules/views/includes/sites", 0x7fff1e627370) = -1 ENOENT (No such file or directory)

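The folders listed above are all inside this website's directory and are part of the Drupal CMS. However, the last one listed,

/var/www/vhosts/example.com/httpdocs/sites/all/modules/views/includes/sites

does not exist. The path should actually be

/var/www/vhosts/example.com/httpdocs/sites

which does exist. It looks like lstat is being called on a directory that does not exist:

-1 ENOENT (No such file or directory)

What is the best way to troubleshoot this and find the root cause of the missing-directory errors?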
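
One way to start narrowing this down is to count which non-existent paths lstat is asked about most often. This is a minimal sketch, assuming the trace was written with strace -f -t -o as shown earlier, so each line begins with a PID and a timestamp. The function name is hypothetical and the sample lines are copied from the trace excerpt.

```python
import re
from collections import Counter

# Matches lines such as:
# 28964 07:10:38 lstat("/some/path", 0x7fff...) = -1 ENOENT (No such file or directory)
ENOENT_LSTAT = re.compile(
    r'^\d+\s+\S+\s+lstat\("(?P<path>[^"]+)",.*=\s+-1 ENOENT'
)


def missing_lstat_paths(trace_lines):
    """Count the paths for which lstat() failed with ENOENT."""
    counts = Counter()
    for line in trace_lines:
        match = ENOENT_LSTAT.match(line)
        if match:
            counts[match.group("path")] += 1
    return counts


SAMPLE = [
    '28964 07:10:38 lstat("/var/www/vhosts/example.com/httpdocs/sites/all/modules/views/includes", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0',
    '28964 07:10:38 lstat("/var/www/vhosts/example.com/httpdocs/sites/all/modules/views/includes/sites", 0x7fff1e627370) = -1 ENOENT (No such file or directory)',
]

if __name__ == "__main__":
    for path, count in missing_lstat_paths(SAMPLE).most_common(10):
        print(count, path)
```

To run it on the real trace, pass the open file instead of SAMPLE, for example missing_lstat_paths(open("/var/log/strace.output")). The paths that recur most often may hint at which module is resolving a relative path against the wrong base directory, although that is only a guess until the code issuing those lookups is found.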
