High IO with php-fpm

I have recently been having trouble with php-fpm. I noticed that the IO wait of every php-fpm process is very high (99.9%), and that is what is causing the load to spike.

top command

top - 06:28:53 up 8 days, 21:05,  2 users,  load average: 179.61, 82.23, 70.63
Tasks: 913 total,  11 running, 901 sleeping,   0 stopped,   1 zombie
Cpu(s):  9.7%us,  1.7%sy,  0.0%ni, 31.4%id, 56.0%wa,  0.0%hi,  1.1%si,  0.0%st
Mem:  16296824k total,  9676012k used,  6620812k free,   242004k buffers
Swap:  8159228k total,       16k used,  8159212k free,  6596628k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
50283 nginx     20   0  439m  11m 6108 S 22.8  0.1   0:01.10 php-fpm
37536 nginx     20   0  440m  12m 6432 S  5.9  0.1   0:05.32 php-fpm
44007 nginx     20   0  439m  12m 6428 D  5.9  0.1   0:02.50 php-fpm
47827 nginx     20   0  442m  13m 6320 R  5.6  0.1   0:01.28 php-fpm
49263 nginx     20   0  439m  11m 6176 S  5.3  0.1   0:00.53 php-fpm
40709 nginx     20   0  440m  12m 6400 S  5.0  0.1   0:03.34 php-fpm
42967 nginx     20   0  439m  12m 6324 S  5.0  0.1   0:02.43 php-fpm
48703 nginx     20   0  439m  11m 5992 S  4.6  0.1   0:00.56 php-fpm
21532 nginx     20   0  121m  19m 2936 S  4.0  0.1   0:59.10 nginx
50684 nginx     20   0  439m  12m 6184 S  4.0  0.1   0:00.30 php-fpm
44081 nginx     20   0  439m  11m 6284 S  3.6  0.1   0:02.29 php-fpm
48760 nginx     20   0  440m  12m 6372 S  3.6  0.1   0:01.34 php-fpm
38657 nginx     20   0  440m  12m 6848 D  3.3  0.1   0:03.48 php-fpm
49899 nginx     20   0  439m  11m 6040 S  3.3  0.1   0:00.63 php-fpm

iotop command

488 be/4 nginx       0.00 B/s    0.00 B/s  0.00 % 99.99 % php-fpm: pool www
47360 be/4 nginx       0.00 B/s    3.66 K/s  0.00 % 99.99 % php-fpm: pool www
51005 be/4 nginx       0.00 B/s    3.66 K/s  0.00 % 99.99 % php-fpm: pool www
56126 be/4 nginx       0.00 B/s    3.66 K/s  0.00 % 82.48 % php-fpm: pool www
48028 be/4 nginx       0.00 B/s    3.66 K/s  0.00 % 54.14 % php-fpm: pool www
54876 be/4 nginx       0.00 B/s    0.00 B/s  0.00 % 48.00 % php-fpm: pool www
47651 be/4 nginx       0.00 B/s    3.66 K/s  0.00 % 37.67 % php-fpm: pool www
  839 be/4 root        0.00 B/s    0.00 B/s  0.00 % 37.30 % [kmirrord]
47811 be/4 nginx       0.00 B/s    7.31 K/s  0.00 % 36.73 % php-fpm: pool www
48014 be/4 nginx       0.00 B/s    3.66 K/s  0.00 % 35.81 % php-fpm: pool www
48050 be/4 nginx       0.00 B/s    0.00 B/s  0.00 % 34.42 % php-fpm: pool www
47987 be/4 nginx       0.00 B/s    3.66 K/s  0.00 % 32.06 % php-fpm: pool www
47468 be/4 nginx       0.00 B/s    3.66 K/s  0.00 % 27.39 % php-fpm: pool www

/etc/sysctl.conf settings

net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_synack_retries = 2
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.all.log_martians = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.default.secure_redirects = 0
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv6.conf.default.router_solicitations = 0
net.ipv6.conf.default.accept_ra_rtr_pref = 0
net.ipv6.conf.default.accept_ra_pinfo = 0
net.ipv6.conf.default.accept_ra_defrtr = 0
net.ipv6.conf.default.autoconf = 0
net.ipv6.conf.default.dad_transmits = 0
net.ipv6.conf.default.max_addresses = 1
kernel.exec-shield = 1
kernel.randomize_va_space = 1
fs.file-max = 65535
kernel.pid_max = 65536
net.ipv4.ip_local_port_range = 2000 65000

/etc/php-fpm.conf

include=/etc/php-fpm.d/*.conf
[global]
pid = /var/run/php-fpm/php-fpm.pid
error_log = /var/log/php-fpm/error.log
log_level = warning
emergency_restart_threshold = 10
emergency_restart_interval = 1m
process_control_timeout = 10
daemonize = yes

/etc/php.d/www.conf

[www]
listen = /tmp/php5-fpm.sock
listen.allowed_clients = 127.0.0.1
listen.mode = 0666
user = nginx
group = nginx
pm = static
pm.max_children = 800
pm.start_servers = 225
pm.min_spare_servers = 150
pm.max_spare_servers = 800
pm.max_requests = 5000
request_terminate_timeout = 300
request_slowlog_timeout = 120s
slowlog = /var/log/php-fpm/www-slow.log
php_admin_value[error_log] = /var/log/php-fpm/www-error.log
php_admin_flag[log_errors] = on
php_value[session.save_handler] = files
php_value[session.save_path]    = /var/lib/php/session
php_value[soap.wsdl_cache_dir]  = /var/lib/php/wsdlcache

I tried many variations of the following settings to fix this, without success:

pm = static ; was dynamic before
pm.max_children = 800
pm.start_servers = 225
pm.min_spare_servers = 150
pm.max_spare_servers = 800
pm.max_requests = 5000 ; was 500 before

Each php-fpm process uses about 10-15 MB of RAM, and I have 16 GB of RAM. I used the following command to check how much memory each process uses on average:

 ps --no-headers -o "rss,cmd" -C php-fpm | awk '{ sum+=$1 } END { printf ("%d%s\n", sum/NR/1024,"M") }'
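As a rough sanity check on those numbers, the worst-case memory of the pool is simply the average per-child RSS multiplied by pm.max_children. The sketch below assumes the upper figure of 15 MB per child quoted above; pool_mem_mb is a hypothetical helper name, not part of any tool.

```shell
# pool_mem_mb AVG_MB CHILDREN
# Hypothetical helper: worst-case resident memory of a php-fpm pool, in MB.
pool_mem_mb() {
  echo $(( $1 * $2 ))
}

# 15 MB per child (upper bound quoted above) x pm.max_children = 800
pool_mem_mb 15 800   # prints 12000
```

That is roughly 12 GB out of the 16 GB installed, so memory is tight but not exhausted, which is consistent with swap being almost unused in the top output above.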

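My site runs on a dedicated server with only Nginx + PHP-FPM. According to Google Analytics, the site gets about 90K page views per day. I disabled the nginx access log and changed log_level from notice to warning.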

Edit

iostat -p -xdmnh 5

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               2.20  2044.20   86.60  118.40     3.31     8.45   117.42     4.11   20.03   1.70  34.76
sdb               0.00  2032.40    0.00  130.20     0.00     8.45   132.87     2.39   18.32   0.92  12.02
dm-0              0.00     0.00   88.80 2162.60     3.31     8.45    10.69   282.23  125.36   0.17  37.16
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-2              0.00     0.00   88.80 2162.60     3.31     8.45    10.69   282.23  125.36   0.17  37.26
dm-3              0.00     0.00   88.80 2162.40     3.31     8.45    10.69   282.24  125.37   0.17  37.28
dm-4              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-5              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

Filesystem:               rMB_nor/s    wMB_nor/s    rMB_dir/s    wMB_dir/s    rMB_svr/s    wMB_svr/s     ops/s    rops/s    wops/s

Edit 2: lsblk command. Thanks to Xavier Lucas for doing his best to help me with this problem.

  NAME                                                               MAJ:MIN RM   SIZE RO TYPE   MOUNTPOINT
    sda                                                                  8:0    0 931.5G  0 disk
    └─ddf1_4c534920202020201000007910009260471147116cfe0677 (dm-0)     253:0    0   931G  0 dmraid
      ├─ddf1_4c534920202020201000007910009260471147116cfe0677p1 (dm-1) 253:1    0   500M  0 part   /boot
      └─ddf1_4c534920202020201000007910009260471147116cfe0677p2 (dm-2) 253:2    0 930.5G  0 part
        ├─vg_485067-lv_root (dm-3)                                     253:3    0    50G  0 lvm    /
        ├─vg_485067-lv_swap (dm-4)                                     253:4    0   7.8G  0 lvm    [SWAP]
        └─vg_485067-lv_home (dm-5)                                     253:5    0 872.7G  0 lvm    /home
    sdb                                                                  8:16   0 931.5G  0 disk
    └─ddf1_4c534920202020201000007910009260471147116cfe0677 (dm-0)     253:0    0   931G  0 dmraid
      ├─ddf1_4c534920202020201000007910009260471147116cfe0677p1 (dm-1) 253:1    0   500M  0 part   /boot
      └─ddf1_4c534920202020201000007910009260471147116cfe0677p2 (dm-2) 253:2    0 930.5G  0 part
        ├─vg_485067-lv_root (dm-3)                                     253:3    0    50G  0 lvm    /
        ├─vg_485067-lv_swap (dm-4)                                     253:4    0   7.8G  0 lvm    [SWAP]
        └─vg_485067-lv_home (dm-5)                                     253:5    0 872.7G  0 lvm    /home

However, when I tried the following, I got no output at all (and no error message).

diff <(lsof +D /home) <(sleep 60 ; lsof +D /home)

Edit 3 (thanks again to Xavier Lucas)

 diff <(lsof +D /) <(sleep 60 ; lsof +D /)

The result is here:

http://pastebin.com/vhCtQynu

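Yes, my site uses a command-line tool, but I upgraded the CPU and RAM so that the server could support it. Besides, the command-line tool does not download anything, so I don't understand why IO is so high.

The server is a Dual Xeon E5-2620 with 16 GB of RAM on a dedicated 1 Gbps port.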

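Answer 1

As you asked, here is how to diff the file activity on a given mount point.

First, use the lsblk command to get the mount points of the devices that iostat reports.

This gives something like: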

NAME                       MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                          8:0    0   16G  0 disk 
├─sda1                       8:1    0  200M  0 part /boot
└─sda2                       8:2    0 15.8G  0 part 
  ├─vg00-root (dm-0)       253:0    0    1G  0 lvm  /
  ├─vg00-swap (dm-1)       253:1    0    1G  0 lvm  [SWAP]
  ├─vg00-usr (dm-2)        253:2    0    5G  0 lvm  /usr
  ├─vg00-var (dm-3)        253:3    0    2G  0 lvm  /var
  ├─vg00-log (dm-4)        253:4    0    1G  0 lvm  /var/log
  ├─vg00-tmp (dm-5)        253:5    0    2G  0 lvm  /tmp
  ├─vg00-home (dm-6)       253:6    0    1G  0 lvm  /home
  ├─vg00-rpm (dm-7)        253:7    0    1G  0 lvm  /var/lib/rpm

Then, while IO activity is high, run:

diff <(lsof +D <mountpoint>) <(sleep <delay_seconds> ; lsof +D <mountpoint>)

For example, for my dm-2 mount point, with a 60-second interval:

diff <(lsof +D /usr) <(sleep 60 ; lsof +D /usr)

This outputs the differences between the two file lists. Check the SIZE/OFF column to see which files are growing.
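If reading the raw diff gets tedious, the comparison of sizes can be scripted. The following is only a sketch: it assumes two snapshots taken with lsof's machine-readable output (-F with the p, f, s and n fields), and it assumes the n (name) field is the last field of each file entry. grown_files is a hypothetical helper name.

```shell
# grown_files BEFORE AFTER
# Hypothetical helper: compares two `lsof -F pfsn` snapshots and prints
# every open file whose reported size grew between them.
grown_files() {
  awk '
    /^p/ { pid = substr($0, 2); next }             # new process entry
    /^f/ { fd = substr($0, 2); size = ""; next }   # new file entry
    /^s/ { size = substr($0, 2); next }            # file size in bytes
    /^n/ {
      name = substr($0, 2)
      if (size == "") next                         # no size reported
      key = pid SUBSEP fd SUBSEP name
      if (FNR == NR) {
        before[key] = size                         # first snapshot
      } else if (key in before && size + 0 > before[key] + 0) {
        printf "%s\tpid=%s\tfd=%s\t+%d bytes\n", name, pid, fd, size - before[key]
      }
    }
  ' "$1" "$2"
}

# Usage during high IO (paths and delay are examples):
#   lsof -F pfsn +D /usr > /tmp/before.txt
#   sleep 60
#   lsof -F pfsn +D /usr > /tmp/after.txt
#   grown_files /tmp/before.txt /tmp/after.txt
```

Files that were opened or closed between the two snapshots are ignored here, so the plain diff above is still the better tool for spotting those.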

Answer 2

Your problem may be caused by a high hit rate on php5-fpm.sock. You can test this by changing how nginx connects to php-fpm. First, in your www.conf, comment out

listen = /tmp/php5-fpm.sock

and add

listen = 127.0.0.1:9000

In nginx, use

fastcgi_pass 127.0.0.1:9000;

instead of

unix:/tmp/php5-fpm.sock
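The php-fpm side of that change could be scripted roughly as follows. This is a sketch, not a tested procedure: switch_listen_to_tcp is a hypothetical helper name, the pool file path is whatever your installation uses, and both php-fpm and nginx still need to be reloaded afterwards for the change to take effect.

```shell
# switch_listen_to_tcp POOL_CONF
# Hypothetical helper: comments out the unix-socket listen line and adds
# a TCP listen line right after it. A copy of the original file is kept
# as POOL_CONF.bak.
switch_listen_to_tcp() {
  conf=$1
  tmp=$(mktemp) || return 1
  awk '
    /^listen[ \t]*=[ \t]*\/tmp\/php5-fpm\.sock[ \t]*$/ {
      print ";" $0                      # keep the old line, commented out
      print "listen = 127.0.0.1:9000"   # new TCP listener
      next
    }
    { print }
  ' "$conf" > "$tmp" && cp "$conf" "$conf.bak" && cat "$tmp" > "$conf"
  status=$?
  rm -f "$tmp"
  return $status
}

# Usage (path is an example):
#   switch_listen_to_tcp /path/to/www.conf
```

The existing listen.allowed_clients = 127.0.0.1 line in the pool file only applies to TCP listeners, so it starts to matter once this change is made.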

Hope this helps.
