I recently ran into a problem with php-fpm: every php-fpm process shows very high IO (99.9%), and that is what makes the CPU load spike.
top output
top - 06:28:53 up 8 days, 21:05, 2 users, load average: 179.61, 82.23, 70.63
Tasks: 913 total, 11 running, 901 sleeping, 0 stopped, 1 zombie
Cpu(s): 9.7%us, 1.7%sy, 0.0%ni, 31.4%id, 56.0%wa, 0.0%hi, 1.1%si, 0.0%st
Mem: 16296824k total, 9676012k used, 6620812k free, 242004k buffers
Swap: 8159228k total, 16k used, 8159212k free, 6596628k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
50283 nginx 20 0 439m 11m 6108 S 22.8 0.1 0:01.10 php-fpm
37536 nginx 20 0 440m 12m 6432 S 5.9 0.1 0:05.32 php-fpm
44007 nginx 20 0 439m 12m 6428 D 5.9 0.1 0:02.50 php-fpm
47827 nginx 20 0 442m 13m 6320 R 5.6 0.1 0:01.28 php-fpm
49263 nginx 20 0 439m 11m 6176 S 5.3 0.1 0:00.53 php-fpm
40709 nginx 20 0 440m 12m 6400 S 5.0 0.1 0:03.34 php-fpm
42967 nginx 20 0 439m 12m 6324 S 5.0 0.1 0:02.43 php-fpm
48703 nginx 20 0 439m 11m 5992 S 4.6 0.1 0:00.56 php-fpm
21532 nginx 20 0 121m 19m 2936 S 4.0 0.1 0:59.10 nginx
50684 nginx 20 0 439m 12m 6184 S 4.0 0.1 0:00.30 php-fpm
44081 nginx 20 0 439m 11m 6284 S 3.6 0.1 0:02.29 php-fpm
48760 nginx 20 0 440m 12m 6372 S 3.6 0.1 0:01.34 php-fpm
38657 nginx 20 0 440m 12m 6848 D 3.3 0.1 0:03.48 php-fpm
49899 nginx 20 0 439m 11m 6040 S 3.3 0.1 0:00.63 php-fpm
iotop output
488 be/4 nginx 0.00 B/s 0.00 B/s 0.00 % 99.99 % php-fpm: pool www
47360 be/4 nginx 0.00 B/s 3.66 K/s 0.00 % 99.99 % php-fpm: pool www
51005 be/4 nginx 0.00 B/s 3.66 K/s 0.00 % 99.99 % php-fpm: pool www
56126 be/4 nginx 0.00 B/s 3.66 K/s 0.00 % 82.48 % php-fpm: pool www
48028 be/4 nginx 0.00 B/s 3.66 K/s 0.00 % 54.14 % php-fpm: pool www
54876 be/4 nginx 0.00 B/s 0.00 B/s 0.00 % 48.00 % php-fpm: pool www
47651 be/4 nginx 0.00 B/s 3.66 K/s 0.00 % 37.67 % php-fpm: pool www
839 be/4 root 0.00 B/s 0.00 B/s 0.00 % 37.30 % [kmirrord]
47811 be/4 nginx 0.00 B/s 7.31 K/s 0.00 % 36.73 % php-fpm: pool www
48014 be/4 nginx 0.00 B/s 3.66 K/s 0.00 % 35.81 % php-fpm: pool www
48050 be/4 nginx 0.00 B/s 0.00 B/s 0.00 % 34.42 % php-fpm: pool www
47987 be/4 nginx 0.00 B/s 3.66 K/s 0.00 % 32.06 % php-fpm: pool www
47468 be/4 nginx 0.00 B/s 3.66 K/s 0.00 % 27.39 % php-fpm: pool www
/etc/sysctl.conf settings
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_synack_retries = 2
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.all.log_martians = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.default.secure_redirects = 0
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv6.conf.default.router_solicitations = 0
net.ipv6.conf.default.accept_ra_rtr_pref = 0
net.ipv6.conf.default.accept_ra_pinfo = 0
net.ipv6.conf.default.accept_ra_defrtr = 0
net.ipv6.conf.default.autoconf = 0
net.ipv6.conf.default.dad_transmits = 0
net.ipv6.conf.default.max_addresses = 1
kernel.exec-shield = 1
kernel.randomize_va_space = 1
fs.file-max = 65535
kernel.pid_max = 65536
net.ipv4.ip_local_port_range = 2000 65000
/etc/php-fpm.conf
include=/etc/php-fpm.d/*.conf
[global]
pid = /var/run/php-fpm/php-fpm.pid
error_log = /var/log/php-fpm/error.log
log_level = warning
emergency_restart_threshold = 10
emergency_restart_interval = 1m
process_control_timeout = 10
daemonize = yes
/etc/php.d/www.conf
[www]
listen = /tmp/php5-fpm.sock
listen.allowed_clients = 127.0.0.1
listen.mode = 0666
user = nginx
group = nginx
pm = static
pm.max_children = 800
pm.start_servers = 225
pm.min_spare_servers = 150
pm.max_spare_servers = 800
pm.max_requests = 5000
request_terminate_timeout = 300
request_slowlog_timeout = 120s
slowlog = /var/log/php-fpm/www-slow.log
php_admin_value[error_log] = /var/log/php-fpm/www-error.log
php_admin_flag[log_errors] = on
php_value[session.save_handler] = files
php_value[session.save_path] = /var/lib/php/session
php_value[soap.wsdl_cache_dir] = /var/lib/php/wsdlcache
I tried changing the following settings to fix the problem, without success:
pm = static //was dynamic before
pm.max_children = 800
pm.start_servers = 225
pm.min_spare_servers = 150
pm.max_spare_servers = 800
pm.max_requests = 5000 // was 500 before
Each php-fpm process uses about 10-15 MB of RAM, and I have 16 GB of RAM. I used the following command to check the average memory used per process:
ps --no-headers -o "rss,cmd" -C php-fpm | awk '{ sum+=$1 } END { printf ("%d%s\n", sum/NR/1024,"M") }'
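As a rough cross-check of these figures, the memory budget can be worked out with a little shell arithmetic. This is only a sketch: the helper name `estimate_max_children` and the 4 GB reserve for the OS, nginx, and page cache are assumptions for illustration, not values taken from the configuration above.

```shell
#!/bin/sh
# Hypothetical helper: how many php-fpm workers fit in a memory budget.
# Arguments (integers, in MB): total RAM, RAM reserved for everything else,
# and the average resident size of one worker.
estimate_max_children() {
    total_mb=$1
    reserved_mb=$2
    per_child_mb=$3
    echo $(( (total_mb - reserved_mb) / per_child_mb ))
}

# 16 GB total, an assumed 4 GB reserve, and the upper end of the
# measured 10-15 MB per worker.
estimate_max_children 16384 4096 15   # prints 819
```

By this estimate 800 workers fit in memory, which matches the top output showing no swap in use. It says nothing about whether the disks can serve that many workers at once.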
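My site runs on a dedicated server that only runs Nginx + PHP-FPM. According to Google Analytics, the site gets about 90K page views per day. I disabled the nginx access log and changed log_level from notice to warning.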
Edit
iostat -p -xdmnh 5
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 2.20 2044.20 86.60 118.40 3.31 8.45 117.42 4.11 20.03 1.70 34.76
sdb 0.00 2032.40 0.00 130.20 0.00 8.45 132.87 2.39 18.32 0.92 12.02
dm-0 0.00 0.00 88.80 2162.60 3.31 8.45 10.69 282.23 125.36 0.17 37.16
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-2 0.00 0.00 88.80 2162.60 3.31 8.45 10.69 282.23 125.36 0.17 37.26
dm-3 0.00 0.00 88.80 2162.40 3.31 8.45 10.69 282.24 125.37 0.17 37.28
dm-4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Filesystem: rMB_nor/s wMB_nor/s rMB_dir/s wMB_dir/s rMB_svr/s wMB_svr/s ops/s rops/s wops/s
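To pick out the slow devices from output like this without reading every column, a small filter can help. The sketch below assumes the older sysstat layout shown here, with a single `await` column; newer sysstat versions split it into `r_await` and `w_await`, in which case this filter prints nothing. The function name `high_await` and the threshold are illustrative.

```shell
#!/bin/sh
# Print "device await" for every device whose await (ms) exceeds the limit.
# Reads `iostat -x` style output on stdin and takes the column position
# from the header line instead of hard-coding it.
high_await() {
    awk -v limit="$1" '
        $1 == "Device:" || $1 == "Device" {
            col = 0
            for (i = 1; i <= NF; i++) if ($i == "await") col = i
            next
        }
        # Non-numeric fields (other header lines) evaluate to 0 and are skipped.
        col && NF >= col && $col + 0 > limit + 0 { print $1, $col }
    '
}
```

A possible invocation is `iostat -dxm 5 2 | high_await 100`. Applied to the sample above with a limit of 100 ms, it would list dm-0, dm-2, and dm-3 (await around 125 ms) but not sda or sdb (around 20 ms).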
Edit 2: lsblk output (thanks to Xavier Lucas for trying to help me solve this)
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 931.5G 0 disk
└─ddf1_4c534920202020201000007910009260471147116cfe0677 (dm-0) 253:0 0 931G 0 dmraid
├─ddf1_4c534920202020201000007910009260471147116cfe0677p1 (dm-1) 253:1 0 500M 0 part /boot
└─ddf1_4c534920202020201000007910009260471147116cfe0677p2 (dm-2) 253:2 0 930.5G 0 part
├─vg_485067-lv_root (dm-3) 253:3 0 50G 0 lvm /
├─vg_485067-lv_swap (dm-4) 253:4 0 7.8G 0 lvm [SWAP]
└─vg_485067-lv_home (dm-5) 253:5 0 872.7G 0 lvm /home
sdb 8:16 0 931.5G 0 disk
└─ddf1_4c534920202020201000007910009260471147116cfe0677 (dm-0) 253:0 0 931G 0 dmraid
├─ddf1_4c534920202020201000007910009260471147116cfe0677p1 (dm-1) 253:1 0 500M 0 part /boot
└─ddf1_4c534920202020201000007910009260471147116cfe0677p2 (dm-2) 253:2 0 930.5G 0 part
├─vg_485067-lv_root (dm-3) 253:3 0 50G 0 lvm /
├─vg_485067-lv_swap (dm-4) 253:4 0 7.8G 0 lvm [SWAP]
└─vg_485067-lv_home (dm-5) 253:5 0 872.7G 0 lvm /home
However, when I try the following, I get no output at all (and no error message either).
diff <(lsof +D /home) <(sleep 60 ; lsof +D /home)
Edit 3 (thanks again, Xavier Lucas)
diff <(lsof +D /) <(sleep 60 ; lsof +D /)
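The result is as follows
Yes, my site uses a command-line tool, but I upgraded the CPU and RAM so the server could handle it. Also, the command-line tool does not download anything, so I don't understand why the IO is so high.
The server is a Dual Xeon E5-2620 with 16 GB of RAM on a 1 Gbps dedicated port.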
Answer 1
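As requested, here is how you can diff the file activity on a given mount point.
First, use the lsblk command to find the mount points of the devices reported by iostat.
The output looks like this: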
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 16G 0 disk
├─sda1 8:1 0 200M 0 part /boot
└─sda2 8:2 0 15.8G 0 part
├─vg00-root (dm-0) 253:0 0 1G 0 lvm /
├─vg00-swap (dm-1) 253:1 0 1G 0 lvm [SWAP]
├─vg00-usr (dm-2) 253:2 0 5G 0 lvm /usr
├─vg00-var (dm-3) 253:3 0 2G 0 lvm /var
├─vg00-log (dm-4) 253:4 0 1G 0 lvm /var/log
├─vg00-tmp (dm-5) 253:5 0 2G 0 lvm /tmp
├─vg00-home (dm-6) 253:6 0 1G 0 lvm /home
├─vg00-rpm (dm-7) 253:7 0 1G 0 lvm /var/lib/rpm
Then, while IO activity is high, run:
diff <(lsof +D <mountpoint>) <(sleep <delay_seconds> ; lsof +D <mountpoint>)
For example, for my dm-2 mount point with a 60-second interval:
diff <(lsof +D /usr) <(sleep 60 ; lsof +D /usr)
This prints the differences between the two file lists; check the SIZE/OFF column to see which files are growing.
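If the diff is too noisy to read by eye, the SIZE/OFF comparison can be done mechanically on two saved snapshots. This is a minimal sketch, not part of lsof itself: the function name `lsof_growth` is made up, and it assumes the default lsof column layout with a header line, a non-empty SIZE/OFF field on every row, and file names without whitespace. Hexadecimal offsets (`0x...`) are not handled.

```shell
#!/bin/sh
# Compare two saved lsof snapshots and print the entries whose
# SIZE/OFF value grew between the first and the second.
lsof_growth() {
    awk '
        FNR == 1 {
            # Locate the SIZE/OFF column from each snapshot header.
            col = 0
            for (i = 1; i <= NF; i++) if ($i == "SIZE/OFF") col = i
            next
        }
        !col { next }
        {
            key = $2 " " $4 " " $NF     # PID, FD, NAME
            val = $col
            sub(/^0t/, "", val)         # lsof prefixes decimal offsets with 0t
        }
        NR == FNR { before[key] = val + 0; next }   # rows of the first file
        (key in before) && val + 0 > before[key] {
            print key, before[key], "->", val + 0
        }
    ' "$1" "$2"
}
```

One way to use it: `lsof +D /usr > before.txt; sleep 60; lsof +D /usr > after.txt; lsof_growth before.txt after.txt`. Each output line shows the PID, the file descriptor, the file name, and the old and new values.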
Answer 2
Your problem may be caused by a high hit rate on php5-fpm.sock. You can test this by changing how nginx connects to php-fpm. First, in your www.conf, comment out
listen = /tmp/php5-fpm.sock
and add
listen = 127.0.0.1:9000
Then, in nginx, use
fastcgi_pass 127.0.0.1:9000;
instead of
unix:/tmp/php5-fpm.sock
Hope this helps.