Which php5-fpm settings suit a high number of concurrent connections with nginx?


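Please help me tune my php5-fpm and nginx configuration.

The problem is that my php5-fpm log keeps reporting slow scripts and terminating child processes.

Dedicated server, quad-core Xeon, 32 GB RAM. It runs one PHP application/site.

The PHP application: in short, it is a search engine, and each search triggers curl requests. Page load time per search is typically 2-3 seconds.

Here is what I think is happening: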

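I have 750 concurrent PHP users performing searches. Because of RAM limits I can only set pm.max_children = 400. I am assuming 50 MB per user (child process), so 400 children = 20 GB, and I am assuming each user = 1 child process. So pm.max_children is not enough to cover 750 active PHP users whose searches each take 3 seconds.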
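As a rough check of the arithmetic above, here is a minimal Python sketch. The 50 MB per child is my own estimate rather than a measurement, and the function names are made up for illustration.

```python
# Rough sizing sketch: how much RAM a full PHP-FPM pool would need.
# The 50 MB per-child figure is an assumption from the question, not a measurement.

def pool_memory_mb(children: int, per_child_mb: int) -> int:
    """Total memory a full pool would use under the per-child estimate."""
    return children * per_child_mb


def max_children_for_budget(budget_mb: int, per_child_mb: int) -> int:
    """How many children fit into a given RAM budget (rounded down)."""
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    return budget_mb // per_child_mb


if __name__ == "__main__":
    print(pool_memory_mb(400, 50))              # 400 children at 50 MB each
    print(pool_memory_mb(750, 50))              # one child per concurrent user
    print(max_children_for_budget(20_000, 50))  # children that fit in 20 GB
```

Under these assumptions, one child per user (750 children) would need 37,500 MB, which is more than the 32,151 MB the box reports in total.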

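So I think I am seeing users queue up, because I see 3 seconds turn into 4-7 seconds. As users queue, I think the scripts get slower, which triggers the error log messages, and the php5-fpm process manager kills the children?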
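To illustrate the queueing theory above, here is an idealized Python sketch. It assumes all requests arrive at the same instant, every search takes exactly the same time, and the workers do not slow each other down. None of that is measured, so treat it only as a plausibility check.

```python
import math

# Idealized "wave" model: `workers` requests run at once, the rest wait for
# the next wave. All parameters are assumptions taken from the question.

def worst_case_latency(concurrent: int, workers: int, service_s: float) -> float:
    """Latency of the last request when requests are served in waves."""
    waves = math.ceil(concurrent / workers)
    return waves * service_s


def mean_latency(concurrent: int, workers: int, service_s: float) -> float:
    """Average latency across all requests in the wave model."""
    total = 0.0
    for i in range(concurrent):
        wave = i // workers + 1      # 1 for the first batch, 2 for the next...
        total += wave * service_s
    return total / concurrent


def max_throughput(workers: int, service_s: float) -> float:
    """Upper bound on requests per second the pool can sustain."""
    return workers / service_s


if __name__ == "__main__":
    print(worst_case_latency(750, 400, 3.0))
    print(mean_latency(750, 400, 3.0))
    print(max_throughput(400, 3.0))
```

With 750 requests, 400 workers and 3-second searches, this model gives a 6-second worst case and a mean of about 4.4 seconds, which is in the same range as the 4-7 seconds I observe. It does not prove queueing is the cause.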

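That is what I think is happening. Below are my error log output and my nginx and php5-fpm configs.

I would really appreciate any advice on whether I can tune my config, and on whether pm.max_children really should be at least equal to the maximum number of concurrent users, bearing in mind that my PHP searches take about 3 seconds. Do I need more RAM or more servers?

Here is my memory usage, though I restarted nginx only about 30 minutes ago: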

:/var/log# free -m
             total       used       free     shared    buffers     cached
Mem:         32151      26175       5975          0        186      13334
-/+ buffers/cache:      12654      19496
Swap:        32739          5      32734

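php5-fpm: www.conf: process manager set to static

I use static because I assume all children are then available immediately, with no spawn time, and I run only one application on the box.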

;pm = dynamic
pm = static

;pm.max_children = 10
pm.max_children = 400


;pm.start_servers = 4
pm.start_servers = 150


;pm.min_spare_servers = 2
pm.min_spare_servers = 32


;pm.max_spare_servers = 6
pm.max_spare_servers = 64


;pm.max_requests = 500
pm.max_requests = 10000

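Errors in the php5-fpm log

I should clarify that the behaviour I see under high load (750 users online at once) is that both cached and uncached search results start taking longer. For example, cached results take over 1 second and uncached results take 4 to 7 seconds. So as users queue up, I think search times increase and creep up to the point where scripts run too slowly under load. That triggers the notices, and the children get killed.

For example, this is what happened just after a restart: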

[04-Jun-2013 20:11:07] NOTICE: Finishing ...
[04-Jun-2013 20:11:11] NOTICE: exiting, bye-bye!
[04-Jun-2013 20:11:12] NOTICE: fpm is running, pid 17899
[04-Jun-2013 20:11:12] NOTICE: ready to handle connections
[04-Jun-2013 20:27:28] WARNING: [pool www] child 18200, script '/home/site/public_html/index.php' (request: "POST /index.php") executing too slow (10.827363 sec), logging
[04-Jun-2013 20:27:28] WARNING: [pool www] child 18138, script '/home/site/public_html/index.php' (request: "POST /index.php") executing too slow (10.827034 sec), logging
[04-Jun-2013 20:27:28] NOTICE: child 18138 stopped for tracing
[04-Jun-2013 20:27:28] NOTICE: about to trace 18138
[04-Jun-2013 20:27:28] NOTICE: finished trace of 18138
[04-Jun-2013 20:27:28] NOTICE: child 18200 stopped for tracing
[04-Jun-2013 20:27:28] NOTICE: about to trace 18200
[04-Jun-2013 20:27:28] NOTICE: finished trace of 18200
[04-Jun-2013 20:52:52] WARNING: [pool www] child 17948, script '/home/site/public_html/index.php' (request: "GET /index.php") executing too slow (11.724081 sec), logging
[04-Jun-2013 20:52:52] NOTICE: child 17948 stopped for tracing
[04-Jun-2013 20:52:52] NOTICE: about to trace 17948
[04-Jun-2013 20:52:52] ERROR: failed to ptrace(PEEKDATA) pid 17948: Input/output error (5)
[04-Jun-2013 20:52:52] NOTICE: finished trace of 17948
[04-Jun-2013 20:58:22] WARNING: [pool www] child 18287, script '/home/site/public_html/index.php' (request: "POST /index.php") executing too slow (10.701504 sec), logging
[04-Jun-2013 20:58:22] NOTICE: child 18287 stopped for tracing
[04-Jun-2013 20:58:22] NOTICE: about to trace 18287
[04-Jun-2013 20:58:22] NOTICE: finished trace of 18287
[04-Jun-2013 21:19:22] WARNING: [pool www] child 18224, script '/home/site/public_html/index.php' (request: "GET /index.php") executing too slow (10.005466 sec), logging
[04-Jun-2013 21:19:22] WARNING: [pool www] child 18197, script '/home/site/public_html/index.php' (request: "GET /index.php") executing too slow (12.141221 sec), logging
[04-Jun-2013 21:19:22] WARNING: [pool www] child 17946, script '/home/site/public_html/index.php' (request: "GET /index.php") executing too slow (11.107080 sec), logging
[04-Jun-2013 21:19:22] NOTICE: child 17946 stopped for tracing
[04-Jun-2013 21:19:22] NOTICE: about to trace 17946
[04-Jun-2013 21:19:22] NOTICE: finished trace of 17946
[04-Jun-2013 21:19:22] NOTICE: child 18197 stopped for tracing
[04-Jun-2013 21:19:22] NOTICE: about to trace 18197
[04-Jun-2013 21:19:22] NOTICE: finished trace of 18197
[04-Jun-2013 21:19:22] NOTICE: child 18224 stopped for tracing
[04-Jun-2013 21:19:22] NOTICE: about to trace 18224
[04-Jun-2013 21:19:22] NOTICE: finished trace of 18224
[04-Jun-2013 21:19:26] WARNING: [pool www] child 18197, script '/home/site/public_html/index.php' (request: "GET /index.php") execution timed out (15.475021 sec), terminating
[04-Jun-2013 21:19:26] WARNING: [pool www] child 18055, script '/home/site/public_html/index.php' (request: "GET /index.php") executing too slow (12.927407 sec), logging
[04-Jun-2013 21:19:26] NOTICE: child 18055 stopped for tracing
[04-Jun-2013 21:19:26] NOTICE: about to trace 18055
[04-Jun-2013 21:19:26] NOTICE: finished trace of 18055
[04-Jun-2013 21:19:26] WARNING: [pool www] child 18197 exited on signal 15 (SIGTERM) after 4094.193190 seconds from start
[04-Jun-2013 21:19:26] NOTICE: [pool www] child 5137 started
[04-Jun-2013 21:24:49] WARNING: [pool www] child 17918, script '/home/site/public_html/index.php' (request: "GET /index.php") executing too slow (11.367854 sec), logging
[04-Jun-2013 21:24:49] NOTICE: child 17918 stopped for tracing
[04-Jun-2013 21:24:49] NOTICE: about to trace 17918
[04-Jun-2013 21:24:49] NOTICE: finished trace of 17918
[04-Jun-2013 21:24:53] WARNING: [pool www] child 18226, script '/home/site/public_html/index.php' (request: "GET /index.php") executing too slow (10.763667 sec), logging
[04-Jun-2013 21:24:53] WARNING: [pool www] child 18206, script '/home/site/public_html/index.php' (request: "GET /index.php") executing too slow (12.060464 sec), logging
[04-Jun-2013 21:24:53] WARNING: [pool www] child 18073, script '/home/site/public_html/index.php' (request: "GET /index.php") executing too slow (11.846097 sec), logging
[04-Jun-2013 21:24:53] NOTICE: child 18073 stopped for tracing
[04-Jun-2013 21:24:53] NOTICE: about to trace 18073
[04-Jun-2013 21:24:53] NOTICE: finished trace of 18073
[04-Jun-2013 21:24:53] NOTICE: child 18206 stopped for tracing
[04-Jun-2013 21:24:53] NOTICE: about to trace 18206
[04-Jun-2013 21:24:53] NOTICE: finished trace of 18206
[04-Jun-2013 21:24:53] NOTICE: child 18226 stopped for tracing
[04-Jun-2013 21:24:53] NOTICE: about to trace 18226
[04-Jun-2013 21:24:53] NOTICE: finished trace of 18226
[04-Jun-2013 21:24:56] WARNING: [pool www] child 5137, script '/home/site/public_html/index.php' (request: "GET /index.php") executing too slow (12.055624 sec), logging
[04-Jun-2013 21:24:56] WARNING: [pool www] child 18206, script '/home/site/public_html/index.php' (request: "GET /index.php") execution timed out (15.395149 sec), terminating
[04-Jun-2013 21:24:56] WARNING: [pool www] child 17996, script '/home/site/public_html/index.php' (request: "GET /index.php") executing too slow (12.145728 sec), logging
[04-Jun-2013 21:24:56] WARNING: [pool www] child 17918, script '/home/site/public_html/index.php' (request: "GET /index.php") execution timed out (18.036700 sec), terminating
[04-Jun-2013 21:24:56] NOTICE: child 17996 stopped for tracing
[04-Jun-2013 21:24:56] NOTICE: about to trace 17996
[04-Jun-2013 21:24:56] NOTICE: finished trace of 17996
[04-Jun-2013 21:24:56] NOTICE: child 5137 stopped for tracing
[04-Jun-2013 21:24:56] NOTICE: about to trace 5137
[04-Jun-2013 21:24:56] NOTICE: finished trace of 5137
[04-Jun-2013 21:24:56] WARNING: [pool www] child 17918 exited on signal 15 (SIGTERM) after 4424.343036 seconds from start
[04-Jun-2013 21:24:56] NOTICE: [pool www] child 6706 started
[04-Jun-2013 21:24:56] WARNING: [pool www] child 18206 exited on signal 15 (SIGTERM) after 4424.264130 seconds from start
[04-Jun-2013 21:24:56] NOTICE: [pool www] child 6707 started
[04-Jun-2013 21:24:59] WARNING: [pool www] child 17996, script '/home/site/public_html/index.php' (request: "GET /index.php") execution timed out (15.479201 sec), terminating
[04-Jun-2013 21:24:59] WARNING: [pool www] child 17996 exited on signal 15 (SIGTERM) after 4427.655572 seconds from start
[04-Jun-2013 21:24:59] NOTICE: [pool www] child 6708 started

Here is my nginx config

user www-data;
worker_processes 4;
pid /var/run/nginx.pid;
worker_rlimit_nofile 20000;

events {
    #worker_connections 768;
    #worker_connections 19000;

    #multi_accept on;
    use epoll;
    #worker_connections 10240;  
    worker_connections 4096;
}


http {

    ##
    # Basic Settings
    ##

    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    #keepalive_timeout 65;
    #keepalive_timeout 5;

#added
    client_body_timeout   15;
    client_header_timeout 15;
    keepalive_timeout     15;
    send_timeout          15;

Site config file

proxy_buffer_size   128k;
    proxy_buffers   4 256k;
    proxy_busy_buffers_size   256k;

    fastcgi_connect_timeout 60;
    fastcgi_send_timeout 180;
    fastcgi_read_timeout 180;
    fastcgi_buffer_size 128k;
    fastcgi_buffers 4 256k;
    #fastcgi_buffers 256 16k; #4096k total
    fastcgi_busy_buffers_size 256k;
    fastcgi_temp_file_write_size 256k;
    fastcgi_intercept_errors on;

php5-fpm is connected via a TCP port.

Thanks

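Answer 1

I think you are probably running too many concurrent PHP processes, but it is hard to know without more information about where the resource bottleneck is. My guess is that you are limited by disk I/O and/or CPU, and that all the parallel PHP processes are competing for those resources and slowing each other down. At some point the overhead of process switching becomes a significant factor, and by running more processes you get less throughput, not more. You may also run out of RAM, or risk doing so, and start swapping, which is very bad. Trust nginx to queue requests, and keep throughput higher for the faster requests by reducing the number of requests executing at the same time.

I usually go for between 5 and 50 PHP processes, and both ends of that range are somewhat exceptional; 10-15 is typical. With a very high-performance disk system and more than the usual 16 or so cores, having more processes may make sense, but that is usually a false economy compared with having a larger number of cheaper servers. In my experience, unless you have a lot of badly written code, there is usually little benefit in running more than 15 PHP processes in parallel on a single server. Where there is a benefit, it is more likely to be stability than throughput, because with fewer processes, pathological long-running requests can pile up and leave no spare process available.

If you have several codebases with separate process pools, you may need a larger total number of processes, but you probably do not want more than 3 to 5 per pool.

You do want a large number of nginx worker connections to handle static files. There is unlikely to be any improvement beyond 4096, and only in exceptional cases will you see a difference between 1000 and 4000. (Unless you are mainly serving static files, which is a completely different scenario, but since you are talking about PHP processes on this box I do not think that is the case here.)

I suspect your timeouts are too long. If nothing is happening, drop the connection and move on to the next one.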

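Answer 2

1) Memory: the first thing I would look at is why your scripts need 50 MB of memory if all they do is a simple search. I assume that if you are handling hundreds of requests per second, you are not actually returning megabytes of data to each user.

There is a bug in the MySQL connector library that makes PHP allocate the maximum possible size for any TEXT or BLOB column, rather than only the memory actually needed. It can be fixed by moving to the MySQLND library, with no code changes.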

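2) Your pm.max_requests = 10000 setting is probably not a great choice. If each request takes 2 seconds, you are telling the process manager to restart each process only after 20,000 seconds of work, or nearly 6 hours. That seems like a very long time, long enough for any memory leak to bring a process to a halt. Setting it back to 500 would still restart each process only about every 17 minutes (1,000 seconds), which would have no effect on performance but would probably be more stable.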
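The recycle interval is simple multiplication; a small Python sketch, assuming a child is continuously busy and every request takes the same time (an idle child would take longer to reach its limit). The function name is made up for illustration.

```python
# Estimate how long a PHP-FPM child lives before pm.max_requests recycles it.
# Assumes the child is continuously busy; seconds_per_request is an estimate.

def recycle_interval_s(max_requests: int, seconds_per_request: float) -> float:
    """Seconds of continuous work before a child is restarted."""
    return max_requests * seconds_per_request


if __name__ == "__main__":
    for max_requests in (10_000, 500):
        seconds = recycle_interval_s(max_requests, 2.0)
        print(f"pm.max_requests={max_requests}: "
              f"{seconds:.0f} s = {seconds / 60:.1f} min = {seconds / 3600:.2f} h")
```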

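3) As Michael said, even if you could allow as many processes as there are user connections, you still need to find out where the bottleneck actually is. Even with hundreds of PHP processes running at once, if they are all waiting for the SQL server to become available, they will just queue up and eventually start timing out.

Unless you can remove the bottleneck, you will need either a rate-limiting mechanism that admits only as many queries as your server setup can handle, or graceful degradation that rejects requests the server cannot currently handle.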
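One common way to do rate limiting is a token bucket. Below is a minimal single-process Python sketch of the idea; the class name and parameters are hypothetical, and a real PHP-FPM deployment has many separate processes, so the state would have to live somewhere shared, or the limit could be enforced in front of PHP (nginx ships a `limit_req` module for this).

```python
import time
from typing import Callable

# Token-bucket sketch: tokens refill at `rate_per_s` up to `capacity`;
# each admitted request spends one token, and requests without a token
# are rejected. Single-process illustration only.

class TokenBucket:
    def __init__(self, rate_per_s: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate_per_s
        self.capacity = capacity
        self.clock = clock            # injectable so the logic can be tested
        self.tokens = capacity        # start with a full bucket
        self.last = clock()

    def allow(self) -> bool:
        """Return True if the request may proceed, False if it should be rejected."""
        now = self.clock()
        # Refill for the time elapsed since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A rejected request could be answered with a short "busy, try again" page, which is the graceful-degradation half of the suggestion.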

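Answer 3

I can't comment yet (not enough reputation), so I'll post an answer: it would help to have your nginx logs.

About your pool.d/www config: if you set pm to static, most of those variables have no effect, and max_children is the main one that affects your setup. (http://php.net/manual/en/install.fpm.configuration.php) Maybe try pm "ondemand".

You should not start with such a high pm.start_servers. You should also lower min_spare_servers.

About your nginx config: http://wiki.nginx.org/CoreModule#worker_processes says "max clients = worker_processes * worker_connections".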
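Plugging the posted values into that formula, as a small Python sketch. The `connections_per_client` parameter is my own addition: nginx counts connections to upstreams (such as the FastCGI backend) against worker_connections as well, so a request passed to PHP may use two connections rather than one.

```python
# "max clients = worker_processes * worker_connections", with an optional
# divisor for how many connections each client effectively consumes.
# connections_per_client is an illustrative assumption, not an nginx setting.

def max_clients(worker_processes: int, worker_connections: int,
                connections_per_client: int = 1) -> int:
    """Upper bound on simultaneous clients nginx can hold open."""
    return (worker_processes * worker_connections) // connections_per_client


if __name__ == "__main__":
    print(max_clients(4, 4096))      # static files: one connection per client
    print(max_clients(4, 4096, 2))   # PHP requests: client + FastCGI upstream
```

Either figure is far above the 750 concurrent users in the question, so worker_connections is unlikely to be the limit here.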

Your "worker_rlimit_nofile" value does not look right to me.

Have a look at http://wiki.nginx.org/CoreModule#worker_cpu_affinity to use all the cores.

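Answer 4

If all else fails... I think you could perhaps handle this in code. You could build a "ticket system" that allows a certain number of searches to run at once and gives the other users an approximate waiting time, something like "your search will start in N seconds".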
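A minimal in-process Python sketch of such a ticket system, just to show the bookkeeping. The class and method names are hypothetical, the wait estimate assumes every search takes the same average time, and with PHP-FPM the counters would need to live in a store shared by all worker processes.

```python
import threading

# Ticket-system sketch: at most `slots` searches run at once; everyone else
# is queued and told roughly how long until their search starts.

class SearchTickets:
    def __init__(self, slots: int, avg_search_s: float) -> None:
        self.slots = slots
        self.avg_search_s = avg_search_s
        self._lock = threading.Lock()
        self._active = 0      # searches currently running
        self._waiting = 0     # users queued behind them

    def request(self) -> float:
        """Return 0.0 if the search can start now, else an estimated wait in seconds."""
        with self._lock:
            if self._active < self.slots:
                self._active += 1
                return 0.0
            self._waiting += 1
            # Queue position expressed in "waves" of `slots` searches (ceiling division).
            waves = -(-self._waiting // self.slots)
            return waves * self.avg_search_s

    def finish(self) -> None:
        """Call when a search completes; the freed slot goes to the next waiter."""
        with self._lock:
            if self._waiting > 0:
                self._waiting -= 1    # a waiter takes over the slot, _active unchanged
            elif self._active > 0:
                self._active -= 1
```

For example, with `SearchTickets(slots=2, avg_search_s=3.0)` the first two callers start immediately and the third is told to expect roughly a 3-second wait.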
