Nginx + php-fpm "504 Gateway Time-out" error with almost zero load (on a test server)

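After 6 hours of debugging, I give up :|

We have an nginx + php-fpm + mysql setup on our LAN hosting almost 100 WordPress sites (created and used by different designers/developers, all of them testing WordPress setups).

We have been using nginx for a long time without any problems.

Today, all of a sudden, nginx started returning "504 Gateway Time-out"...

I checked the nginx error log for the virtual host...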

2010/09/06 21:24:24 [error] 12909#0: *349 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /favicon.ico HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 21:25:11 [error] 12909#0: *349 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /favicon.ico HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 21:25:11 [error] 12909#0: *443 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /info.php HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 21:25:12 [error] 12909#0: *443 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /favicon.ico HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 22:08:32 [error] 12909#0: *1025 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 22:09:33 [error] 12909#0: *1025 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /favicon.ico HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 22:09:40 [error] 12909#0: *1064 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /info.php HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 22:09:40 [error] 12909#0: *1064 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /favicon.ico HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 22:24:44 [error] 12909#0: *1313 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 22:24:53 [error] 12909#0: *1313 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /favicon.ico HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"

Since I run php-fpm in TCP mode on port 9000, I ran "netstat | grep 9000" and noticed something unusual... (only part of the output is pasted here for readability)

tcp        9      0 localhost:9000          localhost:36094         CLOSE_WAIT  14269/php5-fpm  
tcp        0      0 localhost:46664         localhost:9000          FIN_WAIT2   -               
tcp     1257      0 localhost:9000          localhost:36135         CLOSE_WAIT  -               
tcp     1257      0 localhost:9000          localhost:36125         CLOSE_WAIT  -               
tcp        9      0 localhost:9000          localhost:36102         CLOSE_WAIT  14268/php5-fpm  
tcp        0      0 localhost:46662         localhost:9000          FIN_WAIT2   -               
tcp      745      0 localhost:9000          localhost:46644         CLOSE_WAIT  -               
tcp        0      0 localhost:46658         localhost:9000          FIN_WAIT2   -               
tcp     1265      0 localhost:9000          localhost:46607         CLOSE_WAIT  -               
tcp        0      0 localhost:46672         localhost:9000          ESTABLISHED 12909/nginx: worker
tcp     1257      0 localhost:9000          localhost:36119         CLOSE_WAIT  -               
tcp     1265      0 localhost:9000          localhost:46613         CLOSE_WAIT  -               
tcp        0      0 localhost:46646         localhost:9000          FIN_WAIT2   -               
tcp     1257      0 localhost:9000          localhost:36137         CLOSE_WAIT  -               
tcp        0      0 localhost:46670         localhost:9000          ESTABLISHED 12909/nginx: worker
tcp     1265      0 localhost:9000          localhost:46619         CLOSE_WAIT  -               
tcp     1336      0 localhost:9000          localhost:46668         ESTABLISHED -               
tcp        0      0 localhost:46648         localhost:9000          FIN_WAIT2   -               
tcp     1336      0 localhost:9000          localhost:46670         ESTABLISHED -               
tcp        9      0 localhost:9000          localhost:36108         CLOSE_WAIT  14274/php5-fpm  
tcp     1336      0 localhost:9000          localhost:46684         ESTABLISHED -               
tcp        0      0 localhost:46674         localhost:9000          ESTABLISHED 12909/nginx: worker
tcp     1336      0 localhost:9000          localhost:46666         ESTABLISHED -               
tcp     1257      0 localhost:9000          localhost:46648         CLOSE_WAIT  -               
tcp     1336      0 localhost:9000          localhost:46678         ESTABLISHED -               
tcp        0      0 localhost:46668         localhost:9000          ESTABLISHED 12909/nginx: wo             

There are many "CLOSE_WAIT" and "FIN_WAIT2" pairs like the following (in the output above):

tcp     1337      0 localhost:9000          localhost:46680         CLOSE_WAIT  -               
tcp        0      0 localhost:46680         localhost:9000          FIN_WAIT2   -

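Note port 46680 above.

I enabled the MySQL slow query log, but it did not help.

As of now, restarting php5-fpm every minute via a cronjob (see the command below) keeps everything running "smoothly", but I hate patchwork and want to solve this properly...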
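For context, CLOSE_WAIT on the port 9000 side means the peer (nginx) has already closed its end while php-fpm has not yet closed the socket, and a non-zero Recv-Q there means bytes are still sitting unread. A small sketch for counting states and spotting such pairs in saved netstat output might look like this; the function names and the four sample rows are illustrative assumptions, not part of any tool.

```python
from collections import Counter

# Four rows in the same column layout as the netstat output above.
SAMPLE_NETSTAT = """\
tcp     1337      0 localhost:9000          localhost:46680         CLOSE_WAIT  -
tcp        0      0 localhost:46680         localhost:9000          FIN_WAIT2   -
tcp        9      0 localhost:9000          localhost:36094         CLOSE_WAIT  14269/php5-fpm
tcp        0      0 localhost:46672         localhost:9000          ESTABLISHED 12909/nginx: worker
"""


def parse_netstat(text):
    """Return a list of (recv_q, local, remote, state) for each tcp row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[0] == "tcp":
            rows.append((int(fields[1]), fields[3], fields[4], fields[5]))
    return rows


def port_of(address):
    """Return the port part of a 'host:port' string."""
    return address.rsplit(":", 1)[1]


def half_closed_pairs(rows, fpm_port="9000"):
    """Client ports seen as CLOSE_WAIT on the fpm side and FIN_WAIT2 on the client side."""
    close_wait = {port_of(remote) for _, local, remote, state in rows
                  if state == "CLOSE_WAIT" and port_of(local) == fpm_port}
    fin_wait2 = {port_of(local) for _, local, remote, state in rows
                 if state == "FIN_WAIT2" and port_of(remote) == fpm_port}
    return sorted(close_wait & fin_wait2)


if __name__ == "__main__":
    rows = parse_netstat(SAMPLE_NETSTAT)
    print(Counter(state for _, _, _, state in rows))
    print("half-closed client ports:", half_closed_pairs(rows))
```

A growing number of such pairs suggests php-fpm workers are stuck and never get around to reading or closing the connections nginx has already abandoned.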

1 * * * * service php5-fpm restart > /dev/null

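I googled for a long time but found nothing that helped. As mentioned, this is a test server on a LAN; CPU load never goes above 0.10 and memory usage stays below 25% (the system has 2 GB of RAM and runs Ubuntu Server). So if solving this would take too much of your time, please at least give me a hint.

Thanks in advance for your help.

-Rahul

(Note: this is cross-posted from http://forum.nginx.org/read.php?11,127694)

Update: I found the answer and posted it below.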

Answer 1

I found the answer in a thread on the nginx forum: http://forum.nginx.org/read.php?2,127854

In my case, the answer was to set:

request_terminate_timeout=30s

in the php-fpm configuration (usually /etc/php5/fpm/php-fpm.conf).

Note that you can use a value other than 30 seconds.

I chose it to match the value in the main php.ini file:

max_execution_time = 30
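As a quick sanity check that the two settings agree, the php-fpm value can be converted to seconds and compared with the php.ini value. This is only a sketch: `fpm_time_to_seconds` is a hypothetical helper, and it assumes the unit suffixes php-fpm accepts for time values (s, m, h, d, with plain numbers taken as seconds).

```python
# Seconds per unit suffix used in php-fpm time values.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def fpm_time_to_seconds(value):
    """Convert a php-fpm time value such as '30s' or '2m' to seconds."""
    value = value.strip().lower()
    if value and value[-1] in UNIT_SECONDS:
        return int(value[:-1]) * UNIT_SECONDS[value[-1]]
    return int(value)  # no suffix: treated as seconds


if __name__ == "__main__":
    request_terminate_timeout = "30s"   # value from php-fpm.conf above
    max_execution_time = 30             # value from php.ini above
    if fpm_time_to_seconds(request_terminate_timeout) == max_execution_time:
        print("timeouts match")
    else:
        print("timeouts differ")
```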

Thanks, everyone. :-)

Answer 2

Here is what solved the problem for me:

Make the following changes to the http { section of /etc/nginx/nginx.conf:

proxy_connect_timeout  600s;
proxy_send_timeout  600s;
proxy_read_timeout  600s;
fastcgi_send_timeout 600s;
fastcgi_read_timeout 600s;

Then restart nginx:

/etc/init.d/nginx restart

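Answer 3

If you are using PHP 5.3, increase the backlog.

If you are using PHP 5.2, backport the patch that raises the backlog size from 128.

Also, use a Unix socket instead of a TCP socket: unix:/tmp/php5-cgi.sock (or the relevant path).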

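Answer 4

In my case (same nginx error message), some buggy PHP scripts never finished executing and were waiting on something, which left nginx with no free php5-fpm children to hand requests to.

Fix:

  1. Add an execution time limit, as others in this thread have mentioned: request_terminate_timeout=30s
  2. Increase the number of children: pm.max_spare_servers=16 pm.min_spare_servers=2

Now everything runs smoothly.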
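The effect described in this answer can be illustrated with a toy model: hung scripts occupy workers indefinitely unless a termination timeout frees them. This is a simplified sketch, not how php-fpm is implemented; `simulate`, its parameters and the arrival data are all invented for illustration, and in practice requests that find no free worker wait in the listen backlog until nginx gives up with a 504.

```python
def simulate(max_children, arrivals, terminate_timeout=None):
    """Count requests that find no free worker on arrival.

    arrivals: list of (arrival_second, duration_seconds), sorted by arrival;
              a duration of None means the script never finishes on its own.
    terminate_timeout: seconds after which a running request is killed,
              or None for no limit.
    """
    busy_until = []  # finish time of each occupied worker
    no_free_worker = 0
    for t, duration in arrivals:
        # Release workers whose request has finished or been terminated.
        busy_until = [end for end in busy_until if end > t]
        if len(busy_until) >= max_children:
            no_free_worker += 1
            continue
        end = float("inf") if duration is None else t + duration
        if terminate_timeout is not None:
            end = min(end, t + terminate_timeout)
        busy_until.append(end)
    return no_free_worker


if __name__ == "__main__":
    # Two hung scripts, then two ordinary one-second requests 40 seconds later.
    arrivals = [(0, None), (1, None), (40, 1), (41, 1)]
    print("without limit:", simulate(2, arrivals))
    print("with 30s limit:", simulate(2, arrivals, terminate_timeout=30))
```

With two workers and no limit, both later requests find the pool exhausted; with a 30-second limit the hung scripts are gone by the time they arrive.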
