HAProxy 停止接受连接

HAProxy 停止接受连接

几个月来,我一直使用 HAProxy 来平衡我的应用服务器,没有出现任何问题。最近,一些流量高峰促使我设置 maxconn 参数来限制与后端服务器的连接速率。

它运行了几个小时,然后似乎停止接受任何连接。

我查看了我的系统资源图表,loadavg 和 RAM 使用率似乎都在控制范围内,在冻结之前没有出现大的峰值。编辑:在冻结之前,Loadavg 似乎确实从大约 0.13 飙升到 1.0。此系统上有 4 个核心,我仅将 HAProxy 作为单个进程运行 - 当我在冻结后在 htop 中查看它时,该 CPU 固定在 100%。

# HA-Proxy version 1.4.16 2011/08/04
global  
        log 127.0.0.1   local0
        log 127.0.0.1   local1 notice
        #log 127.0.0.1   local1 debug
        #log loghost    local0 info
        maxconn 4096
        #chroot /usr/share/haproxy
        user haproxy
        group haproxy
        daemon
        #debug
        #quiet

defaults
        log     global
        mode    http
        option  httplog
        option  dontlognull
        retries 3
        option forwardfor
        option redispatch
        maxconn 2000
        contimeout      5000
        clitimeout      50000
        srvtimeout      50000

listen  platform-cache 0.0.0.0:80
        mode http
        maxconn 10000
        option abortonclose
        #balance uri
        hash-type consistent
        balance hdr(Host)
        server varnish1 10.176.129.245 weight 20 maxconn 64 check
        server varnish2 10.176.129.29 weight 40  maxconn 64 check
        contimeout 60000

# HTTP response : 'HTTP/1.0 200 OK'
listen http_health_check 0.0.0.0:60001
        mode health
        option httpchk

listen  stats_for_scout 127.0.0.1:8081
        mode http
        stats uri /stats

listen  public_stats :8080
        mode http
        stats uri /stats
        stats realm   Haproxy\ Statistics
        stats auth    ****************

以下是从一次重启到下次冻结并在几个小时后重启的日志:

Feb  5 08:57:15 localhost haproxy[31950]: Proxy platform-cache started.
Feb  5 08:57:15 localhost haproxy[31950]: Proxy http_health_check started.Feb  5 08:57:15 localhost haproxy[31950]: Proxy stats_for_scout started.
Feb  5 08:57:15 localhost haproxy[31950]: Proxy public_stats started.
Feb  5 09:43:47 localhost haproxy[31951]: Pausing proxy platform-cache.
Feb  5 09:43:47 localhost haproxy[31951]: Pausing proxy http_health_check.
Feb  5 09:43:47 localhost haproxy[31951]: Pausing proxy stats_for_scout.
Feb  5 09:43:47 localhost haproxy[31951]: Pausing proxy public_stats.
Feb  5 09:43:47 localhost haproxy[32746]: Proxy platform-cache started.
Feb  5 09:43:47 localhost haproxy[32746]: Proxy http_health_check started.
Feb  5 09:43:47 localhost haproxy[32746]: Proxy stats_for_scout started.
Feb  5 09:43:47 localhost haproxy[32746]: Proxy public_stats started.
Feb  5 09:43:47 localhost haproxy[31951]: Stopping proxy platform-cache in 0 ms.
Feb  5 09:43:47 localhost haproxy[31951]: Stopping proxy http_health_check in 0 ms.
Feb  5 09:43:47 localhost haproxy[31951]: Stopping proxy stats_for_scout in 0 ms.
Feb  5 09:43:47 localhost haproxy[31951]: Stopping proxy public_stats in 0 ms.
Feb  5 09:43:47 localhost haproxy[31951]: Proxy platform-cache stopped (FE: 32540 conns, BE: 30334 conns).
Feb  5 09:43:47 localhost haproxy[31951]: Proxy http_health_check stopped (FE: 0 conns, BE: 0 conns).
Feb  5 09:43:47 localhost haproxy[31951]: Proxy stats_for_scout stopped (FE: 16 conns, BE: 0 conns).
Feb  5 09:43:47 localhost haproxy[31951]: Proxy public_stats stopped (FE: 4 conns, BE: 2 conns).
Feb  5 17:52:27 localhost haproxy[26610]: Proxy platform-cache started.
Feb  5 17:52:27 localhost haproxy[26610]: Proxy http_health_check started.
Feb  5 17:52:27 localhost haproxy[26610]: Proxy stats_for_scout started.
Feb  5 17:52:27 localhost haproxy[26610]: Proxy public_stats started.
~                                                                            

原始配置已运行一年多,没有任何问题

global 
        log 127.0.0.1   local0
        log 127.0.0.1   local1 notice
        #log 127.0.0.1   local1 debug
        #log loghost    local0 info
        maxconn 4096
        #chroot /usr/share/haproxy
        user haproxy
        group haproxy
        daemon
        #debug
        #quiet

defaults
        log     global
        mode    http
        option  httplog
        option  dontlognull
        retries 3
        option forwardfor
        option redispatch
        maxconn 2000
        contimeout      5000
        clitimeout      50000
        srvtimeout      50000

listen  platform-cache 0.0.0.0:80
        mode http
        #balance uri
        hash-type consistent
        balance hdr(Host)
        server varnish1 10.176.129.245 weight 20 check
        server varnish2 10.176.129.29 weight 40 check

# HTTP response : 'HTTP/1.0 200 OK'
listen http_health_check 0.0.0.0:60001
        mode health
        option httpchk

listen  stats_for_scout 127.0.0.1:8081
        mode http
        stats uri /stats

listen  public_stats :8080
        mode http
        stats uri /stats
        stats realm   Haproxy\ Statistics
        stats auth    *******

感谢您的帮助!

答案1

我刚刚遇到了我认为是相同或类似的问题。我正在运行 haproxy,突然它停止接受新连接。我尝试将最大连接数从 32k 增加到 65k,当我重新启动时,情况看起来好多了,但随后会在 haproxy.cfg 文件中的最大连接数限制之前停止接受新连接。我可以在 haproxy 统计网页中看到更改生效。

然后我查看了 /proc//limits 并看到了以下内容:... 最大打开文件数 4026 4026 个文件...

然后我检查了 /proc//fd,发现当 haproxy 停止时,我打开了大约那么多文件。所以我认为问题在于底层 unix 限制,而不是 haproxy。我增加了进程限制,到目前为止,在我的测试中看起来更好。

如果您的问题是由相同的潜在问题引起的,我希望这会有所帮助。

答案2

如果你达到了配置的限制,那应该是正常的。你可以从 haproxy 中读取此信息文档

maxconn <number>
  Sets the maximum per-process number of concurrent connections to <number>. It
  is equivalent to the command-line argument "-n". Proxies will stop accepting
  connections when this limit is reached.

相关内容