haproxy session rate lower than a single server's qps

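I set up haproxy (1.6.3) on Ubuntu 16.04 to load-balance two web servers. According to my earlier tests, the web servers can handle more than 20k requests/sec. The web servers were tested with wrk2, and I verified the request counts in the logs. However, with haproxy in front of the web servers, throughput seems to be capped at about 6k requests/sec. Is there anything wrong with the haproxy configuration?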


global
    log /dev/log    local0
    log /dev/log    local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    maxconn     102400
    user haproxy
    group haproxy

    # Default SSL material locations
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private

    # Default ciphers to use on SSL-enabled listening sockets.
    # For more information, see ciphers(1SSL). This list is from:
    # https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
    ssl-default-bind-options no-sslv3

defaults
    log    global
    mode    http
    option    httplog
    option    dontlognull
    # https://serverfault.com/questions/504308/by-what-criteria-do-you-tune-timeouts-in-ha-proxy-config
    timeout connect 5000
    timeout check 5000
    timeout client  30000
    timeout server  30000
    timeout tunnel  3600s
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

listen web-test
    maxconn 40000  # the default is 2000
    mode http
    bind *:80
    balance roundrobin
    option forwardfor
    option http-keep-alive  # connections will no longer be closed after each request
    server test1 SERVER1:80 check maxconn 20000
    server test2 SERVER2:80 check maxconn 20000

If I run 3 instances of wrk, I get about the same result on each:

./wrk -t4 -c100 -d30s -R4000 http://HAPROXY/
Running 30s test @ http://HAPROXY/
  4 threads and 100 connections
  Thread calibration: mean lat.: 1577.987ms, rate sampling interval: 7139ms
  Thread calibration: mean lat.: 1583.182ms, rate sampling interval: 7180ms
  Thread calibration: mean lat.: 1587.795ms, rate sampling interval: 7167ms
  Thread calibration: mean lat.: 1583.128ms, rate sampling interval: 7147ms
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     8.98s     2.67s   13.93s    58.43%
    Req/Sec   516.75     11.28   529.00     87.50%
  64916 requests in 30.00s, 51.69MB read
Requests/sec:   2163.75    # Requests/sec decrease slightly
Transfer/sec:      1.72MB

Stats from haproxy: [screenshot]

If I run a single instance of wrk directly against one of the web servers, without haproxy:

./wrk -t4 -c100 -d30s -R4000 http://SERVER1
Running 30s test @ http://SERVER1
  4 threads and 100 connections
  Thread calibration: mean lat.: 1.282ms, rate sampling interval: 10ms
  Thread calibration: mean lat.: 1.363ms, rate sampling interval: 10ms
  Thread calibration: mean lat.: 1.380ms, rate sampling interval: 10ms
  Thread calibration: mean lat.: 1.351ms, rate sampling interval: 10ms
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.41ms    0.97ms  22.42ms   96.48%
    Req/Sec     1.05k   174.27     2.89k    86.01%
  119809 requests in 30.00s, 98.15MB read
Requests/sec:   3993.36     # Requests/sec is about 4k
Transfer/sec:      3.27MB

HA-Proxy version 1.6.3 2015/12/25
Copyright 2000-2015 Willy Tarreau <[email protected]>

Build options :
  TARGET  = linux2628
  CPU     = generic
  CC      = gcc
  CFLAGS  = -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Encrypted password support via crypt(3): yes
Built with zlib version : 1.2.8
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with OpenSSL version : OpenSSL 1.0.2g-fips  1 Mar 2016
Running on OpenSSL version : OpenSSL 1.0.2g  1 Mar 2016
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports prefer-server-ciphers : yes
Built with PCRE version : 8.38 2015-11-23
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built with Lua version : Lua 5.3.1
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

I know ab is not a very accurate way to test, but I expected haproxy to give better results than a single node. The results show the opposite.


ab -n 10000 -c 10 http://HAPROXY/
Requests per second:    4276.18 [#/sec] (mean)

ab test against SERVER1

ab -n 10000 -c 10 http://SERVER1/
Requests per second:    9392.66 [#/sec] (mean)

ab test against SERVER2

ab -n 10000 -c 10 http://SERVER2/
Requests per second:    8513.28 [#/sec] (mean)

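The VMs are single-core, so there is no need to use nbproc. I also monitored CPU and memory usage: on all VMs, CPU usage stays below 30% and memory usage below 20%. Something must be wrong with the haproxy configuration or my system configuration.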

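I now get roughly the same performance from haproxy as from a single server. The problem was the default maxconn of 2000, which I had overlooked. However, I expected better performance with more backend servers, and I still cannot achieve that.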
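For reference, maxconn can be set at three levels, and the default of 2000 applies at the frontend/listen level when nothing is set there. Below is a minimal sketch of where each setting goes, reusing the values from the configuration above. Placing maxconn in the defaults section is an assumption about layout for illustration, not what the original configuration did.

```
global
    maxconn 102400      # per-process ceiling on concurrent connections

defaults
    mode http
    maxconn 40000       # inherited by every frontend/listen; 2000 if left unset
    timeout connect 5s
    timeout client  30s
    timeout server  30s

listen web-test
    bind *:80
    balance roundrobin
    # per-server limit; requests above it wait in the queue
    server test1 SERVER1:80 check maxconn 20000
    server test2 SERVER2:80 check maxconn 20000
```

Raising only the global value is not enough on its own, because the proxy-level limit still caps each frontend or listen section.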

With the same configuration, I have now upgraded to haproxy 1.8.3, but it did not make much difference.


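HAProxy is single-threaded by default. To spawn one process per core, use the nbproc option in the global section (the manual discourages this because it is "hard to troubleshoot").

  1. Run haproxy in daemon mode.
  2. Add the nbproc setting to the global section, e.g. with 4 processors, spawn 4 daemons: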
                  nbproc 4
                  cpu-map 1 0
                  cpu-map 2 1
                  cpu-map 3 2
                  cpu-map 4 3

So daemon 1 runs on CPU 0, and so on.
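Putting steps 1 and 2 together, the global section might look like the sketch below. The daemon keyword covers step 1. The per-process stats sockets are an addition that is not part of the steps above, and their paths are hypothetical. They are included because each process keeps its own counters, which is a large part of why multi-process setups are harder to troubleshoot.

```
global
    daemon              # step 1: run in the background
    nbproc 4            # step 2: one process per core
    cpu-map 1 0         # process 1 -> CPU 0
    cpu-map 2 1
    cpu-map 3 2
    cpu-map 4 3
    # one stats socket per process (hypothetical paths)
    stats socket /run/haproxy/admin-1.sock mode 660 level admin process 1
    stats socket /run/haproxy/admin-2.sock mode 660 level admin process 2
    stats socket /run/haproxy/admin-3.sock mode 660 level admin process 3
    stats socket /run/haproxy/admin-4.sock mode 660 level admin process 4
```

On a single-core VM, extra processes have no additional core to run on, so this only pays off on multi-core hosts.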


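Each frontend can then be pinned to specific processes with bind-process:

frontend http
   bind :80                                  # port assumed for illustration
   bind-process 1
frontend https
   bind :443 ssl crt /etc/yourdomain.pem     # port assumed for illustration
   bind-process 2 3 4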

https://cbonte.github.io/haproxy-dconv/configuration-1.5.html#daemon
https://cbonte.github.io/haproxy-dconv/configuration-1.5.html#3.1-cpu-map
