I set up haproxy (1.6.3) on Ubuntu 16.04 to load-balance two web servers. According to my earlier tests, each web server can handle more than 20k requests/sec. The web servers were benchmarked with wrk2, and I verified the request counts in their logs. However, with haproxy in front of the web servers, the rate seems capped at roughly 6k requests/sec. Is there something wrong with the haproxy configuration?
The haproxy configuration file:
global
log /dev/log local0
log /dev/log local1 notice
chroot /var/lib/haproxy
stats socket /run/haproxy/admin.sock mode 660 level admin
stats timeout 30s
maxconn 102400
user haproxy
group haproxy
daemon
# Default SSL material locations
ca-base /etc/ssl/certs
crt-base /etc/ssl/private
# Default ciphers to use on SSL-enabled listening sockets.
# For more information, see ciphers(1SSL). This list is from:
# https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:ECDH+3DES:DH+3DES:RSA+AESGCM:RSA+AES:RSA+3DES:!aNULL:!MD5:!DSS
ssl-default-bind-options no-sslv3
defaults
log global
mode http
option httplog
option dontlognull
# https://serverfault.com/questions/504308/by-what-criteria-do-you-tune-timeouts-in-ha-proxy-config
timeout connect 5000
timeout check 5000
timeout client 30000
timeout server 30000
timeout tunnel 3600s
errorfile 400 /etc/haproxy/errors/400.http
errorfile 403 /etc/haproxy/errors/403.http
errorfile 408 /etc/haproxy/errors/408.http
errorfile 500 /etc/haproxy/errors/500.http
errorfile 502 /etc/haproxy/errors/502.http
errorfile 503 /etc/haproxy/errors/503.http
errorfile 504 /etc/haproxy/errors/504.http
listen web-test
maxconn 40000 # the default is 2000
mode http
bind *:80
balance roundrobin
option forwardfor
option http-keep-alive # connections will no longer be closed after each request
server test1 SERVER1:80 check maxconn 20000
server test2 SERVER2:80 check maxconn 20000
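As a sanity check, the limits actually in effect and the live session rate can be read back at runtime through the admin socket declared in the global section above (a sketch assuming socat is installed and the socket path from that config):

```shell
# Global limits and current session rate of the running process
echo "show info" | socat stdio /run/haproxy/admin.sock | grep -E 'Maxconn|CurrConns|SessRate'

# Per-proxy CSV counters, including the effective per-listener
# maxconn (the "slim" column) for web-test and each backend server
echo "show stat" | socat stdio /run/haproxy/admin.sock
```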
Running wrk with 3 instances gives roughly the same result:
./wrk -t4 -c100 -d30s -R4000 http://HAPROXY/
Running 30s test @ http://HAPROXY/
4 threads and 100 connections
Thread calibration: mean lat.: 1577.987ms, rate sampling interval: 7139ms
Thread calibration: mean lat.: 1583.182ms, rate sampling interval: 7180ms
Thread calibration: mean lat.: 1587.795ms, rate sampling interval: 7167ms
Thread calibration: mean lat.: 1583.128ms, rate sampling interval: 7147ms
Thread Stats Avg Stdev Max +/- Stdev
Latency 8.98s 2.67s 13.93s 58.43%
Req/Sec 516.75 11.28 529.00 87.50%
64916 requests in 30.00s, 51.69MB read
Requests/sec: 2163.75 # Requests/sec decrease slightly
Transfer/sec: 1.72MB
Running a single wrk instance against one of the web servers directly, without haproxy:
./wrk -t4 -c100 -d30s -R4000 http://SERVER1
Running 30s test @ http://SERVER1
4 threads and 100 connections
Thread calibration: mean lat.: 1.282ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 1.363ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 1.380ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 1.351ms, rate sampling interval: 10ms
Thread Stats Avg Stdev Max +/- Stdev
Latency 1.41ms 0.97ms 22.42ms 96.48%
Req/Sec 1.05k 174.27 2.89k 86.01%
119809 requests in 30.00s, 98.15MB read
Requests/sec: 3993.36 # Requests/sec is about 4k
Transfer/sec: 3.27MB
haproxy -vv
HA-Proxy version 1.6.3 2015/12/25
Copyright 2000-2015 Willy Tarreau <[email protected]>
Build options :
TARGET = linux2628
CPU = generic
CC = gcc
CFLAGS = -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2
OPTIONS = USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1
Default settings :
maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
Encrypted password support via crypt(3): yes
Built with zlib version : 1.2.8
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with OpenSSL version : OpenSSL 1.0.2g-fips 1 Mar 2016
Running on OpenSSL version : OpenSSL 1.0.2g 1 Mar 2016
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports prefer-server-ciphers : yes
Built with PCRE version : 8.38 2015-11-23
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built with Lua version : Lua 5.3.1
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Available polling systems :
epoll : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result OK
Total: 3 (3 usable), will use epoll.
I know ab is not a very precise benchmarking tool, but I thought haproxy would give better results than a single node. The results, however, are the opposite.
ab against HAPROXY
ab -n 10000 -c 10 http://HAPROXY/
Requests per second: 4276.18 [#/sec] (mean)
ab against SERVER1
ab -n 10000 -c 10 http://SERVER1/
Requests per second: 9392.66 [#/sec] (mean)
ab against SERVER2
ab -n 10000 -c 10 http://SERVER2/
Requests per second: 8513.28 [#/sec] (mean)
The VMs are single-core, so there is no need for nbproc. I also monitored CPU and memory usage: CPU stayed below 30% and memory below 20% on all VMs. There must be something wrong with the haproxy configuration or my system configuration.
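A system-wide CPU average can hide whether the haproxy process itself is the limit, since haproxy 1.6 is single-threaded per process. Its own usage can be watched separately (a quick sketch assuming the sysstat package and a single haproxy daemon):

```shell
# Sample the haproxy process's own CPU usage once per second;
# a value near 100% means the proxy process is the bottleneck
# even when the host-wide average looks low.
pidstat -u -p "$(pidof haproxy)" 1
```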
Update: I now get roughly the same performance from haproxy as from a single server. The problem was the default maxconn of 2000 on the listen section, which I had missed. Still, I expected better performance with more backend servers behind the proxy, and I have not been able to achieve that.
With the same configuration, I have since upgraded to haproxy 1.8.3, but it did not make much of a difference.
Answer 1
Haproxy is single-threaded by default. To spawn one process per core, use the nbproc option in the global section (the manual discourages this because it is "hard to troubleshoot"):
- run haproxy in daemon mode
- set nbproc in the global section, e.g. nbproc 4 spawns 4 daemons for 4 cores
global
    nbproc 4
    cpu-map 1 0
    cpu-map 2 1
    cpu-map 3 2
    cpu-map 4 3
so daemon 1 runs on CPU 0, and so on.
You can then explicitly bind those processes to endpoints, e.g.:
frontend http
bind 0.0.0.0:80
bind-process 1
frontend https
bind 0.0.0.0:443 ssl crt /etc/yourdomain.pem
bind-process 2 3 4
https://cbonte.github.io/haproxy-dconv/configuration-1.5.html#daemon https://cbonte.github.io/haproxy-dconv/configuration-1.5.html#3.1-cpu-map