运行 Varnish 4 时,我偶尔会收到 503 错误,但我无法确定原因。我尝试curl
直接查看后端,它每次都会返回响应,因此问题似乎不在这里。下面是我的 varnishlog 中的条目,但它并没有告诉我太多信息。我通过运行以下命令获取此日志
varnishlog -q "RespStatus >= 500 or BerespStatus >= 500"
以下是日志条目
* << Request >> 524356
- 524356 Begin c req 524355 rxreq
- 524356 Timestamp c Start: 1421453220.095464 0.000000 0.000000
- 524356 Timestamp c Req: 1421453220.095464 0.000000 0.000000
- 524356 ReqStart c 173.255.199.52 38213
- 524356 ReqMethod c GET
- 524356 ReqURL c /asd/asdasd
- 524356 ReqProtocol c HTTP/1.0
- 524356 ReqHeader c X-Real-IP: 206.169.79.151
- 524356 ReqHeader c Host: www.CENSORED.com
- 524356 ReqHeader c X-Forwarded-For: 206.169.79.151
- 524356 ReqHeader c X-Time-Server-Received: 1421453197
- 524356 ReqHeader c Connection: close
- 524356 ReqHeader c Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
- 524356 ReqHeader c User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.99 Safari/537.36
- 524356 ReqHeader c Accept-Encoding: gzip, deflate, sdch
- 524356 ReqHeader c Accept-Language: en-US,en;q=0.8
- 524356 ReqHeader c Cookie: wp-settings-1=deleted%3Dundefined%26ed_size%3D1305%26hidetb%3D1%26libraryContent%3Dbrowse%26editor%3Dtinymce%26imgsize%3Dlarge%26wplink%3D1%26urlbutton%3Dfile%26pos
ts_list_mode%3Dlist; wp-settings-time-1=1421309968; wp-saving-post=5-saved; wordpr
- 524356 ReqUnset c X-Forwarded-For: 206.169.79.151
- 524356 ReqHeader c X-Forwarded-For: 206.169.79.151, 173.255.199.52
- 524356 VCL_call c RECV
- 524356 ReqUnset c Cookie: wp-settings-1=deleted%3Dundefined%26ed_size%3D1305%26hidetb%3D1%26libraryContent%3Dbrowse%26editor%3Dtinymce%26imgsize%3Dlarge%26wplink%3D1%26urlbutton%3Dfile%26pos
ts_list_mode%3Dlist; wp-settings-time-1=1421309968; wp-saving-post=5-saved; wordpr
- 524356 ReqHeader c Cookie: wp-settings-1=deleted%3Dundefined%26ed_size%3D1305%26hidetb%3D1%26libraryContent%3Dbrowse%26editor%3Dtinymce%26imgsize%3Dlarge%26wplink%3D1%26urlbutton%3Dfile%26pos
ts_list_mode%3Dlist; wp-settings-time-1=1421309968; wp-saving-post=5-saved; wordpr
- 524356 ReqUnset c Cookie: wp-settings-1=deleted%3Dundefined%26ed_size%3D1305%26hidetb%3D1%26libraryContent%3Dbrowse%26editor%3Dtinymce%26imgsize%3Dlarge%26wplink%3D1%26urlbutton%3Dfile%26pos
ts_list_mode%3Dlist; wp-settings-time-1=1421309968; wp-saving-post=5-saved; wordpr
- 524356 ReqHeader c Cookie: wp-settings-1=deleted%3Dundefined%26ed_size%3D1305%26hidetb%3D1%26libraryContent%3Dbrowse%26editor%3Dtinymce%26imgsize%3Dlarge%26wplink%3D1%26urlbutton%3Dfile%26pos
ts_list_mode%3Dlist; wp-settings-time-1=1421309968; wp-saving-post=5-saved; wordpr
- 524356 VCL_return c pass
- 524356 VCL_call c HASH
- 524356 VCL_return c lookup
- 524356 VCL_call c PASS
- 524356 VCL_return c fetch
- 524356 Link c bereq 524357 pass
- 524356 Timestamp c Fetch: 1421453222.583956 2.488493 2.488493
- 524356 Timestamp c Process: 1421453222.583983 2.488520 0.000027
- 524356 RespHeader c Date: Sat, 17 Jan 2015 00:07:02 GMT
- 524356 RespHeader c Server: Varnish
- 524356 RespHeader c X-Varnish: 524356
- 524356 RespProtocol c HTTP/1.1
- 524356 RespStatus c 503
- 524356 RespReason c Service Unavailable
- 524356 RespReason c Service Unavailable
- 524356 VCL_call c SYNTH
- 524356 RespHeader c Content-Type: text/html; charset=utf-8
- 524356 RespHeader c Retry-After: 5
- 524356 VCL_return c deliver
- 524356 RespHeader c Content-Length: 415
- 524356 Debug c "RES_MODE 2"
- 524356 RespHeader c Connection: close
- 524356 Timestamp c Resp: 1421453222.584031 2.488567 0.000048
- 524356 ReqAcct c 983 0 983 205 415 620
- 524356 End c
我的default.vcl
vcl 4.0;
import std;
backend default {
.host = "localhost";
.port = "81";
.connect_timeout = 600s;
.first_byte_timeout = 600s;
.between_bytes_timeout = 600s;
}
acl purge {
# Only allow purges coming from localhost
"127.0.0.1";
"localhost";
}
编辑:varnishstat 抛出503
错误。您可以看到这种fetch_failed
情况已经发生过多次。
MAIN.uptime 787 1.00 1.00 1.00 1.00
MAIN.sess_conn 11076 10.99 14.00 8.30 8.17
MAIN.client_req 11089 11.98 14.00 8.40 8.23
MAIN.cache_hit 1488 1.00 1.00 1.32 1.41
MAIN.cache_hitpass 2 0.00 . 0.00 0.00
MAIN.cache_miss 3186 1.00 4.00 1.27 1.29
MAIN.backend_conn 9585 9.99 12.00 7.01 6.76
MAIN.backend_reuse 17 1.00 . 0.16 0.12
MAIN.backend_recycle 17 1.00 . 0.16 0.12
MAIN.fetch_head 2 0.00 . 0.06 0.06
MAIN.fetch_length 346 2.00 . 0.39 0.35
MAIN.fetch_chunked 4497 3.99 5.00 2.27 2.06
MAIN.fetch_close 4756 4.99 6.00 4.36 4.35
MAIN.fetch_failed 180 0.00 . 0.10 0.12
MAIN.pools 4 0.00 . 4.00 4.00
MAIN.threads 54 0.00 . 54.00 54.00
MAIN.threads_limited 1 0.00 . 0.00 0.00
MAIN.threads_created 102 0.00 . 0.00 0.00
MAIN.threads_destroyed 48 0.00 . 0.00 0.00
MAIN.busy_sleep 5 0.00 . 0.00 0.00
MAIN.busy_wakeup 5 0.00 . 0.00 0.00
MAIN.sess_queued 50 0.00 . 0.00 0.00
MAIN.n_object 2873 1.00 . 2866.32 2865.00
MAIN.n_objectcore 2891 1.00 . 2885.40 2884.22
MAIN.n_objecthead 2990 1.00 . 2981.80 2980.00
MAIN.n_backend 1 0.00 . 1.00 1.00
MAIN.n_expired 135 0.00 . 133.18 132.78
MAIN.s_sess 11076 10.99 14.00 8.30 8.17
MAIN.s_req 11089 11.98 14.00 8.40 8.23
MAIN.s_pass 6415 9.99 8.00 5.81 5.52
MAIN.s_fetch 9601 10.99 12.00 7.09 6.82
MAIN.s_synth 112 0.00 . 0.00 0.00
MAIN.s_req_hdrbytes 3346583 3583.50 4252.00 2308.12 2251.15
MAIN.s_req_bodybytes 804869 1072.65 1022.00 202.52 162.86
MAIN.s_resp_hdrbytes 2737544 3615.46 3478.00 1942.75 1857.93
MAIN.s_resp_bodybytes 66853226 14104.28 84946.00 37816.96 38189.12
MAIN.sess_closed 9442 6.99 11.00 6.96 7.05
MAIN.backend_req 9594 10.99 12.00 7.17 6.87
守护进程选项
DAEMON_OPTS="-a :80 \
-f /etc/varnish/default.vcl \
-T 127.0.0.1:6082 \
-u varnish -g varnish \
-t 120 \
-n $INSTANCE \
-s malloc,10G \
-p thread_pools=4 \
-p thread_pool_min=5 \
-p thread_pool_max=500 \
-p thread_pool_timeout=300 \
-S /etc/varnish/secret"
有什么建议么?
答案1
当发生 503 错误时,你的 varnish 处理了多少个请求/秒和总连接数?
我遇到过类似的情况,当后端运行正常时,它会向用户返回 503 错误。
varnishlog 会显示后端有问题,但事实上是健康的。
经过一番挖掘后,我发现问题出在最大文件描述符上。
我必须增加系统的最大文件描述符,以便 varnish 能够正确处理所有请求。
此后,所有 503 错误都停止了,并且所有后端始终显示为健康。
答案2
请增加所有内核限制:最大打开文件数、最大进程文件数等。如果您使用的是 Linux 基础版 radhat(如 centos),请编辑 2 个文件:/etc/security/limit.conf
答案3
你不会在 varnish 日志中看到这个,varnish 503 意味着没有后端,所以你需要监控/var/log/消息 和Apache、Nginx、PHP当时的错误日志。后端在请求期间遇到错误,并且 varnish 无法进一步处理请求。所以 varnish 不知道那里发生了什么,而且它也不必知道,这不是他的事。
顺便说一下,尝试从 Varnish 推荐的设置开始:
-p thread_pool_min=200 \
-p thread_pool_max=4000 \
-p thread_pool_add_delay=2 \
-p http_req_size=64000 \
-p cli_timeout=25 \
-p cli_buffer=36384 \
-p session_linger=100 \
-s malloc,(YOURMEMORY-20%)G
并且此行是 sysctl.conf 中必须的:
fs.file-max = 700000
还有这些:
echo "* soft nofile 200000" >> /etc/security/limits.conf
echo "* hard nofile 700000" >> /etc/security/limits.conf