Varnish backend connection failures

2024-5-29

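Using: varnish-3.0.4

Can anyone suggest potential causes of backend connection failures? They usually occur when n_worker_thread rises above the default of 100 worker threads, though not every time.

In one of several cases, Varnish failed to connect to the backend during a peak in which 491 threads had been created. The backend server was not under load or in any other abnormal state. To narrow the problem down: this is not a backend server issue, since the backend is running fine and is reachable.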

backend_unhealthy            0         0.00 Backend conn. not attempted
backend_busy                 0         0.00 Backend conn. too many

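As far as I understand, the backend connection failures contradict the configuration: 1) the thread maximum is 1000 * 2 [pools], and 2) the server load is below 1.

In theory it should be able to handle a peak of that size, and I don't understand why the backend connections fail here.

[Note: because of usage requirements, the cache TTL is designed to be at most 1 to 5 seconds.]

n_worker_thread = 100: everything works.

n_worker_thread = 491: 8 backend connection failures.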

Varnish

thread_pool_add_delay       2 [milliseconds]
thread_pool_add_threshold   2 [requests]
thread_pool_fail_delay      200 [milliseconds]
thread_pool_max             1000 [threads]
thread_pool_min             50 [threads]
thread_pool_purge_delay     1000 [milliseconds]
thread_pool_stack           unlimited [bytes]
thread_pool_timeout         120 [seconds]
thread_pool_workspace       65536 [bytes]
thread_pools                2 [pools]
thread_stats_rate           10 [requests]
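
To make the thread arithmetic explicit, here is a minimal Python sketch. It assumes, as the question does, that thread_pool_min and thread_pool_max apply per pool; the `PARAMS` dictionary and the `thread_limits` helper are illustrative names, not part of Varnish.

```python
# Values copied from the parameter listing above.
PARAMS = {"thread_pool_min": 50, "thread_pool_max": 1000, "thread_pools": 2}


def thread_limits(params):
    """Return (minimum, maximum) worker threads across all pools.

    Assumption: the min/max parameters are per-pool limits.
    """
    pools = params["thread_pools"]
    return params["thread_pool_min"] * pools, params["thread_pool_max"] * pools


min_threads, max_threads = thread_limits(PARAMS)
observed_peak = 491  # n_worker_thread reported during the failing peak

print(f"baseline threads: {min_threads}")
print(f"thread ceiling:   {max_threads}")
print(f"peak uses {observed_peak / max_threads:.1%} of the ceiling")
```

Under that assumption the baseline of 100 threads matches the "default" mentioned in the question, and a peak of 491 threads stays well below the ceiling, so thread exhaustion alone would not explain the failures.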

varnishstat

32+03:45:05
Hitrate ratio:        2        2        2
Hitrate avg:     0.9404   0.9404   0.9404


backend_conn           4516262         1.63 Backend conn. success
backend_unhealthy            0         0.00 Backend conn. not attempted
backend_busy                 0         0.00 Backend conn. too many
backend_fail              9562         0.00 Backend conn. failures
backend_reuse         67350518        24.24 Backend conn. reuses
backend_toolate         361647         0.13 Backend conn. was closed
backend_recycle       67715544        24.38 Backend conn. recycles
backend_retry             5133         0.00 Backend conn. retry
n_backend                    5          .   N backends
backend_req           71855086        25.87 Backend requests made
LCK.backend.creat              5         0.00 Created locks
LCK.backend.destroy            0         0.00 Destroyed locks
LCK.backend.locks      149007648        53.64 Lock Operations
LCK.backend.colls              0         0.00 Collisions
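
For scale, the counters above can be turned into a failure ratio. The following Python sketch parses a few of the lines and assumes that the total number of connection attempts is backend_conn plus backend_fail; the helper names `parse_counters` and `failure_ratio` are hypothetical.

```python
# Lines copied from the varnishstat output above.
SAMPLE = """\
backend_conn           4516262         1.63 Backend conn. success
backend_fail              9562         0.00 Backend conn. failures
backend_reuse         67350518        24.24 Backend conn. reuses
backend_retry             5133         0.00 Backend conn. retry
backend_req           71855086        25.87 Backend requests made
"""


def parse_counters(text):
    """Map each counter name to its cumulative integer value."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


def failure_ratio(counters):
    """Failed connects as a fraction of all connects.

    Assumption: attempts = backend_conn + backend_fail.
    """
    attempts = counters["backend_conn"] + counters["backend_fail"]
    return counters["backend_fail"] / attempts if attempts else 0.0


counters = parse_counters(SAMPLE)
print(f"connect failure ratio: {failure_ratio(counters):.4%}")
print(f"requests served on reused connections: "
      f"{counters['backend_reuse'] / counters['backend_req']:.2%}")
```

With these numbers the failures are a small fraction of connection attempts, which fits the description of an intermittent problem rather than a constant one.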

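Answer 1

Hi Shane, thanks for your reply,

I just figured out that the backend communication problem was not caused by any configuration fault but by the hardware switch between the backend and Varnish.

It was hard to analyze because the primary switch worked fine, while the secondary switch caused problems when communication failed over to it.

This pretty much shows that a backend connection failure is unlikely to originate in Varnish itself when none of the other backend or worker counters (busy, too many, queue overflow) are rising.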
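
One way to check the network path independently of Varnish is to open TCP connections to the backend repeatedly from the Varnish host and record which attempts fail. The sketch below is only an illustration of that idea, not the method used to find the switch fault; the `probe` function and its parameters are hypothetical.

```python
import socket
import time


def probe(host, port, attempts=5, timeout=1.0, pause=0.0):
    """Attempt `attempts` TCP connects to (host, port).

    Returns a list with one error description per failed attempt,
    so an empty list means every connect succeeded.
    """
    errors = []
    for _ in range(attempts):
        try:
            # Open and immediately close the connection; only the
            # TCP handshake is being tested here.
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError as exc:  # includes timeouts and refused connects
            errors.append(f"{type(exc).__name__}: {exc}")
        time.sleep(pause)
    return errors
```

For example, `probe("192.0.2.10", 8080, attempts=100, pause=0.1)` (a placeholder address and port) run while traffic fails over between switches would show whether connects fail at the network level even though the backend itself is healthy.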

Hope this is useful to someone in the future.
