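I have set up a Keepalived master/backup pair for HAProxy on AWS, using an Elastic IP (EIP) as the VIP. Occasionally the backup server triggers a failover even though the master node is healthy. The corresponding logs are below.
Backup server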
Oct 10 04:14:32 Prod-WebAccessLb2 Keepalived_vrrp[2271]: VRRP_Instance(ProdWebCluster) Transition to MASTER STATE
Oct 10 04:14:33 Prod-WebAccessLb2 Keepalived_vrrp[2271]: VRRP_Instance(ProdWebCluster) Entering MASTER STATE
Oct 10 04:14:33 Prod-WebAccessLb2 Keepalived_vrrp[2271]: Opening script file /etc/keepalived/master.sh
Oct 10 04:14:34 Prod-WebAccessLb2 Keepalived_vrrp[2271]: VRRP_Instance(ProdWebCluster) Received advert with higher priority 200, ours 100
Oct 10 04:14:34 Prod-WebAccessLb2 Keepalived_vrrp[2271]: VRRP_Instance(ProdWebCluster) Entering BACKUP STATE
Master server
Oct 10 04:14:35 Prod-WebAccessLb1 Keepalived_vrrp[1311]: VRRP_Instance(ProdWebCluster) Received advert with lower priority 100, ours 200, forcing new election
Oct 10 04:14:35 Prod-WebAccessLb1 Keepalived_vrrp[1311]: VRRP_Instance(ProdWebCluster) Received advert with lower priority 100, ours 200, forcing new election
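So, going by the logs, right after the failover the MASTER node immediately forces a failback (a new election), but it does not run master.sh, and the VIP is left in limbo.
Here is the master configuration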
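To make the order of events easier to follow, here is a minimal sketch that merges the two log excerpts above into one timeline. It is only my own helper, not part of Keepalived. The function names and the placeholder year are assumptions (syslog lines carry no year), and the duplicated master line is included only once.

```python
import re
from datetime import datetime

# Log lines copied from the excerpts above (backup Lb2 first, then master Lb1).
LOG = """\
Oct 10 04:14:32 Prod-WebAccessLb2 Keepalived_vrrp[2271]: VRRP_Instance(ProdWebCluster) Transition to MASTER STATE
Oct 10 04:14:33 Prod-WebAccessLb2 Keepalived_vrrp[2271]: VRRP_Instance(ProdWebCluster) Entering MASTER STATE
Oct 10 04:14:33 Prod-WebAccessLb2 Keepalived_vrrp[2271]: Opening script file /etc/keepalived/master.sh
Oct 10 04:14:34 Prod-WebAccessLb2 Keepalived_vrrp[2271]: VRRP_Instance(ProdWebCluster) Received advert with higher priority 200, ours 100
Oct 10 04:14:34 Prod-WebAccessLb2 Keepalived_vrrp[2271]: VRRP_Instance(ProdWebCluster) Entering BACKUP STATE
Oct 10 04:14:35 Prod-WebAccessLb1 Keepalived_vrrp[1311]: VRRP_Instance(ProdWebCluster) Received advert with lower priority 100, ours 200, forcing new election
"""

LINE_RE = re.compile(r"^(\w{3} +\d+ \d\d:\d\d:\d\d) (\S+) Keepalived_vrrp\[\d+\]: (.*)$")
ASSUMED_YEAR = "2000"  # placeholder only: syslog timestamps do not include a year


def parse(log):
    """Return (timestamp, host, message) tuples sorted by timestamp."""
    events = []
    for line in log.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        stamp = datetime.strptime(
            ASSUMED_YEAR + " " + match.group(1), "%Y %b %d %H:%M:%S"
        )
        events.append((stamp, match.group(2), match.group(3)))
    # sorted() is stable, so lines sharing a second keep their original order.
    return sorted(events, key=lambda event: event[0])


def hosts_running_master_sh(events):
    """Hosts that logged opening /etc/keepalived/master.sh."""
    return {host for _, host, msg in events if msg.endswith("/etc/keepalived/master.sh")}


def master_duration(events, host):
    """Seconds between 'Entering MASTER STATE' and the next 'Entering BACKUP STATE'."""
    entered = None
    for stamp, event_host, msg in events:
        if event_host != host:
            continue
        if msg.endswith("Entering MASTER STATE"):
            entered = stamp
        elif msg.endswith("Entering BACKUP STATE") and entered is not None:
            return (stamp - entered).total_seconds()
    return None


events = parse(LOG)
for stamp, host, msg in events:
    print(stamp.strftime("%H:%M:%S"), host, msg)
print("master.sh opened on:", hosts_running_master_sh(events))
print("Lb2 seconds as MASTER:", master_duration(events, "Prod-WebAccessLb2"))
```

Run against these excerpts, the sketch reports master.sh being opened only on Prod-WebAccessLb2, which held the MASTER state for about one second. Syslog timestamps have one-second resolution, so that figure is approximate.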
vrrp_script chk_haproxy {
    script "/bin/pidof haproxy"
    interval 1
}
vrrp_instance ProdWebCluster {
    debug 2
    interface eth0
    state MASTER
    virtual_router_id 33
    priority 151
    unicast_src_ip 10.186.2.10
    unicast_peer {
        10.186.6.10
    }
    authentication {
        auth_type PASS
        auth_pass xxx
    }
    track_script {
        chk_haproxy
    }
Backup server configuration
vrrp_instance ProdWebCluster {
    debug 2
    interface eth0
    state BACKUP
    virtual_router_id 33
    priority 100
    unicast_src_ip 10.186.6.10
    unicast_peer {
        10.186.2.10
    }
    authentication {
        auth_type PASS
        auth_pass xxxx
    }
    track_script {
        chk_haproxy
    }
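Can anyone tell me why the failover is triggered in the first place, and why master.sh is not run during the failback?
Note: when I run master.sh manually on either the backup or the master, it works as expected.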