Keepalived triggers a failover while the MASTER node is healthy

2024-5-31

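I have set up Keepalived in a master/backup pair for HAProxy in AWS, using an EIP as the VIP. Occasionally the backup server triggers a failover even though the master node is healthy. The relevant logs are below.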

Backup server

Oct 10 04:14:32 Prod-WebAccessLb2 Keepalived_vrrp[2271]: VRRP_Instance(ProdWebCluster) Transition to MASTER STATE
Oct 10 04:14:33 Prod-WebAccessLb2 Keepalived_vrrp[2271]: VRRP_Instance(ProdWebCluster) Entering MASTER STATE
Oct 10 04:14:33 Prod-WebAccessLb2 Keepalived_vrrp[2271]: Opening script file /etc/keepalived/master.sh
Oct 10 04:14:34 Prod-WebAccessLb2 Keepalived_vrrp[2271]: VRRP_Instance(ProdWebCluster) Received advert with higher priority 200, ours 100
Oct 10 04:14:34 Prod-WebAccessLb2 Keepalived_vrrp[2271]: VRRP_Instance(ProdWebCluster) Entering BACKUP STATE

Master server

Oct 10 04:14:35 Prod-WebAccessLb1 Keepalived_vrrp[1311]: VRRP_Instance(ProdWebCluster) Received advert with lower priority 100, ours 200, forcing new election
Oct 10 04:14:35 Prod-WebAccessLb1 Keepalived_vrrp[1311]: VRRP_Instance(ProdWebCluster) Received advert with lower priority 100, ours 200, forcing new election

So, judging from the logs, right after the failover the MASTER node immediately triggers a failback, but it does not run master.sh, and the VIP is left in a hanging state.
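To make that reading of the logs easier to check, here is a minimal sketch that parses the lines quoted above and reports how long the backup held the MASTER role and which host logged a script invocation. The function names (`parse_events`, `master_window`, `ran_script`) are made up for illustration, and the placeholder year is an assumption, since syslog timestamps carry none. It also assumes the two hosts' clocks are roughly in sync, which the logs alone cannot confirm.

```python
import re
from datetime import datetime

# Log lines copied verbatim from the post: backup node (Lb2), then master node (Lb1).
LOG = """\
Oct 10 04:14:32 Prod-WebAccessLb2 Keepalived_vrrp[2271]: VRRP_Instance(ProdWebCluster) Transition to MASTER STATE
Oct 10 04:14:33 Prod-WebAccessLb2 Keepalived_vrrp[2271]: VRRP_Instance(ProdWebCluster) Entering MASTER STATE
Oct 10 04:14:33 Prod-WebAccessLb2 Keepalived_vrrp[2271]: Opening script file /etc/keepalived/master.sh
Oct 10 04:14:34 Prod-WebAccessLb2 Keepalived_vrrp[2271]: VRRP_Instance(ProdWebCluster) Received advert with higher priority 200, ours 100
Oct 10 04:14:34 Prod-WebAccessLb2 Keepalived_vrrp[2271]: VRRP_Instance(ProdWebCluster) Entering BACKUP STATE
Oct 10 04:14:35 Prod-WebAccessLb1 Keepalived_vrrp[1311]: VRRP_Instance(ProdWebCluster) Received advert with lower priority 100, ours 200, forcing new election
Oct 10 04:14:35 Prod-WebAccessLb1 Keepalived_vrrp[1311]: VRRP_Instance(ProdWebCluster) Received advert with lower priority 100, ours 200, forcing new election
"""

LINE_RE = re.compile(
    r"^(?P<ts>\w{3} +\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"Keepalived_vrrp\[\d+\]: (?P<msg>.*)$"
)


def parse_events(log):
    """Return (timestamp, host, message) tuples sorted by timestamp."""
    events = []
    for line in log.splitlines():
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip anything that is not a Keepalived_vrrp line
        # Syslog omits the year; 2000 is an arbitrary placeholder (assumption).
        ts = datetime.strptime("2000 " + m["ts"], "%Y %b %d %H:%M:%S")
        events.append((ts, m["host"], m["msg"]))
    return sorted(events, key=lambda e: e[0])


def master_window(events, host):
    """Seconds between 'Transition to MASTER STATE' and the next 'Entering BACKUP STATE'."""
    start = next(ts for ts, h, msg in events
                 if h == host and "Transition to MASTER STATE" in msg)
    end = next(ts for ts, h, msg in events
               if h == host and "Entering BACKUP STATE" in msg and ts >= start)
    return (end - start).total_seconds()


def ran_script(events, host):
    """True if the host logged an 'Opening script file' line."""
    return any(h == host and msg.startswith("Opening script file")
               for _, h, msg in events)


events = parse_events(LOG)
for host in ("Prod-WebAccessLb2", "Prod-WebAccessLb1"):
    print(host, "opened a script:", ran_script(events, host))
print("Lb2 master window (s):", master_window(events, "Prod-WebAccessLb2"))
```

In the quoted excerpt, only the backup logs "Opening script file", which matches the observation that master.sh does not run on the master during failback. The excerpt also shows the master advertising priority 200, whereas the configuration posted below sets 151.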

Here is the master configuration:

vrrp_script chk_haproxy {
    script "/bin/pidof haproxy"
    interval 1
}

vrrp_instance ProdWebCluster {
    debug 2
    interface eth0
    state MASTER
    virtual_router_id 33
    priority 151
    unicast_src_ip 10.186.2.10

    unicast_peer {
        10.186.6.10
    }

    authentication {
        auth_type PASS
        auth_pass xxx
    }

    track_script {
        chk_haproxy
    }

Backup server configuration:

vrrp_instance ProdWebCluster {
    debug 2
    interface eth0
    state BACKUP
    virtual_router_id 33
    priority 100
    unicast_src_ip 10.186.6.10

    unicast_peer {
        10.186.2.10
    }

    authentication {
        auth_type PASS
        auth_pass xxxx
    }

    track_script {
        chk_haproxy
    }

Can anyone tell me why it triggers the failover in the first place, and why master.sh is not run during the failback?
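On the first question, it may help to put a number on how long the backup waits before promoting itself. The sketch below applies the VRRPv2 timer formulas from RFC 3768 (Skew_Time = (256 - Priority) / 256, Master_Down_Interval = 3 * Advertisement_Interval + Skew_Time). It assumes VRRPv2 and a 1-second advertisement interval, since the posted configuration sets neither `advert_int` nor a VRRP version. The function names are illustrative only.

```python
def skew_time(priority: int) -> float:
    """VRRPv2 skew time in seconds (RFC 3768): (256 - priority) / 256."""
    return (256 - priority) / 256


def master_down_interval(priority: int, advert_int: float = 1.0) -> float:
    """Seconds without adverts before a backup declares the master down.

    advert_int defaults to 1 second, an assumption here because the
    posted configuration does not set it.
    """
    return 3 * advert_int + skew_time(priority)


# Priority 100 is the backup's configured priority in the post.
print(f"backup waits about {master_down_interval(100):.3f} s without adverts")
```

Under those assumptions the backup would only transition to MASTER after several seconds with no unicast adverts arriving from 10.186.2.10, so a short gap in advert delivery between the two nodes would be enough to produce the behaviour in the logs even while HAProxy on the master is fine.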

Note: when I run the master.sh script manually on either the backup or the master, it works as expected.
