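I have keepalived set up on 2 RHEL 7.8 VMs to provide HA for a shared VIP. The VIP works and moves to each server as expected. However, I am seeing two problems that both appear to be caused by the MAC address for the VIP not being updated.
- Most commonly, Server2 becomes master and takes over the VIP, but traffic for the VIP keeps arriving at Server1 first and is then passed on to Server2, which actually holds the VIP.
- Occasionally the forwarding described above does not happen and all traffic stops at Server1. A burst of SYN packets reaches Server1 (which does not hold the VIP) and dies there. Server2 never receives the traffic even though it holds the VIP.
The check and notify scripts both work correctly. The VIP moves to whichever server I expect to be master. The problem is that the MAC address associated with the VIP is not updated.
I have tried various garp_* settings without success. This is my current configuration:
Server1 = 192.168.1.10, Server2 = 192.168.1.11, VIP = 192.168.1.15, Workstation = 172.16.1.10
Server1 keepalived.conf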
! Configuration File for keepalived
global_defs {
vrrp_garp_master_refresh 10
vrrp_garp_master_refresh_repeat 2
vrrp_garp_lower_prio_repeat 2
vrrp_higher_prio_send_advert true
enable_script_security
script_user root
}
vrrp_script chk_nginx_service {
script "/usr/libexec/keepalived/nginx-ha-check.sh"
interval 2
weight 50
rise 2
fall 2
}
vrrp_instance VI_1 {
state MASTER
interface ens192
virtual_router_id 51
priority 101
advert_int 1
unicast_src_ip 192.168.1.10
unicast_peer {
192.168.1.11
}
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.1.15
}
track_script {
chk_nginx_service
}
notify "/usr/libexec/keepalived/nginx-ha-notify.sh"
}
Server2 keepalived.conf
! Configuration File for keepalived
global_defs {
vrrp_garp_master_refresh 10
vrrp_garp_master_refresh_repeat 2
vrrp_garp_lower_prio_repeat 2
vrrp_higher_prio_send_advert true
enable_script_security
script_user root
}
vrrp_script chk_nginx_service {
script "/usr/libexec/keepalived/nginx-ha-check.sh"
interval 2
weight 50
rise 2
fall 2
}
vrrp_instance VI_1 {
state MASTER
interface ens192
virtual_router_id 51
priority 101
advert_int 1
unicast_src_ip 192.168.1.11
unicast_peer {
192.168.1.10
}
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.1.15
}
track_script {
chk_nginx_service
}
notify "/usr/libexec/keepalived/nginx-ha-notify.sh"
}
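Since both configurations above use the same base priority (101) and the same track_script weight (50), a short sketch may help show which node would be expected to win the election. It is only an illustration under two assumptions: that a positive weight is added to the priority while the check script succeeds, and that a priority tie goes to the node with the higher primary IP address (the usual VRRP rule). The function names `effective_priority` and `elect_master` are hypothetical and not part of keepalived.

```python
import ipaddress


def effective_priority(base, weight, check_ok):
    """Assumed keepalived behaviour: a positive weight is added while the
    check succeeds; a negative weight is applied while the check fails.
    (weight 0, which puts the instance in FAULT on failure, is not modelled.)"""
    if weight > 0:
        return base + weight if check_ok else base
    return base + weight if not check_ok else base


def elect_master(nodes):
    """nodes: list of (name, ip, priority). Highest priority wins;
    on a tie the higher IP address wins (assumed VRRP tie-break)."""
    winner = max(nodes, key=lambda n: (n[2], ipaddress.ip_address(n[1])))
    return winner[0]


# Values taken from the two keepalived.conf files above,
# with the nginx check passing on both servers.
nodes = [
    ("server1", "192.168.1.10", effective_priority(101, 50, True)),
    ("server2", "192.168.1.11", effective_priority(101, 50, True)),
]
print(elect_master(nodes))  # prints "server2"
```

Under these assumptions, both nodes sit at priority 151 while nginx is healthy, so the outcome depends on the address tie-break alone.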
Server1 /var/log/messages when keepalived is restarted or reloaded
Sep 14 11:33:55 server1 systemd: Reloading LVS and VRRP High Availability Monitor.
Sep 14 11:33:55 server1 systemd: Reloaded LVS and VRRP High Availability Monitor.
Sep 14 11:33:55 server1 Keepalived_vrrp[99145]: Registering Kernel netlink reflector
Sep 14 11:33:55 server1 Keepalived_vrrp[99145]: Registering Kernel netlink command channel
Sep 14 11:33:55 server1 Keepalived_vrrp[99145]: Registering gratuitous ARP shared channel
Sep 14 11:33:55 server1 Keepalived_vrrp[99145]: Opening file '/etc/keepalived/keepalived.conf'.
Sep 14 11:33:55 server1 Keepalived_vrrp[99145]: VRRP_Script(chk_nginx_service) considered successful on reload
Sep 14 11:33:55 server1 Keepalived_vrrp[99145]: Using LinkWatch kernel netlink reflector...
Sep 14 11:33:55 server1 Keepalived_vrrp[99145]: VRRP_Instance(VI_1) Entering BACKUP STATE
Sep 14 11:33:55 server1 Keepalived_vrrp[99145]: VRRP sockpool: [ifindex(2), proto(112), unicast(1), fd(10,11)]
Server2 /var/log/messages when keepalived is restarted or reloaded
Sep 14 09:33:48 server2 Keepalived_vrrp[21124]: Registering Kernel netlink reflector
Sep 14 09:33:48 server2 Keepalived_vrrp[21124]: Registering Kernel netlink command channel
Sep 14 09:33:48 server2 Keepalived_vrrp[21124]: Registering gratuitous ARP shared channel
Sep 14 09:33:48 server2 Keepalived_vrrp[21124]: Opening file '/etc/keepalived/keepalived.conf'.
Sep 14 09:33:48 server2 Keepalived_vrrp[21124]: VRRP_Script(chk_nginx_service) considered successful on reload
Sep 14 09:33:48 server2 Keepalived_vrrp[21124]: VRRP_Instance(VI_1) setting protocol VIPs.
Sep 14 09:33:48 server2 Keepalived_vrrp[21124]: Using LinkWatch kernel netlink reflector...
Sep 14 09:33:48 server2 Keepalived_vrrp[21124]: VRRP sockpool: [ifindex(2), proto(112), unicast(1), fd(10,13)]
Sep 14 09:33:49 server2 Keepalived_vrrp[21124]: VRRP_Instance(VI_1) Transition to MASTER STATE
Sep 14 09:33:50 server2 Keepalived_vrrp[21124]: Sending gratuitous ARP on ens192 for 192.168.1.15
Sep 14 09:33:50 server2 Keepalived_vrrp[21124]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on ens192 for 192.168.1.15
Sep 14 09:33:50 server2 Keepalived_vrrp[21124]: Sending gratuitous ARP on ens192 for 192.168.1.15
Sep 14 09:34:00 server2 Keepalived_vrrp[21124]: Sending gratuitous ARP on ens192 for 192.168.1.15
Sep 14 09:34:00 server2 Keepalived_vrrp[21124]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on ens192 for 192.168.1.15
Sep 14 09:34:00 server2 Keepalived_vrrp[21124]: Sending gratuitous ARP on ens192 for 192.168.1.15
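As a rough cross-check of the Server2 log above, the sketch below pulls the timestamps of the "Sending gratuitous ARP" lines and measures the gap between bursts, which can be compared with the vrrp_garp_master_refresh value of 10 in the configuration. The helper names `garp_times` and `burst_gaps` are hypothetical, and the year 2000 is only a placeholder because syslog timestamps carry no year.

```python
from datetime import datetime

# The four "Sending gratuitous ARP" lines copied from the Server2 log above.
LOG_LINES = [
    "Sep 14 09:33:50 server2 Keepalived_vrrp[21124]: Sending gratuitous ARP on ens192 for 192.168.1.15",
    "Sep 14 09:33:50 server2 Keepalived_vrrp[21124]: Sending gratuitous ARP on ens192 for 192.168.1.15",
    "Sep 14 09:34:00 server2 Keepalived_vrrp[21124]: Sending gratuitous ARP on ens192 for 192.168.1.15",
    "Sep 14 09:34:00 server2 Keepalived_vrrp[21124]: Sending gratuitous ARP on ens192 for 192.168.1.15",
]


def garp_times(lines):
    """Return the timestamp of every line that reports a gratuitous ARP."""
    times = []
    for line in lines:
        if "Sending gratuitous ARP on" in line:
            stamp = " ".join(line.split()[:3])  # e.g. "Sep 14 09:33:50"
            # Placeholder year: syslog lines do not include one.
            times.append(datetime.strptime("2000 " + stamp, "%Y %b %d %H:%M:%S"))
    return times


def burst_gaps(times):
    """Seconds between consecutive distinct GARP timestamps."""
    distinct = sorted(set(times))
    return [int((b - a).total_seconds()) for a, b in zip(distinct, distinct[1:])]


print(burst_gaps(garp_times(LOG_LINES)))  # prints [10]
```

A 10-second gap would suggest the periodic refresh is firing on Server2, so the gratuitous ARPs are at least being sent; whether the upstream device honours them is a separate question.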
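Below are tcpdump captures from both servers for the scenario above, in which Server1 ends up acting as a proxy for Server2.
tcpdump on Server1 (was master, but forced to backup for testing)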
10:28:51.359055 IP 172.16.1.10.58541 > 192.168.1.15.80: Flags [.], seq 2599:2600, ack 41616, win 1024, length 1: HTTP
10:28:51.359083 IP 172.16.1.10.58542 > 192.168.1.15.80: Flags [.], seq 894:895, ack 469, win 1022, length 1: HTTP
10:28:51.359107 IP 155.155.230.143.58541 > 192.168.1.15.80: Flags [.], seq 2599:2600, ack 41616, win 1024, length 1: HTTP
10:28:51.359117 IP 155.155.230.143.58542 > 192.168.1.15.80: Flags [.], seq 894:895, ack 469, win 1022, length 1: HTTP
10:28:51.366289 IP 192.168.1.15.80 > 192.168.1.10.58542: Flags [.], ack 895, win 31, options [nop,nop,sack 1 {894:895}], length 0
10:28:51.366297 IP 192.168.1.15.80 > 192.168.1.10.58541: Flags [.], ack 2600, win 35, options [nop,nop,sack 1 {2599:2600}], length 0
10:28:51.366309 IP 192.168.1.15.80 > 172.16.1.10.58542: Flags [.], ack 895, win 31, options [nop,nop,sack 1 {894:895}], length 0
10:28:51.366319 IP 192.168.1.15.80 > 172.16.1.10.58541: Flags [.], ack 2600, win 35, options [nop,nop,sack 1 {2599:2600}], length 0
10:28:56.295845 IP 192.168.1.15.80 > 192.168.1.10.58542: Flags [F.], seq 469, ack 895, win 31, length 0
10:28:56.295859 IP 192.168.1.15.80 > 192.168.1.10.58541: Flags [F.], seq 41616, ack 2600, win 35, length 0
10:28:56.295892 IP 192.168.1.15.80 > 172.16.1.10.58542: Flags [F.], seq 469, ack 895, win 31, length 0
10:28:56.295897 IP 192.168.1.15.80 > 172.16.1.10.58541: Flags [F.], seq 41616, ack 2600, win 35, length 0
10:28:56.299555 IP 172.16.1.10.58541 > 192.168.1.15.80: Flags [.], ack 41617, win 1024, length 0
10:28:56.299578 IP 192.168.1.10.58541 > 192.168.1.15.80: Flags [.], ack 41617, win 1024, length 0
10:28:56.299589 IP 172.16.1.10.58542 > 192.168.1.15.80: Flags [.], ack 470, win 1022, length 0
10:28:56.299614 IP 192.168.1.10.58542 > 192.168.1.15.80: Flags [.], ack 470, win 1022, length 0
10:28:56.299789 IP 172.16.1.10.58541 > 192.168.1.15.80: Flags [F.], seq 2600, ack 41617, win 1024, length 0
10:28:56.299808 IP 192.168.1.10.58541 > 192.168.1.15.80: Flags [F.], seq 2600, ack 41617, win 1024, length 0
10:28:56.300063 IP 172.16.1.10.58542 > 192.168.1.15.80: Flags [F.], seq 895, ack 470, win 1022, length 0
10:28:56.300080 IP 192.168.1.10.58542 > 192.168.1.15.80: Flags [F.], seq 895, ack 470, win 1022, length 0
10:28:56.306882 IP 192.168.1.15.80 > 192.168.1.10.58541: Flags [.], ack 2601, win 35, length 0
10:28:56.306911 IP 192.168.1.15.80 > 172.16.1.10.58541: Flags [.], ack 2601, win 35, length 0
tcpdump on Server2 (was backup, now master holding the VIP)
12:27:50.343649 IP 192.168.1.11.58541 > 192.168.1.15.80: Flags [.], ack 9655, win 1024, length 0
12:27:50.343675 IP 192.168.1.15.80 > 192.168.1.11.58541: Flags [.], seq 20659:23419, ack 409, win 32, length 2760: HTTP
12:27:50.343687 IP 192.168.1.15.80 > 192.168.1.11.58541: Flags [P.], seq 23419:24460, ack 409, win 32, length 1041: HTTP
12:27:50.343694 IP 192.168.1.11.58541 > 192.168.1.15.80: Flags [.], ack 12379, win 1024, length 0
12:27:50.354554 IP 192.168.1.11.58541 > 192.168.1.15.80: Flags [.], ack 13759, win 1024, length 0
12:27:50.354864 IP 192.168.1.11.58541 > 192.168.1.15.80: Flags [.], ack 24460, win 1024, length 0
12:27:51.023348 IP 192.168.1.11.58541 > 192.168.1.15.80: Flags [P.], seq 409:843, ack 24460, win 1024, length 434: HTTP: GET /api/dashboards/home HTTP/1.1
12:27:51.039476 IP 192.168.1.15.80 > 192.168.1.11.58541: Flags [P.], seq 24460:26214, ack 843, win 33, length 1754: HTTP: HTTP/1.1 200 OK
12:27:51.050345 IP 192.168.1.11.58541 > 192.168.1.15.80: Flags [.], ack 26214, win 1024, length 0
12:27:51.190099 IP 192.168.1.11.58541 > 192.168.1.15.80: Flags [P.], seq 843:1287, ack 26214, win 1024, length 444: HTTP: GET /api/plugins?core=0&embedded=0 HTTP/1.1
12:27:51.205890 IP 192.168.1.15.80 > 192.168.1.11.58541: Flags [P.], seq 26214:26448, ack 1287, win 34, length 234: HTTP: HTTP/1.1 200 OK
12:27:51.244474 IP 192.168.1.11.58541 > 192.168.1.15.80: Flags [P.], seq 1287:1721, ack 26448, win 1023, length 434: HTTP: GET /api/search?limit=30 HTTP/1.1