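Both servers have keepalived running. The BACKUP server immediately transitions to MASTER STATE, so both servers end up as MASTER.
Both nodes are sending VRRP advertisements.
On the master server: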
[root@zhsq1 ~]# tcpdump -c 3 -i em1 host 224.0.0.18
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em1, link-type EN10MB (Ethernet), capture size 65535 bytes
11:01:35.526355 IP zhsq1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 153, authtype simple, intvl 1s, length 20
11:01:36.526497 IP zhsq1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 153, authtype simple, intvl 1s, length 20
11:01:37.527561 IP zhsq1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 153, authtype simple, intvl 1s, length 20
On the backup server:
[root@zhsq2 ~]# tcpdump -c 3 -i em1 host 224.0.0.18
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em1, link-type EN10MB (Ethernet), capture size 65535 bytes
11:11:04.314996 IP zhsq2 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 102, authtype simple, intvl 1s, length 20
11:11:05.315111 IP zhsq2 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 102, authtype simple, intvl 1s, length 20
11:11:06.316175 IP zhsq2 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 102, authtype simple, intvl 1s, length 20
Here is the master server log:
May 31 11:00:22 zhsq1 Keepalived[31475]: Starting Keepalived v1.2.7 (05/20,2013)
May 31 11:00:22 zhsq1 Keepalived[31476]: Starting Healthcheck child process, pid=31477
May 31 11:00:22 zhsq1 Keepalived[31476]: Starting VRRP child process, pid=31478
May 31 11:00:22 zhsq1 Keepalived_healthcheckers[31477]: Interface queue is empty
May 31 11:00:22 zhsq1 Keepalived_healthcheckers[31477]: No such interface, em2
May 31 11:00:22 zhsq1 Keepalived_healthcheckers[31477]: Netlink reflector reports IP 10.0.7.60 added
May 31 11:00:22 zhsq1 Keepalived_healthcheckers[31477]: Netlink reflector reports IP fe80::92b1:1cff:fe4c:bea8 added
May 31 11:00:22 zhsq1 Keepalived_healthcheckers[31477]: Registering Kernel netlink reflector
May 31 11:00:22 zhsq1 Keepalived_healthcheckers[31477]: Registering Kernel netlink command channel
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: Interface queue is empty
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: No such interface, em2
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: Netlink reflector reports IP 10.0.7.60 added
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: Netlink reflector reports IP fe80::92b1:1cff:fe4c:bea8 added
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: Registering Kernel netlink reflector
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: Registering Kernel netlink command channel
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: Registering gratuitous ARP shared channel
May 31 11:00:22 zhsq1 Keepalived_healthcheckers[31477]: Opening file '/etc/keepalived/keepalived.conf'.
May 31 11:00:22 zhsq1 Keepalived_healthcheckers[31477]: Configuration is using : 4661 Bytes
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: Opening file '/etc/keepalived/keepalived.conf'.
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: Configuration is using : 63856 Bytes
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: Using LinkWatch kernel netlink reflector...
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: VRRP sockpool: [ifindex(2), proto(112), fd(11,12)]
May 31 11:00:22 zhsq1 Keepalived_healthcheckers[31477]: Using LinkWatch kernel netlink reflector...
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: VRRP_Script(chk_http_port) succeeded
May 31 11:00:23 zhsq1 Keepalived_vrrp[31478]: VRRP_Instance(VI_1) Transition to MASTER STATE
May 31 11:00:24 zhsq1 Keepalived_vrrp[31478]: VRRP_Instance(VI_1) Entering MASTER STATE
May 31 11:00:24 zhsq1 Keepalived_vrrp[31478]: VRRP_Instance(VI_1) setting protocol VIPs.
May 31 11:00:24 zhsq1 Keepalived_vrrp[31478]: VRRP_Instance(VI_1) Sending gratuitous ARPs on em1 for 10.0.7.65
May 31 11:00:24 zhsq1 Keepalived_healthcheckers[31477]: Netlink reflector reports IP 10.0.7.65 added
May 31 11:00:29 zhsq1 Keepalived_vrrp[31478]: VRRP_Instance(VI_1) Sending gratuitous ARPs on em1 for 10.0.7.65
Here is the backup server log:
May 31 11:01:50 zhsq2 Keepalived[31250]: Starting Keepalived v1.2.7 (05/20,2013)
May 31 11:01:50 zhsq2 Keepalived[31251]: Starting Healthcheck child process, pid=31252
May 31 11:01:50 zhsq2 Keepalived[31251]: Starting VRRP child process, pid=31253
May 31 11:01:50 zhsq2 Keepalived_healthcheckers[31252]: Interface queue is empty
May 31 11:01:50 zhsq2 Keepalived_healthcheckers[31252]: No such interface, em2
May 31 11:01:50 zhsq2 Keepalived_healthcheckers[31252]: Netlink reflector reports IP 10.0.7.61 added
May 31 11:01:50 zhsq2 Keepalived_healthcheckers[31252]: Netlink reflector reports IP fe80::92b1:1cff:fe4c:b8b7 added
May 31 11:01:50 zhsq2 Keepalived_healthcheckers[31252]: Registering Kernel netlink reflector
May 31 11:01:50 zhsq2 Keepalived_healthcheckers[31252]: Registering Kernel netlink command channel
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: Interface queue is empty
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: No such interface, em2
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: Netlink reflector reports IP 10.0.7.61 added
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: Netlink reflector reports IP fe80::92b1:1cff:fe4c:b8b7 added
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: Registering Kernel netlink reflector
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: Registering Kernel netlink command channel
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: Registering gratuitous ARP shared channel
May 31 11:01:50 zhsq2 Keepalived_healthcheckers[31252]: Opening file '/etc/keepalived/keepalived.conf'.
May 31 11:01:50 zhsq2 Keepalived_healthcheckers[31252]: Configuration is using : 4661 Bytes
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: Opening file '/etc/keepalived/keepalived.conf'.
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: Configuration is using : 63856 Bytes
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: Using LinkWatch kernel netlink reflector...
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: VRRP_Instance(VI_1) Entering BACKUP STATE
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: VRRP sockpool: [ifindex(2), proto(112), fd(11,12)]
May 31 11:01:50 zhsq2 Keepalived_healthcheckers[31252]: Using LinkWatch kernel netlink reflector...
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: VRRP_Script(chk_http_port) succeeded
May 31 11:01:54 zhsq2 Keepalived_vrrp[31253]: VRRP_Instance(VI_1) Transition to MASTER STATE
May 31 11:01:55 zhsq2 Keepalived_vrrp[31253]: VRRP_Instance(VI_1) Entering MASTER STATE
May 31 11:01:55 zhsq2 Keepalived_vrrp[31253]: VRRP_Instance(VI_1) setting protocol VIPs.
May 31 11:01:55 zhsq2 Keepalived_vrrp[31253]: VRRP_Instance(VI_1) Sending gratuitous ARPs on em1 for 10.0.7.65
May 31 11:01:55 zhsq2 Keepalived_healthcheckers[31252]: Netlink reflector reports IP 10.0.7.65 added
May 31 11:02:00 zhsq2 Keepalived_vrrp[31253]: VRRP_Instance(VI_1) Sending gratuitous ARPs on em1 for 10.0.7.65
The keepalived configuration on the master server is as follows:
vrrp_script chk_http_port {
    script "/opt/nginx/nginx_pid.sh"
    interval 2
    weight 2
}
vrrp_instance VI_1 {
    state MASTER
    #nopreempt
    interface em1
    virtual_router_id 51
    priority 151
    mcast_src_ip 10.0.7.60
    track_interface {
        em1
    }
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
        chk_http_port
    }
    virtual_ipaddress {
        10.0.7.65 dev em1
    }
}
The keepalived configuration on the backup server is as follows:
vrrp_script chk_http_port {
    script "/opt/nginx/nginx_pid.sh"
    interval 2
    weight 2
}
vrrp_instance VI_1 {
    state BACKUP
    interface em1
    virtual_router_id 51
    priority 100
    mcast_src_ip 10.0.7.61
    track_interface {
        em1
    }
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
        chk_http_port
    }
    virtual_ipaddress {
        10.0.7.65 dev em1
    }
}
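A note on the numbers: the captures show prio 153 and 102, while the configurations set priority 151 and 100. This is consistent with keepalived adding a positive script weight (here weight 2) to the base priority while the tracked script succeeds, which the logs report ("VRRP_Script(chk_http_port) succeeded"). The sketch below is only a simplified model of that arithmetic; the function name effective_priority is hypothetical, and clamping of the result to the valid VRRP range is ignored.

```shell
# Simplified model of keepalived's track_script weighting (hypothetical helper,
# not part of keepalived; clamping to the valid priority range is ignored).
# usage: effective_priority BASE WEIGHT SCRIPT_OK   (SCRIPT_OK: 1 = success, 0 = failure)
effective_priority() {
    base=$1
    weight=$2
    script_ok=$3
    if [ "$weight" -gt 0 ] && [ "$script_ok" -eq 1 ]; then
        echo $((base + weight))   # positive weight: added while the script succeeds
    elif [ "$weight" -lt 0 ] && [ "$script_ok" -eq 0 ]; then
        echo $((base + weight))   # negative weight: subtracted while the script fails
    else
        echo "$base"
    fi
}

effective_priority 151 2 1   # master: prints 153
effective_priority 100 2 1   # backup: prints 102
```

So the advertised priorities match the configuration, and the priority values themselves are not what causes both nodes to become MASTER.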
The script used by chk_http_port is as follows:
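#!/bin/bash
# Count the running nginx processes.
NGINX_PROCESS=$(ps -C nginx --no-headers | wc -l)
if [ "$NGINX_PROCESS" -eq 0 ]; then
    # nginx is down: try to start it, then check again after 3 seconds.
    /usr/local/nginx/sbin/nginx
    sleep 3
    if [ "$(ps -C nginx --no-headers | wc -l)" -eq 0 ]; then
        # nginx still is not running: stop keepalived so the VIP can fail over.
        killall keepalived
    fi
fi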
Please help.
Thanks.
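Answer 1
No packets are passing between the two machines on the em1 interface (which results in a split-brain situation, as Mike said).
- Check the firewall to make sure the packets are not being blocked
- Check your network to make sure em1 on both machines is on the same network
Here is an example of one of these packets: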
Frame 2: 54 bytes on wire (432 bits), 54 bytes captured (432 bits)
Arrival Time: Jun 1, 2013 03:39:50.709520000 UTC
Epoch Time: 1370057990.709520000 seconds
[Time delta from previous captured frame: 0.000970000 seconds]
[Time delta from previous displayed frame: 0.000970000 seconds]
[Time since reference or first frame: 0.000970000 seconds]
Frame Number: 2
Frame Length: 54 bytes (432 bits)
Capture Length: 54 bytes (432 bits)
[Frame is marked: False]
[Frame is ignored: False]
[Protocols in frame: eth:ip:vrrp]
Ethernet II, Src: 00:25:90:83:b0:07 (00:25:90:83:b0:07), Dst: 01:00:5e:00:00:12 (01:00:5e:00:00:12)
Destination: 01:00:5e:00:00:12 (01:00:5e:00:00:12)
Address: 01:00:5e:00:00:12 (01:00:5e:00:00:12)
.... ...1 .... .... .... .... = IG bit: Group address (multicast/broadcast)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
Source: 00:25:90:83:b0:07 (00:25:90:83:b0:07)
Address: 00:25:90:83:b0:07 (00:25:90:83:b0:07)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
Type: IP (0x0800)
Internet Protocol Version 4, Src: 10.0.10.11 (10.0.10.11), Dst: 224.0.0.18 (224.0.0.18)
Version: 4
Header length: 20 bytes
Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00: Not-ECT (Not ECN-Capable Transport))
0000 00.. = Differentiated Services Codepoint: Default (0x00)
.... ..00 = Explicit Congestion Notification: Not-ECT (Not ECN-Capable Transport) (0x00)
Total Length: 40
Identification: 0x8711 (34577)
Flags: 0x00
0... .... = Reserved bit: Not set
.0.. .... = Don't fragment: Not set
..0. .... = More fragments: Not set
Fragment offset: 0
Time to live: 255
Protocol: VRRP (112)
Header checksum: 0x4037 [correct]
[Good: True]
[Bad: False]
Source: 10.0.10.11 (10.0.10.11)
Destination: 224.0.0.18 (224.0.0.18)
Virtual Router Redundancy Protocol
Version 2, Packet type 1 (Advertisement)
0010 .... = VRRP protocol version: 2
.... 0001 = VRRP packet type: Advertisement (1)
Virtual Rtr ID: 254
Priority: 151 (Non-default backup priority)
Addr Count: 1
Auth Type: No Authentication (0)
Adver Int: 1
Checksum: 0x3c01 [correct]
IP Address: 10.0.0.254 (10.0.0.254)
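One way to check whether the peer's advertisements actually arrive is to list the distinct senders that tcpdump sees. In the captures from the question, each host sees only its own advertisements, which is what you would expect if the multicast traffic is being dropped somewhere between the two machines. The following is a minimal sketch; the function name vrrp_senders is hypothetical, and it assumes tcpdump's one-line output format as shown in the question.

```shell
# Hypothetical helper: read tcpdump output on stdin and print each distinct
# host that sent a VRRP advertisement (the field just before ">").
vrrp_senders() {
    awk '/VRRPv[23], Advertisement/ {
        for (i = 2; i <= NF; i++)
            if ($i == ">") print $(i - 1)
    }' | sort -u
}

# Usage (as root, on either node):
#   tcpdump -n -l -c 10 -i em1 host 224.0.0.18 | vrrp_senders
```

If only the local address is printed on both nodes, neither node is receiving the other's advertisements, so each one concludes that it should be MASTER.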
Answer 2
Answer 3
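In my case, on CentOS/RHEL 8, I only needed to allow the vrrp protocol through the firewall with a rich rule to fix this Keepalived split-brain problem, in which both servers held the VIP. I also had to add a sysctl kernel flag to allow HAProxy to bind to the non-local VIP.
For sysctl, add net.ipv4.ip_nonlocal_bind = 1 to /etc/sysctl.conf, then run sysctl -p to reload the sysctl configuration. I needed this not for the Keepalived split-brain scenario, but so that HAProxy could bind to its own IP address for stats (e.g. bind 192.168.0.10:1492/stats) and to the VIP (virtual IP) address for balancing web traffic (bind 192.168.0.34:80 and bind 192.168.0.34:443). Otherwise, the HAProxy service failed to start, stating that it could not bind to ports 80 and 443 with the VIP address only. I was doing this to avoid having bind :80 and bind :443. Also, this may seem like a no-brainer, but it is easily overlooked: if you cannot reach the stats page, check that you have allowed the port you use for stats through the firewall.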
For the firewall, run the following commands:
# firewall-cmd --add-rich-rule='rule protocol value="vrrp" accept' --permanent
# firewall-cmd --reload
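I found these flags and other information directly in Red Hat's HAProxy and Keepalived documentation:
Non-local bind flag reference (I am not using Keepalived for load balancing, but this applies to HAProxy): https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/load_balancer_administration/s1-initial-setup-forwarding-vsa
Also, if HAProxy still cannot bind to a port, you may need to check whether good old SELinux is blocking it. For me, on CentOS 8, I had to run semanage port -a -t http_port_t -p tcp 1492 for my HAProxy stats page.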
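To confirm that the vrrp rich rule from this answer is actually active after the reload, the rule list can be checked. This is a small sketch, assuming firewall-cmd --list-rich-rules prints rules in the same syntax they were added with; the function name has_vrrp_rule is hypothetical.

```shell
# Hypothetical helper: read firewalld rich rules on stdin and succeed
# if one of them accepts the vrrp protocol.
has_vrrp_rule() {
    grep -Eq 'protocol value="vrrp".*accept'
}

# Usage (on a host running firewalld):
#   firewall-cmd --list-rich-rules | has_vrrp_rule && echo "vrrp allowed"
```

The rule needs to be present on both nodes, since each node must be able to receive the other's advertisements.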
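Answer 4
This problem has been solved.
The problem was in the switch settings. When the multicast filter mode is filter-all, the problem occurs. When the multicast filter mode is forward-all, the problem is resolved.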