我无法在 xen domU 上正确运行 keepalived。
我正在关注关联配置,它在某些本地虚拟机(使用 KVM 运行)上运行良好。如果我设置完全相同的配置,但在 xen domU 上,它不起作用:两个服务器看不到对方并决定成为主服务器(10.10.0.200 是虚拟 IP)
$ sudo ip addr sh eth0 # host1
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:16:3e:73:b0:78 brd ff:ff:ff:ff:ff:ff
inet 10.10.0.100/24 brd 10.10.0.255 scope global eth0
inet 10.10.0.200/32 scope global eth0
inet6 fe80::216:3eff:fe73:b078/64 scope link
valid_lft forever preferred_lft forever
$ sudo ip addr sh eth0 # host2
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:16:3e:ee:5e:fd brd ff:ff:ff:ff:ff:ff
inet 10.10.0.101/24 brd 10.10.0.255 scope global eth0
inet 10.10.0.200/32 scope global eth0
inet6 fe80::216:3eff:feee:5efd/64 scope link
valid_lft forever preferred_lft forever
有什么方法可以调试这个吗 - 似乎有些人能够按照一些邮件列表在 xen 上使用 keepalived,但没有太多关于他们的配置的信息。
域 0 有两个“真实”的以太网卡,eth0 和 eth1,其中 eth0 连接到网络:
- eth0 正在监听 192.168.3.9
- eth1 正在监听 10.10.0.1
我的 xend 配置是:
(xend-relocation-server no)
(network-script 'network-nat netdev=eth1')
(vif-script vif-nat)
(dom0-min-mem 1024)
(enable-dom0-ballooning no)
(total_available_memory 0)
(dom0-cpus 0)
(vncpasswd '')
xend 中 /etc/hosts 中的相关部分是:
10.10.0.100 test1 test1
10.10.0.101 test2 test2
每个 domU(test1 和 test2)分别配置为 10.10.0.100 和 10.10.0.101。每个 domU 都可以通过这些名称相互 ping 通(目前通过 /etc/hosts 手动配置)。虚拟 IP 为 10.10.0.200
请注意,目前,我不太关心 dom0 中的网络配置(桥接与...),我希望第一步就是让 keepalived 在 domU 之间工作
dom0 上的当前 IP 表:
# Generated by iptables-save v1.4.8 on Tue Apr 19 12:52:04 2011
*filter
:INPUT ACCEPT [37536:5302365]
:FORWARD ACCEPT [5367:1221790]
:OUTPUT ACCEPT [30601:3514407]
-A FORWARD -m state --state RELATED,ESTABLISHED -m physdev --physdev-out vif8.0 -j ACCEPT
-A FORWARD -p udp -m physdev --physdev-in vif8.0 -m udp --sport 68 --dport 67 -j ACCEPT
-A FORWARD -m state --state RELATED,ESTABLISHED -m physdev --physdev-out vif8.0 -j ACCEPT
-A FORWARD -s 10.10.0.101/32 -m physdev --physdev-in vif8.0 -j ACCEPT
COMMIT
# Completed on Tue Apr 19 12:52:04 2011
# Generated by iptables-save v1.4.8 on Tue Apr 19 12:52:04 2011
*nat
:PREROUTING ACCEPT [1441667:468129452]
:POSTROUTING ACCEPT [608454:36641119]
:OUTPUT ACCEPT [608448:36640127]
-A POSTROUTING -o eth1 -j MASQUERADE
-A POSTROUTING -o eth1 -j MASQUERADE
-A POSTROUTING -o eth1 -j MASQUERADE
-A POSTROUTING -s 10.10.0.0/24 -o eth0 -j SNAT --to-source 192.168.3.9
COMMIT
# Completed on Tue Apr 19 12:52:04 2011
至于保持活动配置:
# test1 config
vrrp_script chk_haproxy { # Requires keepalived-1.1.13
script "killall -0 haproxy" # cheaper than pidof
interval 2 # check every 2 seconds
weight 2 # add 2 points of prio if OK
}
vrrp_instance VI_1 {
interface eth0
state MASTER
virtual_router_id 51
priority 101 # 101 on master, 100 on backup
virtual_ipaddress {
10.10.0.200
}
track_script {
chk_haproxy
}
}
对于测试2:
vrrp_script chk_haproxy { # Requires keepalived-1.1.13
script "killall -0 haproxy" # cheaper than pidof
interval 2 # check every 2 seconds
weight 2 # add 2 points of prio if OK
}
vrrp_instance VI_1 {
interface eth0
state MASTER
virtual_router_id 51
priority 100 # 101 on master, 100 on backup
virtual_ipaddress {
10.10.0.200
}
track_script {
chk_haproxy
}
}
每个主机都可以互相“arping”:
# on test1
sudo arping test2
ARPING 10.10.0.101 from 10.10.0.100 eth0
Unicast reply from 10.10.0.101 [FE:FF:FF:FF:FF:FF] 751.879ms
Unicast reply from 10.10.0.101 [FE:FF:FF:FF:FF:FF] 0.626ms
...
# on test2
sudo arping test1
ARPING 10.10.0.100 from 10.10.0.101 eth0
Unicast reply from 10.10.0.100 [FE:FF:FF:FF:FF:FF] 105.399ms
Unicast reply from 10.10.0.100 [FE:FF:FF:FF:FF:FF] 0.655ms
[编辑] 如果我从 keepalived 配置中删除 track_script 行并重新启动,我会收到以下日志:
Apr 19 13:35:06 test1 Keepalived: Terminating on signal
Apr 19 13:35:06 test1 Keepalived: Stopping Keepalived v1.1.20 (08/18,2010)
Apr 19 13:35:06 test1 Keepalived_vrrp: Terminating VRRP child process on signal
Apr 19 13:35:06 test1 Keepalived_healthcheckers: Terminating Healthchecker child process on signal
Apr 19 13:35:07 test1 Keepalived: Starting Keepalived v1.1.20 (08/18,2010)
Apr 19 13:35:07 test1 Keepalived: Starting Healthcheck child process, pid=4848
Apr 19 13:35:07 test1 Keepalived: Starting VRRP child process, pid=4849
Apr 19 13:35:07 test1 Keepalived_healthcheckers: Initializing ipvs 2.6
Apr 19 13:35:07 test1 Keepalived_vrrp: Registering Kernel netlink reflector
Apr 19 13:35:07 test1 Keepalived_vrrp: Registering Kernel netlink command channel
Apr 19 13:35:07 test1 Keepalived_vrrp: Registering gratutious ARP shared channel
Apr 19 13:35:07 test1 Keepalived_vrrp: Initializing ipvs 2.6
Apr 19 13:35:07 test1 Keepalived_healthcheckers: IPVS: Can't initialize ipvs: Protocol not available
Apr 19 13:35:07 test1 Keepalived_healthcheckers: Registering Kernel netlink reflector
Apr 19 13:35:07 test1 Keepalived_healthcheckers: Registering Kernel netlink command channel
Apr 19 13:35:07 test1 Keepalived_healthcheckers: Opening file '/etc/keepalived/keepalived.conf'.
Apr 19 13:35:07 test1 Keepalived_vrrp: IPVS: Can't initialize ipvs: Protocol not available
Apr 19 13:35:07 test1 Keepalived_vrrp: Opening file '/etc/keepalived/keepalived.conf'.
Apr 19 13:35:07 test1 Keepalived_healthcheckers: Configuration is using : 3103 Bytes
Apr 19 13:35:07 test1 Keepalived_healthcheckers: Using LinkWatch kernel netlink reflector...
Apr 19 13:35:07 test1 Keepalived_vrrp: Configuration is using : 31958 Bytes
Apr 19 13:35:07 test1 Keepalived_vrrp: Using LinkWatch kernel netlink reflector...
Apr 19 13:35:08 test1 Keepalived_vrrp: VRRP_Instance(VI_1) Transition to MASTER STATE
Apr 19 13:35:09 test1 Keepalived_vrrp: VRRP_Instance(VI_1) Entering MASTER STATE
和:
Apr 19 13:34:43 test2 Keepalived: Terminating on signal
Apr 19 13:34:43 test2 Keepalived: Stopping Keepalived v1.1.20 (08/18,2010)
Apr 19 13:34:43 test2 Keepalived_vrrp: Terminating VRRP child process on signal
Apr 19 13:34:43 test2 Keepalived_healthcheckers: Terminating Healthchecker child process on signal
Apr 19 13:34:44 test2 Keepalived: Starting Keepalived v1.1.20 (08/18,2010)
Apr 19 13:34:44 test2 Keepalived: Starting Healthcheck child process, pid=3811
Apr 19 13:34:44 test2 Keepalived: Starting VRRP child process, pid=3812
Apr 19 13:34:44 test2 Keepalived_healthcheckers: Initializing ipvs 2.6
Apr 19 13:34:44 test2 Keepalived_vrrp: Registering Kernel netlink reflector
Apr 19 13:34:44 test2 Keepalived_vrrp: Registering Kernel netlink command channel
Apr 19 13:34:44 test2 Keepalived_vrrp: Registering gratutious ARP shared channel
Apr 19 13:34:44 test2 Keepalived_vrrp: Initializing ipvs 2.6
Apr 19 13:34:44 test2 Keepalived_healthcheckers: IPVS: Can't initialize ipvs: Protocol not available
Apr 19 13:34:44 test2 Keepalived_healthcheckers: Registering Kernel netlink reflector
Apr 19 13:34:44 test2 Keepalived_healthcheckers: Registering Kernel netlink command channel
Apr 19 13:34:44 test2 Keepalived_healthcheckers: Opening file '/etc/keepalived/keepalived.conf'.
Apr 19 13:34:44 test2 Keepalived_vrrp: IPVS: Can't initialize ipvs: Protocol not available
Apr 19 13:34:44 test2 Keepalived_healthcheckers: Configuration is using : 3103 Bytes
Apr 19 13:34:44 test2 Keepalived_healthcheckers: Using LinkWatch kernel netlink reflector...
Apr 19 13:34:44 test2 Keepalived_vrrp: Opening file '/etc/keepalived/keepalived.conf'.
Apr 19 13:34:44 test2 Keepalived_vrrp: Configuration is using : 31958 Bytes
Apr 19 13:34:44 test2 Keepalived_vrrp: Using LinkWatch kernel netlink reflector...
Apr 19 13:34:45 test2 Keepalived_vrrp: VRRP_Instance(VI_1) Transition to MASTER STATE
Apr 19 13:34:46 test2 Keepalived_vrrp: VRRP_Instance(VI_1) Entering MASTER STATE
答案1
'状态 MASTER' 会使事情变得混乱,因为它们最初都会转换为 MASTER 并假定 IP(根据您的日志) - 您只希望其中一个是 MASTER,而另一个是 BACKUP(因此一个从被动状态开始)。
但是,由于他们俩大概都担任 MASTER,这意味着他们无法看到彼此的 VRRP 公告(如果可以看到对方的 VRRP 公告,则其中一个人在看到更高优先级的公告后就会退出)。
检查您是否可以看到来自两个主机的多播流量(tcpdump 多播)。
编辑:废话,刚刚意识到这已经很旧了 - 但对于其他使用 keepalived 的人来说可能有用。
答案2
您将它们都设置为“state MASTER”,这可能会导致 VRRP 公告混乱,即使优先级不同。尝试将 test2 设置为“state BACKUP”。这在过去已经帮我解决了这个问题。
这也让我感觉到有些事情正在发生。
Apr 19 13:34:44 test2 Keepalived_healthcheckers: IPVS: Can't initialize ipvs: Protocol not available
我会检查 lsmod | grep ip 并确保您已经为 ipvs 加载了内核模块。
希望这可以帮助。