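I have two virtual CentOS 7 nodes, on which root can log in without a password.
I configured stonith as shown below, but the stonith devices do not start and fencing does not happen. I am new to this; can someone help me correct the problem?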
[root@node1 cluster]# pcs stonith create nub1 fence_virt pcmk_host_list="node1"
[root@node1 cluster]# pcs stonith create nub2 fence_virt pcmk_host_list="node2"
[root@node1 cluster]# pcs stonith show
nub1 (stonith:fence_virt): Stopped
nub2 (stonith:fence_virt): Stopped
[root@node1 cluster]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: node2 (version 1.1.15-11.el7_3.5-e174ec8) - partition with quorum
Last updated: Tue Jul 25 07:03:37 2017
Last change: Tue Jul 25 07:02:00 2017 by root via cibadmin on node1
2 nodes and 3 resources configured
Online: [ node1 node2 ]
Full list of resources:
ClusterIP (ocf::heartbeat:IPaddr2): Started node1
nub1 (stonith:fence_virt): Stopped
nub2 (stonith:fence_virt): Stopped
Failed Actions:
* nub1_start_0 on node1 'unknown error' (1): call=56, status=Error, exitreason='none',
last-rc-change='Tue Jul 25 07:01:34 2017', queued=0ms, exec=7006ms
* nub2_start_0 on node1 'unknown error' (1): call=58, status=Error, exitreason='none',
last-rc-change='Tue Jul 25 07:01:42 2017', queued=0ms, exec=7009ms
* nub1_start_0 on node2 'unknown error' (1): call=54, status=Error, exitreason='none',
last-rc-change='Tue Jul 25 07:01:26 2017', queued=0ms, exec=7010ms
* nub2_start_0 on node2 'unknown error' (1): call=60, status=Error, exitreason='none',
last-rc-change='Tue Jul 25 07:01:34 2017', queued=0ms, exec=7013ms
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
[root@node1 cluster]# pcs stonith fence node2
Error: unable to fence 'node2'
Command failed: No route to host
[root@node1 cluster]# pcs stonith fence nub2
Error: unable to fence 'nub2'
Command failed: No such device
[root@node1 cluster]# ping node2
PING node2 (192.168.100.102) 56(84) bytes of data.
64 bytes from node2 (192.168.100.102): icmp_seq=1 ttl=64 time=0.247 ms
64 bytes from node2 (192.168.100.102): icmp_seq=2 ttl=64 time=0.304 ms
^C
--- node2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.247/0.275/0.304/0.032 ms
Answer 1
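Take a look at the information needed for your fence agent:
# pcs stonith describe fence_virt
Without seeing the system logs, my guess is that you need to add the port= parameter to your STONITH device configuration. Its value should be the machine's name as the hypervisor knows it.
If that is not the case, this should point you in the right direction:
# grep fence_virt /var/log/messages
You will also need to add location constraints to keep these devices running on the correct nodes: the device that fences node1 (nub1) should never run on node1, and the device that fences node2 (nub2) should never run on node2.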
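As a rough sketch of the two suggestions above (setting port= and adding location constraints), something like the following could be used. The domain names node1-vm and node2-vm are hypothetical placeholders; substitute whatever `virsh list --all` reports on the hypervisor. The `run` and `fence_setup` helpers are illustrative names, not part of pcs. The script only prints the pcs commands unless RUN=1 is set, since changing fencing configuration on a live cluster deserves a review first.

```shell
#!/bin/sh
# Sketch only: prints the pcs commands by default; set RUN=1 to execute them.
# ASSUMPTION: node1-vm / node2-vm are hypothetical libvirt domain names.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

fence_setup() {
    # $1 = stonith device, $2 = cluster node it fences,
    # $3 = that node's domain name as seen by the hypervisor
    run pcs stonith update "$1" port="$3"          # tell the agent which VM to fence
    run pcs constraint location "$1" avoids "$2"   # never run the device on its own target
    run pcs stonith cleanup "$1"                   # clear the earlier failed start actions
}

fence_setup nub1 node1 node1-vm
fence_setup nub2 node2 node2-vm
```

After applying the commands, `pcs stonith show` and `pcs status` should indicate whether the devices now start; if they still fail, the grep on /var/log/messages mentioned above is the next place to look.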
Answer 2
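To use fence_virt, the physical host that runs the node VMs must run fence_virtd, which responds to incoming fencing requests from the fence_virt fence agent.
The agent communicates with the fence_virtd daemon(s) over IP multicast, so you need to make sure IP multicast works between the guests and the host. The default multicast address is 225.0.0.12 and the port is 1229.
See the detailed instructions here: https://wiki.clusterlabs.org/wiki/Guest_Fencing
Note: the instructions describe the port as TCP, but I do not think TCP can be used over multicast, so this is most likely a misunderstanding; the multicast traffic actually uses a UDP port, as is common for other multicast protocols.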
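A minimal sketch of the host-side setup and a guest-side connectivity check might look like this. The package names and the key path /etc/cluster/fence_xvm.key are assumptions based on the usual CentOS 7 fence-virt layout, so verify them against the linked guide. The `run`, `host_setup` and `guest_check` helpers are illustrative names. The script prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: prints each command instead of running it; set RUN=1 to execute.
# ASSUMPTIONS: package names and key path follow the usual CentOS 7
# fence-virt setup; check them against the linked guide.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

host_setup() {
    # On the physical host that runs the VMs.
    run yum install -y fence-virtd fence-virtd-multicast fence-virtd-libvirt
    run mkdir -p /etc/cluster
    run dd if=/dev/urandom of=/etc/cluster/fence_xvm.key bs=4k count=1
    run fence_virtd -c    # interactive: pick the multicast listener and libvirt backend
    run systemctl enable fence_virtd
    run systemctl start fence_virtd
}

guest_check() {
    # On each cluster node, after copying the key file to the same path.
    run fence_xvm -o list # should list the VMs if multicast reaches the host
}

case "${1:-}" in
    host)  host_setup ;;
    guest) guest_check ;;
esac
```

Run it with the argument `host` on the hypervisor and `guest` on each cluster node. If the list request from a guest times out, multicast between guest and host (or a firewall on port 1229) is the first thing to check.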