Pacemaker fencing not triggered

I have 2 nodes:
- patroni1: 192.168.1.38
- patroni2: 192.168.1.39

and a virtual IP: 192.168.1.40

I installed HAProxy on both nodes.

Here is the pcs status when the VIP is attached to patroni2 and haproxy is active on patroni2:

-----------
[root@patroni1 ~]# pcs status
Cluster name: haproxy_cluster
Stack: corosync
Current DC: patroni2 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Thu Nov 29 21:29:00 2018
Last change: Thu Nov 29 21:24:52 2018 by root via cibadmin on patroni1

2 nodes configured
4 resources configured

Online: [ patroni1 patroni2 ]

Full list of resources:

 xen-fencing-patroni2   (stonith:fence_xenapi): Started patroni1
 xen-fencing-patroni1   (stonith:fence_xenapi): Started patroni2
 Resource Group: HAproxyGroup
     haproxy    (ocf::heartbeat:haproxy):   Started patroni2
     VIP    (ocf::heartbeat:IPaddr2):   Started patroni2

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@patroni1 ~]# pcs resource show VIP
 Resource: VIP (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: cidr_netmask=24 ip=192.168.1.40
  Operations: monitor interval=1s (VIP-monitor-interval-1s)
              start interval=0s timeout=20s (VIP-start-interval-0s)
              stop interval=0s timeout=20s (VIP-stop-interval-0s)
[root@patroni1 ~]# pcs resource show haproxy
 Resource: haproxy (class=ocf provider=heartbeat type=haproxy)
  Attributes: binpath=/usr/sbin/haproxy conffile=/etc/haproxy/haproxy.cfg
  Operations: monitor interval=10s (haproxy-monitor-interval-10s)
              start interval=0s timeout=20s (haproxy-start-interval-0s)
              stop interval=0s timeout=20s (haproxy-stop-interval-0s)

-----------

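My problem is that whenever I manually kill haproxy on patroni2, fencing is not triggered. Fencing is triggered only when I manually stop or reboot patroni2.

Here is the pcs status right after I manually killed haproxy: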

------------
[root@patroni1 ~]# pcs status
Cluster name: haproxy_cluster
Stack: corosync
Current DC: patroni2 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Thu Nov 29 21:37:37 2018
Last change: Thu Nov 29 21:24:52 2018 by root via cibadmin on patroni1

2 nodes configured
4 resources configured

Online: [ patroni1 patroni2 ]

Full list of resources:

 xen-fencing-patroni2   (stonith:fence_xenapi): Started patroni1
 xen-fencing-patroni1   (stonith:fence_xenapi): Started patroni2
 Resource Group: HAproxyGroup
     haproxy    (ocf::heartbeat:haproxy):   Started patroni2
     VIP    (ocf::heartbeat:IPaddr2):   Starting patroni2

Failed Actions:
* haproxy_monitor_10000 on patroni2 'not running' (7): call=38, status=complete, exitreason='',
    last-rc-change='Thu Nov 29 21:37:36 2018', queued=0ms, exec=0ms


Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

------------

How can I trigger fencing when HAProxy is not responding?

Sincerely, -bino-

Answer 1

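What you are observing is the expected behavior. A resource that has stopped running does not mean that the best course of action is to forcibly power off the system.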

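You killed HAProxy manually. Pacemaker detected that the service was not running for some reason and recorded the failure: haproxy_monitor_10000 on patroni2 'not running' [...]. The cluster then restarted the service. I assume that this succeeded, because the cluster now shows the service running normally on the same patroni2 node.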
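To check that the monitor failure was handled by an in-place restart, the recorded failure can be inspected and then cleared. This is a minimal sketch, assuming the resource name haproxy from the output above and the pcs 0.9.x command set shipped with EL7; other pcs versions may word these subcommands differently.

```shell
# Show how many times the haproxy resource has failed on each node.
pcs resource failcount show haproxy

# After investigating, clear the failure history so that the
# "Failed Actions" section disappears from pcs status.
pcs resource cleanup haproxy
```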

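A failed monitor operation is not considered a fatal failure, so it is not escalated to a STONITH action. A failed stop operation, however, is considered fatal. If the cluster cannot stop a resource, how can it restart it or fail it over? By fencing the node, that is, power-cycling it through STONITH.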
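If fencing on a failed monitor really is what you want, Pacemaker allows the reaction to a failure to be set per operation through the on-fail option. The default is restart for monitor, and fence for stop when STONITH is enabled. The following is a sketch only, reusing the resource name and the 10s monitor interval from the question; try it on a test cluster first, because every HAProxy crash will then power-cycle the node.

```shell
# Escalate a failed haproxy monitor to fencing the node
# (the default for monitor operations is on-fail=restart).
pcs resource update haproxy op monitor interval=10s on-fail=fence

# Gentler alternative (assumption: failover is acceptable instead of fencing):
# move the resource away from the node after its first failure there.
pcs resource meta haproxy migration-threshold=1
```

The second command avoids the power cycle: once haproxy fails on patroni2, the cluster should stop placing it there, and because VIP is in the same group it should follow to patroni1.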
