Can't seem to start pcs cluster (NFS cluster): disk_fencing trouble

2024-6-6

For the life of me, I can't find a clear answer on how to get my NFS active/passive cluster to start. I have two nodes, node1 and node2, and I followed the guide here: https://www.linuxtechi.com/configure-nfs-server-clustering-pacemaker-centos-7-rhel-7/

Here are my logs:

May 25 10:35:59 node1 stonith-ng[3924]:  notice: Couldn't find anyone to fence (on) node1 with any device
May 25 10:35:59 node1 stonith-ng[3924]:   error: Operation on of node1 by <no-one> for [email protected]: No route to host
May 25 10:35:59 node1 crmd[3928]:  notice: Stonith operation 142/2:72:0:f3e078bf-24f5-4160-95c1-0eeeea0e5e12: No route to host (-113)
May 25 10:35:59 node1 crmd[3928]:  notice: Stonith operation 142 for node1 failed (No route to host): aborting transition.
May 25 10:35:59 node1 crmd[3928]: warning: Too many failures (71) to fence node1, giving up
May 25 10:35:59 node1 crmd[3928]:  notice: Transition aborted: Stonith failed
May 25 10:35:59 node1 crmd[3928]:   error: Unfencing of node1 by <anyone> failed: No route to host (-113)
May 25 10:35:59 node1 stonith-ng[3924]:  notice: Couldn't find anyone to fence (on) node2 with any device
May 25 10:35:59 node1 stonith-ng[3924]:   error: Operation on of node2 by <no-one> for [email protected]: No route to host
May 25 10:35:59 node1 crmd[3928]:  notice: Stonith operation 143/1:72:0:f3e078bf-24f5-4160-95c1-0eeeea0e5e12: No route to host (-113)
May 25 10:35:59 node1 crmd[3928]:  notice: Stonith operation 143 for node2 failed (No route to host): aborting transition.
May 25 10:35:59 node1 crmd[3928]: warning: Too many failures (71) to fence node2, giving up
May 25 10:35:59 node1 crmd[3928]:   error: Unfencing of node2 by <anyone> failed: No route to host (-113)
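
To see at a glance which nodes are affected, the log excerpt above can be filtered for the unfencing errors. This is only a sketch: the function name `failed_unfencing_nodes` is made up for illustration, and the log path in the usage comment is an assumption about where syslog writes on these instances.

```shell
# Hypothetical helper: read syslog lines on stdin and print each node
# whose unfencing failed, one per line, without duplicates.
failed_unfencing_nodes() {
    sed -n 's/.*error: Unfencing of \([^ ]*\) by .* failed:.*/\1/p' | sort -u
}

# Usage (log path is an assumption):
#   failed_unfencing_nodes < /var/log/messages
```

In the excerpt above, both node1 and node2 appear in such lines, so the failure is not limited to one side of the cluster.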

Here is the status:

[root@node1 ~]# pcs status
Cluster name: nfs_cluster
Stack: corosync
Current DC: node1 (version 1.1.20-5.amzn2.0.2-3c4c782f70) - partition with quorum
Last updated: Mon May 25 10:45:56 2020
Last change: Sun May 24 21:04:55 2020 by root via cibadmin on node1

2 nodes configured
5 resources configured

Online: [ node1 node2 ]

Full list of resources:

 disk_fencing   (stonith:fence_scsi):   Stopped
 Resource Group: nfsgrp
     nfsshare   (ocf::heartbeat:Filesystem):    Stopped
     nfsd       (ocf::heartbeat:nfsserver):     Stopped
     nfsroot    (ocf::heartbeat:exportfs):      Stopped
     nfsip      (ocf::heartbeat:IPaddr2):       Stopped

Failed Fencing Actions:
* unfencing of node2 failed: delegate=, client=crmd.3928, origin=node1,
    last-failed='Mon May 25 10:35:59 2020'
* unfencing of node1 failed: delegate=, client=crmd.3928, origin=node1,
    last-failed='Mon May 25 10:35:59 2020'

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@node1 ~]#

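disk_fencing is set up with fence_scsi, but I'm not sure that is the best option for two AWS EC2 instances. Maybe disk_fencing itself can't work here, and that is why nothing starts? I can ping node1 from node2 and vice versa. Open to ideas...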
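
One way to test the hunch that the fencing device itself is the problem is to check whether the shared disk accepts SCSI-3 persistent reservations, which `fence_scsi` relies on. The sketch below is a diagnostic, not a confirmed fix. It assumes the `sg_persist` tool from sg3_utils is installed and that pcs is from the 0.9 series (where `pcs stonith show` exists; newer releases use `pcs stonith config`). The function name `check_scsi_fencing` and the device path in the usage comment are placeholders, since the post does not say which device disk_fencing points at.

```shell
# Hypothetical helper: inspect the fence_scsi resource and the shared disk.
# $1 = shared block device configured for disk_fencing (not given in the post).
check_scsi_fencing() {
    dev="${1:-}"
    if [ -z "$dev" ]; then
        echo "usage: check_scsi_fencing <device>" >&2
        return 2
    fi

    # Show how the stonith resource is configured (pcs 0.9 syntax).
    pcs stonith show disk_fencing

    # Ask the disk which persistent-reservation features it reports.
    # An error here suggests the disk may not be usable with fence_scsi.
    sg_persist --in --report-capabilities --device="$dev" || return 1

    # List the registration keys currently held on the disk.
    sg_persist --in --read-keys --device="$dev"
}

# Usage (device path is a placeholder):
#   check_scsi_fencing /dev/sdX
```

If `sg_persist` reports errors on both nodes, that would support the suspicion that `fence_scsi` is a poor fit for these instances.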
