PCS cluster status


I am trying to deploy an HA NFS cluster on Rocky Linux 8.5 using PCS. The current kernel and NFS-related package versions, along with the pcs configuration details, are shown below.

I cannot declare a specific IP for the NFS daemons (rpc.statd, rpc.mountd, etc.) to bind to. No matter what I do, the services remain bound to 0.0.0.0:$default-ports.
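To make the symptom concrete, this is how the wildcard binds show up (a quick check, equivalent to the netstat output further down; the port list just picks out rpcbind, nfsd and mountd):

```shell
# List TCP listeners and their bound local addresses; the NFS-related
# ports (111, 2049, 20048) all show the wildcard address 0.0.0.0
# instead of the VIP 10.1.31.100.
ss -tln | awk 'NR==1 || $4 ~ /:(111|2049|20048)$/'
```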

I want to start a separate "ocf:heartbeat:nfsserver" for each NFS (block) resource group, together with a dedicated VirtualIP resource. When I declare a second NFS share resource group on the same cluster node (I plan to have more NFS shares than cluster nodes), the "ocf:heartbeat:nfsserver" resources block each other; eventually one wins and the other ends up in a "blocked" state.
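For completeness, the existing group was built with pcs commands along these lines (reconstructed from the configuration dump further down; a sketch, not the literal shell history):

```shell
# Reconstructed from "pcs resource config" below; timeouts omitted.
pcs resource create ROOT-FS_SHARE ocf:heartbeat:Filesystem \
    device=/dev/disk/by-id/wwn-0x6001405ce6b7033688d497a91aa23547 \
    directory=/srv/block/SHARE fstype=xfs --group Group_SHARE
pcs resource create NFSD_SHARE ocf:heartbeat:nfsserver \
    nfs_ip=10.1.31.100 nfs_no_notify=true \
    nfs_shared_infodir=/srv/block/SHARE/nfsinfo/ --group Group_SHARE
pcs resource create NFS_SHARE ocf:heartbeat:exportfs \
    clientspec=10.1.31.0/255.255.255.0 directory=/srv/block/SHARE/SHARE \
    fsid=0 options=rw,sync,no_root_squash --group Group_SHARE
pcs resource create NFS-IP_SHARE ocf:heartbeat:IPaddr2 \
    ip=10.1.31.100 cidr_netmask=24 nic=team31 --group Group_SHARE
pcs resource create NFS-NOTIFY_SHARE ocf:heartbeat:nfsnotify \
    source_host=SHARE.local --group Group_SHARE
```

A second group would repeat the same pattern with its own block device, export directory and VIP; that is where the two nfsserver resources collide.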

[root@node1 ~]# uname -a
Linux node1.local 4.18.0-348.12.2.el8_5.x86_64 #1 SMP Wed Jan 19 17:53:40 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
[root@node1 ~]# rpm -qa nfs* rpc*
nfs-utils-2.3.3-46.el8.x86_64
rpcbind-1.2.5-8.el8.x86_64
[root@node1 ~]# 

PCS cluster status

[root@node1 ~]#  pcs status
Cluster name: cluster01
Cluster Summary:
  * Stack: corosync
  * Current DC: node5 (version 2.1.0-8.el8-7c3f660707) - partition with quorum
  * Last updated: Thu Mar 24 13:10:09 2022
  * Last change:  Thu Mar 24 13:03:48 2022 by root via crm_resource on node3
  * 5 nodes configured
  * 5 resource instances configured

Node List:
  * Online: [ node1 node2 node3 node4 node5 ]

Full List of Resources:
  * Resource Group: Group_SHARE:
    * ROOT-FS_SHARE (ocf::heartbeat:Filesystem):     Started node2
    * NFSD_SHARE    (ocf::heartbeat:nfsserver):  Started node2
    * NFS_SHARE (ocf::heartbeat:exportfs):   Started node2
    * NFS-IP_SHARE  (ocf::heartbeat:IPaddr2):    Started node2
    * NFS-NOTIFY_SHARE  (ocf::heartbeat:nfsnotify):  Started node2

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@node1 ~]# 

PCS Resource Config output

[root@node2 ~]# pcs resource config
 Group: Group_SHARE
  Resource: ROOT-FS_SHARE (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/disk/by-id/wwn-0x6001405ce6b7033688d497a91aa23547 directory=/srv/block/SHARE fstype=xfs
   Operations: monitor interval=20s timeout=40s (ROOT-FS_SHARE-monitor-interval-20s)
               start interval=0s timeout=60s (ROOT-FS_SHARE-start-interval-0s)
               stop interval=0s timeout=60s (ROOT-FS_SHARE-stop-interval-0s)
  Resource: NFSD_SHARE (class=ocf provider=heartbeat type=nfsserver)
   Attributes: nfs_ip=10.1.31.100 nfs_no_notify=true nfs_shared_infodir=/srv/block/SHARE/nfsinfo/
   Operations: monitor interval=10s timeout=20s (NFSD_SHARE-monitor-interval-10s)
               start interval=0s timeout=40s (NFSD_SHARE-start-interval-0s)
               stop interval=0s timeout=20s (NFSD_SHARE-stop-interval-0s)
  Resource: NFS_SHARE (class=ocf provider=heartbeat type=exportfs)
   Attributes: clientspec=10.1.31.0/255.255.255.0 directory=/srv/block/SHARE/SHARE fsid=0 options=rw,sync,no_root_squash
   Operations: monitor interval=10s timeout=20s (NFS_SHARE-monitor-interval-10s)
               start interval=0s timeout=40s (NFS_SHARE-start-interval-0s)
               stop interval=0s timeout=120s (NFS_SHARE-stop-interval-0s)
  Resource: NFS-IP_SHARE (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: cidr_netmask=24 ip=10.1.31.100 nic=team31
   Operations: monitor interval=30s (NFS-IP_SHARE-monitor-interval-30s)
               start interval=0s timeout=20s (NFS-IP_SHARE-start-interval-0s)
               stop interval=0s timeout=20s (NFS-IP_SHARE-stop-interval-0s)
  Resource: NFS-NOTIFY_SHARE (class=ocf provider=heartbeat type=nfsnotify)
   Attributes: source_host=SHARE.local
   Operations: monitor interval=30s timeout=90s (NFS-NOTIFY_SHARE-monitor-interval-30s)
               reload interval=0s timeout=90s (NFS-NOTIFY_SHARE-reload-interval-0s)
               start interval=0s timeout=90s (NFS-NOTIFY_SHARE-start-interval-0s)
               stop interval=0s timeout=90s (NFS-NOTIFY_SHARE-stop-interval-0s)
[root@node2 ~]#

The virtual IP is bound successfully on node2

[root@node2 ~]# ip -4 addr show team31
6: team31: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default qlen 1000
    inet 10.1.31.2/24 brd 10.1.31.255 scope global noprefixroute team31
       valid_lft forever preferred_lft forever
    inet 10.1.31.100/24 brd 10.1.31.255 scope global secondary team31
       valid_lft forever preferred_lft forever
[root@node2 ~]#

TCP LISTEN bindings

[root@node2 ~]# netstat -punta | grep LISTEN
tcp        0      0 0.0.0.0:2049            0.0.0.0:*               LISTEN      -                   
tcp        0      0 127.0.0.1:44321         0.0.0.0:*               LISTEN      1803/pmcd           
tcp        0      0 0.0.0.0:34661           0.0.0.0:*               LISTEN      630273/rpc.statd    
tcp        0      0 127.0.0.1:199           0.0.0.0:*               LISTEN      1257/snmpd            
tcp        0      0 127.0.0.1:4330          0.0.0.0:*               LISTEN      2834/pmlogger       
tcp        0      0 10.20.101.136:2379      0.0.0.0:*               LISTEN      3285/etcd           
tcp        0      0 10.20.101.136:2380      0.0.0.0:*               LISTEN      3285/etcd           
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      1/systemd           
tcp        0      0 0.0.0.0:20048           0.0.0.0:*               LISTEN      630282/rpc.mountd   
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      317707/nginx: maste 
tcp        0      0 0.0.0.0:2224            0.0.0.0:*               LISTEN      3725/platform-pytho 
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1170/sshd           
tcp        0      0 0.0.0.0:41017           0.0.0.0:*               LISTEN      -                   
tcp        0      0 0.0.0.0:443             0.0.0.0:*               LISTEN      317707/nginx: maste 
tcp        0      0 0.0.0.0:35261           0.0.0.0:*               LISTEN      -                   
tcp6       0      0 :::2049                 :::*                    LISTEN      -                   
tcp6       0      0 ::1:44321               :::*                    LISTEN      1803/pmcd           
tcp6       0      0 ::1:4330                :::*                    LISTEN      2834/pmlogger       
tcp6       0      0 :::111                  :::*                    LISTEN      1/systemd           
tcp6       0      0 :::20048                :::*                    LISTEN      630282/rpc.mountd   
tcp6       0      0 :::2224                 :::*                    LISTEN      3725/platform-pytho 
tcp6       0      0 :::37329                :::*                    LISTEN      -                   
tcp6       0      0 :::22                   :::*                    LISTEN      1170/sshd           
tcp6       0      0 :::41179                :::*                    LISTEN      630273/rpc.statd    
tcp6       0      0 :::43487                :::*                    LISTEN      -                   
[root@node2 ~]#

PCS cluster resources (cib.xml format, in case you need to dig deeper)

<resources>
  <group id="Group_SHARE">
    <primitive class="ocf" id="ROOT-FS_SHARE" provider="heartbeat" type="Filesystem">
      <instance_attributes id="ROOT-FS_SHARE-instance_attributes">
        <nvpair id="ROOT-FS_SHARE-instance_attributes-device" name="device" value="/dev/disk/by-id/wwn-0x6001405ce6b7033688d497a91aa23547"/>
        <nvpair id="ROOT-FS_SHARE-instance_attributes-directory" name="directory" value="/srv/block/SHARE"/>
        <nvpair id="ROOT-FS_SHARE-instance_attributes-fstype" name="fstype" value="xfs"/>
      </instance_attributes>
      <operations>
        <op id="ROOT-FS_SHARE-monitor-interval-20s" interval="20s" name="monitor" timeout="40s"/>
        <op id="ROOT-FS_SHARE-start-interval-0s" interval="0s" name="start" timeout="60s"/>
        <op id="ROOT-FS_SHARE-stop-interval-0s" interval="0s" name="stop" timeout="60s"/>
      </operations>
    </primitive>
    <primitive class="ocf" id="NFSD_SHARE" provider="heartbeat" type="nfsserver">
      <instance_attributes id="NFSD_SHARE-instance_attributes">
        <nvpair id="NFSD_SHARE-instance_attributes-nfs_ip" name="nfs_ip" value="10.1.31.100"/>
        <nvpair id="NFSD_SHARE-instance_attributes-nfs_no_notify" name="nfs_no_notify" value="true"/>
        <nvpair id="NFSD_SHARE-instance_attributes-nfs_shared_infodir" name="nfs_shared_infodir" value="/srv/block/SHARE/nfsinfo/"/>
      </instance_attributes>
      <operations>
        <op id="NFSD_SHARE-monitor-interval-10s" interval="10s" name="monitor" timeout="20s"/>
        <op id="NFSD_SHARE-start-interval-0s" interval="0s" name="start" timeout="40s"/>
        <op id="NFSD_SHARE-stop-interval-0s" interval="0s" name="stop" timeout="20s"/>
      </operations>
    </primitive>
    <primitive class="ocf" id="NFS_SHARE" provider="heartbeat" type="exportfs">
      <instance_attributes id="NFS_SHARE-instance_attributes">
        <nvpair id="NFS_SHARE-instance_attributes-clientspec" name="clientspec" value="10.1.31.0/255.255.255.0"/>
        <nvpair id="NFS_SHARE-instance_attributes-directory" name="directory" value="/srv/block/SHARE/SHARE"/>
        <nvpair id="NFS_SHARE-instance_attributes-fsid" name="fsid" value="0"/>
        <nvpair id="NFS_SHARE-instance_attributes-options" name="options" value="rw,sync,no_root_squash"/>
      </instance_attributes>
      <operations>
        <op id="NFS_SHARE-monitor-interval-10s" interval="10s" name="monitor" timeout="20s"/>
        <op id="NFS_SHARE-start-interval-0s" interval="0s" name="start" timeout="40s"/>
        <op id="NFS_SHARE-stop-interval-0s" interval="0s" name="stop" timeout="120s"/>
      </operations>
    </primitive>
    <primitive class="ocf" id="NFS-IP_SHARE" provider="heartbeat" type="IPaddr2">
      <instance_attributes id="NFS-IP_SHARE-instance_attributes">
        <nvpair id="NFS-IP_SHARE-instance_attributes-cidr_netmask" name="cidr_netmask" value="24"/>
        <nvpair id="NFS-IP_SHARE-instance_attributes-ip" name="ip" value="10.1.31.100"/>
        <nvpair id="NFS-IP_SHARE-instance_attributes-nic" name="nic" value="team31"/>
      </instance_attributes>
      <operations>
        <op id="NFS-IP_SHARE-monitor-interval-30s" interval="30s" name="monitor"/>
        <op id="NFS-IP_SHARE-start-interval-0s" interval="0s" name="start" timeout="20s"/>
        <op id="NFS-IP_SHARE-stop-interval-0s" interval="0s" name="stop" timeout="20s"/>
      </operations>
    </primitive>
    <primitive class="ocf" id="NFS-NOTIFY_SHARE" provider="heartbeat" type="nfsnotify">
      <instance_attributes id="NFS-NOTIFY_SHARE-instance_attributes">
        <nvpair id="NFS-NOTIFY_SHARE-instance_attributes-source_host" name="source_host" value="SHARE.local"/>
      </instance_attributes>
      <operations>
        <op id="NFS-NOTIFY_SHARE-monitor-interval-30s" interval="30s" name="monitor" timeout="90s"/>
        <op id="NFS-NOTIFY_SHARE-reload-interval-0s" interval="0s" name="reload" timeout="90s"/>
        <op id="NFS-NOTIFY_SHARE-start-interval-0s" interval="0s" name="start" timeout="90s"/>
        <op id="NFS-NOTIFY_SHARE-stop-interval-0s" interval="0s" name="stop" timeout="90s"/>
      </operations>
    </primitive>
  </group>
</resources>

Edit 1

It appears that the OCF (Open Cluster Framework) nfsserver resource agent does not use the nfs_ip field for "rpc.nfsd -H $nfs_ip" at all. On Rocky Linux 8.5 this resource also does not let us override the default behaviour of enabling every supported NFS version. Rocky Linux 8.5 ships resource-agents-4.1.1-98.el8_5.2.x86_64.
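For reference, nfs-utils 2.x does expose the binding that "rpc.nfsd -H" would give persistently through /etc/nfs.conf; a fragment like the following (values illustrative, and outside the agent's control) is roughly what the agent would need to manage per group:

```
# /etc/nfs.conf fragment (illustrative only; ocf:heartbeat:nfsserver
# does not write this, so two groups on one node would still clash)
[nfsd]
host=10.1.31.100
threads=8
```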

I will try to work around the problem by defining a custom pcs systemd resource [email protected]
