Pacemaker ClusterIP stops working after 15 minutes, but is still reported as running

2024-5-30

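I'm running Corosync and Pacemaker on the cman stack to serve web pages in an active-active setup, and I've hit a wall. I use the IPaddr2 resource agent to get an IP address that both nodes use at the same time, and after about 15 minutes it stops working. According to pcs the resource is still running, and the ClusterIP rule is still present in iptables, but the IP becomes unreachable. If I restart iptables, the cluster IP works for another fifteen minutes and then stops again. Here is my CIB configuration: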

Resources:
 Master: RepoDataClone
  Meta Attrs: master-max=2 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
  Resource: RepoData (class=ocf provider=linbit type=drbd)
   Attributes: drbd_resource=fstore
   Operations: start interval=0s timeout=240 (RepoData-start-timeout-240)
               promote interval=0s timeout=90 (RepoData-promote-timeout-90)
               demote interval=0s timeout=90 (RepoData-demote-timeout-90)
               stop interval=0s timeout=100 (RepoData-stop-timeout-100)
               monitor interval=30s (RepoData-monitor-interval-30s)
 Resource: UpdateService (class=lsb type=rp-doReplicate)
  Operations: monitor interval=30s (UpdateService-monitor-interval-30s)
 Clone: ClusterIP-clone
  Meta Attrs: clone-max=2 clone-node-max=2 globally-unique=true interleave=false
  Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: ip=10.140.14.1 cidr_netmask=24 clusterip_hash=sourceip nic=eth0 broadcast=10.140.14.255 arp_interval=500 arp_bg=yes
   Meta Attrs: resource-stickiness=0
   Operations: start interval=0s timeout=20s (ClusterIP-start-timeout-20s)
               stop interval=0s timeout=20s (ClusterIP-stop-timeout-20s)
               monitor interval=5s (ClusterIP-monitor-interval-5s)
 Clone: Repo-clone
  Meta Attrs: interleave=false
  Resource: Repo (class=ocf provider=heartbeat type=apache)
   Attributes: configfile=/etc/httpd/conf/httpd.conf statusurl=http://localhost:443/server-status
   Operations: start interval=0s timeout=40s (Repo-start-timeout-40s)
               stop interval=0s timeout=60s (Repo-stop-timeout-60s)
               monitor interval=30s (Repo-monitor-interval-30s)
 Clone: RepoFS-clone
  Meta Attrs: interleave=true
  Resource: RepoFS (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/drbd1 directory=/repo fstype=gfs2
   Operations: start interval=0s timeout=60 (RepoFS-start-timeout-60)
               stop interval=0s timeout=60 (RepoFS-stop-timeout-60)
               monitor interval=10s (RepoFS-monitor-interval-10s)
 Clone: dlm-clone
  Resource: dlm (class=ocf provider=pacemaker type=controld)
   Operations: start interval=0s timeout=90 (dlm-start-timeout-90)
               stop interval=0s timeout=100 (dlm-stop-timeout-100)
               monitor interval=30s (dlm-monitor-interval-30s)

Stonith Devices:
Fencing Levels:

Location Constraints:
Ordering Constraints:
  start ClusterIP-clone then start Repo-clone (kind:Mandatory) (id:order-ClusterIP-Repo-mandatory)
  promote RepoDataClone then start RepoFS-clone (kind:Mandatory) (id:order-RepoDataClone-RepoFS-mandatory)
  start RepoFS-clone then start Repo-clone (kind:Mandatory) (id:order-RepoFS-Repo-mandatory)
  start dlm-clone then start RepoFS-clone (kind:Mandatory) (id:order-dlm-RepoFS-mandatory)
  start RepoFS then start UpdateService (kind:Mandatory) (id:order-RepoFS-UpdateService-mandatory)
Colocation Constraints:
  Repo-clone with ClusterIP-clone (score:INFINITY) (id:colocation-Repo-ClusterIP-INFINITY)
  RepoFS-clone with dlm-clone (score:INFINITY) (id:colocation-RepoFS-dlm-INFINITY)
  RepoFS-clone with RepoDataClone (score:INFINITY) (id:colocation-RepoFS-RepoDataClone-INFINITY)

Cluster Properties:
 cluster-infrastructure: cman
 dc-version: 1.1.12-1.1.12+git20140723.483f48a
 default-resource-stickiness: 0
 expected-quorum-votes: 2
 last-lrm-refresh: 1432927197
 no-quorum-policy: freeze
 stonith-enabled: false

I've searched everywhere but can't find the cause. Does anyone know what is causing this or how to fix it?

Answer 1

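By default, the IPaddr2 resource agent only checks whether the IP address is configured on the node; it does not check whether the address is reachable. That is why the resource can still be reported as running while the IP is unreachable. You can confirm this by reading the agent's source: https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/IPaddr2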

If you want to monitor connectivity over ICMP, see "Moving Resources Due to Connectivity Changes" in Pacemaker Explained: http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/_moving_resources_due_to_connectivity_changes.html
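
As a rough sketch of the approach that link describes, the script below adds a cloned ocf:pacemaker:ping resource and a location rule that keeps ClusterIP-clone off any node that cannot reach the ping target. The target 10.140.14.254 is an assumed gateway address that does not appear in the question, and the `run` helper is a hypothetical dry-run wrapper: by default the script only prints the pcs commands, and it executes them only when APPLY=1 is set. This makes connectivity loss visible to Pacemaker; it does not by itself explain why the address stops answering after 15 minutes.

```shell
#!/bin/sh
# Sketch only: ICMP connectivity monitoring for the cluster in the question.
# ASSUMPTION: 10.140.14.254 is a pingable gateway on the 10.140.14.0/24 network.
PING_TARGET="${PING_TARGET:-10.140.14.254}"

# Hypothetical helper: print the command unless APPLY=1, then run it.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# Cloned ping resource; it publishes its result in the node attribute
# "pingd" (the agent's default attribute name).
run pcs resource create ping ocf:pacemaker:ping \
    dampen=5s multiplier=1000 host_list="$PING_TARGET" --clone

# Ban ClusterIP-clone from nodes where pingd is undefined or below 1,
# i.e. nodes that cannot reach the ping target.
run pcs constraint location ClusterIP-clone rule score=-INFINITY \
    pingd lt 1 or not_defined pingd
```

Run it once as-is to review the two commands, then again with APPLY=1 on one cluster node to apply them. With multiplier=1000 and a single host in host_list, pingd is 1000 while the target is reachable and 0 otherwise.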
