Pacemaker is unable to fail over

node $id="10" db10 \
    attributes standby="off"
node $id="9" db09 \
    attributes standby="off"
primitive drbd_jenkins ocf:linbit:drbd \
    params drbd_resource="r0" \
    op start interval="0s" timeout="60s" \
    op stop interval="0s" timeout="60s"
primitive jenkins lsb:jenkins \
    op monitor interval="15s" \
    op start interval="0s" timeout="90s"
primitive mount_jenkins ocf:heartbeat:Filesystem \
    params device="/dev/drbd0" directory="/var/lib/jenkins/" fstype="ext4" \
    op start timeout="20s" interval="0" \
    op stop timeout="20s" interval="0"
primitive vip-158 ocf:heartbeat:IPaddr2 \
    params ip="x.x.x.158" nic="eth0" cidr_netmask="28" \
    op start interval="0s" timeout="60s" \
    op monitor interval="5s" timeout="20s" \
    op stop interval="0s" timeout="60s" \
    meta target-role="Started"
group jenkins_group jenkins vip-158 mount_jenkins
ms ms_drbd_jenkins drbd_jenkins \
    meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" globally-unique="false" target-role="Master"
colocation drbd_mount inf: ms_drbd_jenkins:Master jenkins_group
order mount_after_drbd inf: ms_drbd_jenkins:promote jenkins_group:start
property $id="cib-bootstrap-options" \
    dc-version="1.1.10-42f2063" \
    cluster-infrastructure="corosync" \
    stonith-enabled="false" \
    last-lrm-refresh="1489005751"
rsc_defaults $id="rsc-options" \
    resource-stickiness="0"

When Pacemaker starts, everything looks fine:

root@db09:~# crm status
Last updated: Wed Mar  8 21:20:33 2017
Last change: Wed Mar  8 21:15:15 2017 via crm_resource on db10
Stack: corosync
Current DC: db10 (10) - partition with quorum
Version: 1.1.10-42f2063
2 Nodes configured
5 Resources configured



Online: [ db09 db10 ]

Master/Slave Set: ms_drbd_jenkins [drbd_jenkins]
     Masters: [ db09 ]
     Slaves: [ db10 ]
Resource Group: jenkins_group
     jenkins    (lsb:jenkins):  Started db09 
     vip-158    (ocf::heartbeat:IPaddr2):   Started db09 
     mount_jenkins  (ocf::heartbeat:Filesystem):    Started db09

But I cannot move the master to db10, with either:

crm_resource --resource ms_drbd_jenkins --move --node db10

or

crm resource migrate ms_drbd_jenkins db10

Worst of all, if I put node db09 into standby, both DRBD instances become slaves and jenkins_group does not start anywhere:

root@db09:~# crm node standby db09
root@db09:~# crm status
Last updated: Wed Mar  8 21:27:26 2017
Last change: Wed Mar  8 21:27:24 2017 via crm_attribute on db09
Stack: corosync
Current DC: db10 (10) - partition with quorum
Version: 1.1.10-42f2063
2 Nodes configured
5 Resources configured


Node db09 (9): standby
Online: [ db10 ]

 Master/Slave Set: ms_drbd_jenkins [drbd_jenkins]
     Slaves: [ db09 db10 ]

If db10 goes into standby instead, DRBD is stopped on it, which is expected:

root@db09:~# crm node standby db10
root@db09:~# crm status
Last updated: Wed Mar  8 21:28:45 2017
Last change: Wed Mar  8 21:28:44 2017 via crm_attribute on db09
Stack: corosync
Current DC: db10 (10) - partition with quorum
Version: 1.1.10-42f2063
2 Nodes configured
5 Resources configured


Node db10 (10): standby
Online: [ db09 ]

 Master/Slave Set: ms_drbd_jenkins [drbd_jenkins]
     Masters: [ db09 ]
     Stopped: [ db10 ]
 Resource Group: jenkins_group
     jenkins    (lsb:jenkins):  Started db09 
     vip-158    (ocf::heartbeat:IPaddr2):   Started db09 
     mount_jenkins  (ocf::heartbeat:Filesystem):    Started db09 

What am I doing wrong here?

Answer 1

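Your colocation constraint is incorrect. As written, it tells the cluster that DRBD must be Master wherever jenkins_group is started. In crm shell syntax, the first resource after the score is the one placed relative to the second, so jenkins_group should be listed first and follow the DRBD Master, not the other way around.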

Use the following constraints instead:

colocation cl_jenkins-with-drbd inf: jenkins_group ms_drbd_jenkins:Master
order o_drbd-before-jenkins inf: ms_drbd_jenkins:promote jenkins_group:start

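Pro tip: note the "language" in the constraint names: cl_____-with-____ and o_____-before-____. It matches the order of the resource names that follow the score (inf:). If you stick to this with/before naming convention in your constraint names, they become much easier to read, manage, and troubleshoot.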
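To make the naming tip concrete, here is a minimal Python sketch that builds both constraint lines from the resource names, so the name and the argument order cannot drift apart. The helper names `colocation` and `order` are hypothetical and not part of crmsh. The generated IDs use the full resource names, so they are longer than the hand-written IDs above.

```python
def colocation(dependent, target, target_role=None, score="inf"):
    """Build a crm shell colocation line: <dependent> is placed with <target>."""
    # Optional role suffix on the target, e.g. ms_drbd_jenkins:Master
    target_ref = f"{target}:{target_role}" if target_role else target
    name = f"cl_{dependent}-with-{target}"
    return f"colocation {name} {score}: {dependent} {target_ref}"


def order(first, then, first_action=None, then_action=None, score="inf"):
    """Build a crm shell order line: <first> acts before <then>."""
    # Optional action suffixes, e.g. ms_drbd_jenkins:promote
    first_ref = f"{first}:{first_action}" if first_action else first
    then_ref = f"{then}:{then_action}" if then_action else then
    name = f"o_{first}-before-{then}"
    return f"order {name} {score}: {first_ref} {then_ref}"


# Resource names taken from the configuration in the question.
print(colocation("jenkins_group", "ms_drbd_jenkins", target_role="Master"))
print(order("ms_drbd_jenkins", "jenkins_group", "promote", "start"))
```

The printed lines have the same shape as the two constraints above: the dependent resource comes first in the colocation, and the resource that must act first comes first in the order.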
