I have a 3-node cluster setup using Pacemaker and Corosync (nodes A, B, C, with the VIP on B). When I intentionally pull the network cable on node B (as a disaster-recovery test), node A or C takes over the VIP. But when I plug the cable back into B after a while, the VIP switches back to B, which should not happen. I want A or C to keep the VIP. Here is my Pacemaker configuration:
configure
primitive baseos-ping-check ocf:pacemaker:ping params host_list="1.2.3.4" multiplier="1000" dampen="0" attempts="2" \
op start interval="0s" timeout="60s" \
op monitor interval="2s" timeout="60s" \
op stop interval="0s" timeout="60s" on-fail="ignore"
primitive baseos-vip-master ocf:heartbeat:IPaddr2 \
params ip="192.67.23.145" iflabel="MR" cidr_netmask="255.255.255.0" \
op start interval="0s" \
op monitor interval="10s" \
op stop interval="0s"
clone cl_baseos-ping-check baseos-ping-check meta interleave="true"
location loc-baseos-vip-master baseos-vip-master \
rule $id="loc-baseos-vip-master-rule" $role="master" 100: #uname eq ECS01 \
rule $id="loc-baseos-vip-master-rule-0" $role="master" -inf: not_defined pingd or pingd lte 0
property expected-quorum-votes="1"
property stonith-enabled="false"
property maintenance-mode="false"
property cluster-recheck-interval="5min"
property default-action-timeout="60s"
property pe-error-series-max="500"
property pe-input-series-max="500"
property pe-warn-series-max="500"
property no-quorum-policy="ignore"
property dc-version="1.1.16-94ff4df"
property cluster-infrastructure="corosync"
rsc_defaults resource-stickiness="150"
rsc_defaults migration-threshold="3"
commit
quit
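With resource-stickiness 150 outweighing the location preference of 100, a healthy cluster should normally keep the VIP where it is, so it can help to inspect the scores the policy engine actually computes. A diagnostic sketch using two standard Pacemaker tools (both read-only against the live cluster):

```shell
# Show the policy engine's placement scores on the live CIB
# (-L = use live cluster state, -s = print allocation scores).
crm_simulate -sL

# One-shot cluster status including node attributes, so you can
# see the pingd value each node advertises; the -INFINITY rule
# fires wherever pingd is undefined or <= 0.
crm_mon -A1
```

Comparing the scores before and after plugging B back in would show whether the VIP moved because of a preference change or because of a full stop/start recovery cycle.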
My corosync configuration is as follows:
quorum {
provider: corosync_votequorum
expected_votes : 3
}
totem {
version: 2
# How long before declaring a token lost (ms)
token: 3000
# How many token retransmits before forming a new configuration
token_retransmits_before_loss_const: 10
# How long to wait for join messages in the membership protocol (ms)
join: 60
# How long to wait for consensus to be achieved before starting a new round of membership configuration (ms)
consensus: 3600
# Turn off the virtual synchrony filter
vsftype: none
# Number of messages that may be sent by one processor on receipt of the token
max_messages: 20
# Limit generated nodeids to 31-bits (positive signed integers)
clear_node_high_bit: yes
# Enable encryption
secauth: on
# How many threads to use for encryption/decryption
threads: 0
# Optionally assign a fixed node id (integer)
# nodeid: 1234
# This specifies the mode of redundant ring, which may be none, active, or passive.
rrp_mode: none
interface {
# The following values need to be set based on your environment
ringnumber: 0
bindnetaddr: 10.98.4.0
#mcastaddr: 0.0.0.0
mcastport: 5876
member {
memberaddr: 10.98.4.103
}
member {
memberaddr: 10.98.4.173
}
}
transport: udpu
}
amf {
mode: disabled
}
service {
# Load the Pacemaker Cluster Resource Manager
ver: 0
name: pacemaker
}
aisexec {
user: root
group: root
}
logging {
fileline: off
to_stderr: yes
to_logfile: no
to_syslog: yes
syslog_facility: daemon
debug: off
timestamp: on
logger_subsys {
subsys: AMF
debug: off
tags: enter|leave|trace1|trace2|trace3|trace4|trace6
}
}
My cib.xml looks like this:
<cib crm_feature_set="3.0.11" validate-with="pacemaker-2.6" epoch="4" num_updates="0" admin_epoch="0" cib-last-written="Wed Sep 11 14:33:08 2019" update-origin="testje" update-client="cibadmin" update-user="root" have-quorum="1" dc-uuid="183387233">
<configuration>
<crm_config>
<cluster_property_set id="cib-bootstrap-options">
<nvpair id="cib-bootstrap-options-have-watchdog" name="have-watchdog" value="false"/>
<nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.1.16-94ff4df"/>
<nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="corosync"/>
<nvpair name="expected-quorum-votes" value="1" id="cib-bootstrap-options-expected-quorum-votes"/>
<nvpair name="stonith-enabled" value="false" id="cib-bootstrap-options-stonith-enabled"/>
<nvpair name="maintenance-mode" value="false" id="cib-bootstrap-options-maintenance-mode"/>
<nvpair name="cluster-recheck-interval" value="5min" id="cib-bootstrap-options-cluster-recheck-interval"/>
<nvpair name="default-action-timeout" value="60s" id="cib-bootstrap-options-default-action-timeout"/>
<nvpair name="pe-error-series-max" value="500" id="cib-bootstrap-options-pe-error-series-max"/>
<nvpair name="pe-input-series-max" value="500" id="cib-bootstrap-options-pe-input-series-max"/>
<nvpair name="pe-warn-series-max" value="500" id="cib-bootstrap-options-pe-warn-series-max"/>
<nvpair name="no-quorum-policy" value="ignore" id="cib-bootstrap-options-no-quorum-policy"/>
</cluster_property_set>
</crm_config>
<nodes>
<node id="183387233" uname="testje"/>
<node id="183387244" uname="0060E057BC25"/>
<node id="183387230" uname="d66c4b1997a0"/>
</nodes>
<resources>
<primitive id="baseos-vip-master" class="ocf" provider="heartbeat" type="IPaddr2">
<instance_attributes id="baseos-vip-master-instance_attributes">
<nvpair name="ip" value="10.238.68.134" id="baseos-vip-master-instance_attributes-ip"/>
<nvpair name="iflabel" value="MR" id="baseos-vip-master-instance_attributes-iflabel"/>
<nvpair name="cidr_netmask" value="24" id="baseos-vip-master-instance_attributes-cidr_netmask"/>
</instance_attributes>
<operations>
<op name="start" interval="0s" id="baseos-vip-master-start-0s"/>
<op name="monitor" interval="10s" id="baseos-vip-master-monitor-10s"/>
<op name="stop" interval="0s" id="baseos-vip-master-stop-0s"/>
</operations>
</primitive>
<clone id="cl_baseos-ping-check">
<meta_attributes id="cl_baseos-ping-check-meta_attributes">
<nvpair name="interleave" value="true" id="cl_baseos-ping-check-meta_attributes-interleave"/>
</meta_attributes>
<primitive id="baseos-ping-check" class="ocf" provider="pacemaker" type="ping">
<instance_attributes id="baseos-ping-check-instance_attributes">
<nvpair name="host_list" value="10.238.68.1" id="baseos-ping-check-instance_attributes-host_list"/>
<nvpair name="multiplier" value="1000" id="baseos-ping-check-instance_attributes-multiplier"/>
<nvpair name="dampen" value="0" id="baseos-ping-check-instance_attributes-dampen"/>
<nvpair name="attempts" value="2" id="baseos-ping-check-instance_attributes-attempts"/>
</instance_attributes>
<operations>
<op name="start" interval="0s" timeout="60s" id="baseos-ping-check-start-0s"/>
<op name="monitor" interval="2s" timeout="60s" id="baseos-ping-check-monitor-2s"/>
<op name="stop" interval="0s" timeout="60s" on-fail="ignore" id="baseos-ping-check-stop-0s"/>
</operations>
</primitive>
</clone>
</resources>
<constraints>
<rsc_location id="loc-baseos-vip-master" rsc="baseos-vip-master">
<rule id="loc-baseos-vip-master-rule" role="master" score="100">
<expression attribute="#uname" operation="eq" value="testje" id="loc-baseos-vip-master-rule-expression"/>
</rule>
<rule id="loc-baseos-vip-master-rule-0" role="master" score="-INFINITY" boolean-op="or">
<expression attribute="pingd" operation="not_defined" id="loc-baseos-vip-master-rule-0-expression"/>
<expression attribute="pingd" operation="lte" value="0" id="loc-baseos-vip-master-rule-0-expression-0"/>
</rule>
</rsc_location>
</constraints>
<rsc_defaults>
<meta_attributes id="rsc-options">
<nvpair name="resource-stickiness" value="150" id="rsc-options-resource-stickiness"/>
<nvpair name="migration-threshold" value="3" id="rsc-options-migration-threshold"/>
</meta_attributes>
</rsc_defaults>
</configuration>
</cib>
The situation described above only happens when I pull a node's network cable to take it offline. If I instead reboot that node (i.e. B), the VIP stays on the current node, i.e. A or C.
One thing I noticed is that when I plug node B's network cable back in, the IPaddr2 resource calls findif, which fails because I am not using the nic name parameter. I did provide cidr_netmask, though, so ideally findif should be able to resolve node B's interface from the IP address. Is there any way to avoid the findif failure?
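If findif cannot infer the interface from the routing table (which may be stale while B's link was down), one possible workaround is to pin the interface explicitly via IPaddr2's nic parameter. A sketch using the crm shell; the interface name eth0 is an assumption, substitute the actual NIC on your nodes:

```shell
# Hypothetical fix: set an explicit nic on the existing resource so
# IPaddr2 no longer relies on findif to guess the interface.
# "eth0" is a placeholder for your real interface name.
crm resource param baseos-vip-master set nic eth0
```

Note also that cidr_netmask conventionally takes a prefix length (24) rather than a dotted mask, which is what your cib.xml already shows.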
Answer 1
As mentioned in the comments under your question: when the node is plugged back into the network, it rejoins the cluster and the cluster discovers the VIP running on multiple nodes, so it must recover the service (stop the VIP everywhere, then start it again), and it happened to pick node B.
In a production cluster you would use fencing/STONITH and would not ignore quorum. With that configuration, when you unplug node B from the network, an out-of-band STONITH agent forces node B to power off, so node B rejoins the cluster in a "clean state" with no services running, and the VIP does not fail back to node B.
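The remediation above can be sketched in crm shell terms. This is only an illustration: the fencing agent and all of its parameters below are placeholders, since the right agent depends entirely on what out-of-band power control (IPMI, iLO, etc.) your nodes actually have:

```shell
# Sketch only: register a fencing device for node B and stop
# ignoring quorum. external/ipmi and its parameter values are
# placeholders for your real out-of-band management interface.
crm configure primitive fence-nodeB stonith:external/ipmi \
  params hostname="nodeB" ipaddr="10.0.0.99" userid="admin" passwd="secret" interface="lan"
crm configure property stonith-enabled=true
crm configure property no-quorum-policy=stop
```

With fencing in place, a node that loses its network is powered off rather than left running with the VIP still configured, which removes the duplicate-VIP recovery cycle that caused the failback.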