For whatever reason, I am no longer able to move resources with pcs.
pacemaker-1.1.16-12.el7_4.8.x86_64
corosync-2.4.0-9.el7_4.2.x86_64
pcs-0.9.158-6.el7.centos.1.x86_64
Linux server_a.test.local 3.10.0-693.el7.x86_64
I have 4 resources configured as part of a resource group. This is the log of what happens when I try to move the ClusterIP resource from server_d to server_a with pcs resource move ClusterIP server_a.test.local:
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_process_request: Forwarding cib_delete operation for section constraints to all (origin=local/crm_resource/3)
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: Diff: --- 0.24.0 2
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: Diff: +++ 0.25.0 (null)
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: -- /cib/configuration/constraints/rsc_location[@id='cli-prefer-ClusterIP']
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: + /cib: @epoch=25
Apr 06 12:16:26 [17292] server_d.test.local crmd: info: abort_transition_graph: Transition aborted by deletion of rsc_location[@id='cli-prefer-ClusterIP']: Configuration change | cib=0.25.0 source=te_update_diff:456 path=/cib/configuration/constraints/rsc_location[@id='cli-prefer-ClusterIP'] complete=true
Apr 06 12:16:26 [17292] server_d.test.local crmd: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE | input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_process_request: Completed cib_delete operation for section constraints: OK (rc=0, origin=server_d.test.local/crm_resource/3, version=0.25.0)
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: determine_online_status: Node server_a.test.local is online
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: determine_online_status: Node server_d.test.local is online
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 1 is already processed
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 2 is already processed
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 1 is already processed
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 2 is already processed
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: group_print: Resource Group: my_app
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: common_print: ClusterIP (ocf::heartbeat:IPaddr2): Started server_d.test.local
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: common_print: Apache (systemd:httpd): Started server_d.test.local
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: common_print: stunnel (systemd:stunnel-my_app): Started server_d.test.local
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: common_print: my_app-daemon (systemd:my_app): Started server_d.test.local
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: LogActions: Leave ClusterIP (Started server_d.test.local)
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: LogActions: Leave Apache (Started server_d.test.local)
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: LogActions: Leave stunnel (Started server_d.test.local)
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: LogActions: Leave my_app-daemon (Started server_d.test.local)
Apr 06 12:16:26 [17291] server_d.test.local pengine: notice: process_pe_message: Calculated transition 8, saving inputs in /var/lib/pacemaker/pengine/pe-input-18.bz2
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_process_request: Forwarding cib_modify operation for section constraints to all (origin=local/crm_resource/4)
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: Diff: --- 0.25.0 2
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: Diff: +++ 0.26.0 (null)
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: + /cib: @epoch=26
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: ++ /cib/configuration/constraints: <rsc_location id="cli-prefer-ClusterIP" rsc="ClusterIP" role="Started" node="server_a.test.local" score="INFINITY"/>
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_process_request: Completed cib_modify operation for section constraints: OK (rc=0, origin=server_d.test.local/crm_resource/4, version=0.26.0)
Apr 06 12:16:26 [17292] server_d.test.local crmd: info: abort_transition_graph: Transition aborted by rsc_location.cli-prefer-ClusterIP 'create': Configuration change | cib=0.26.0 source=te_update_diff:456 path=/cib/configuration/constraints complete=true
Apr 06 12:16:26 [17292] server_d.test.local crmd: info: handle_response: pe_calc calculation pe_calc-dc-1523016986-67 is obsolete
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: determine_online_status: Node server_a.test.local is online
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: determine_online_status: Node server_d.test.local is online
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 1 is already processed
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 2 is already processed
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 1 is already processed
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 2 is already processed
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: group_print: Resource Group: my_app
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: common_print: ClusterIP (ocf::heartbeat:IPaddr2): Started server_d.test.local
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: common_print: Apache (systemd:httpd): Started server_d.test.local
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: common_print: stunnel (systemd:stunnel-my_app): Started server_d.test.local
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: common_print: my_app-daemon (systemd:my_app): Started server_d.test.local
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: LogActions: Leave ClusterIP (Started server_d.test.local)
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: LogActions: Leave Apache (Started server_d.test.local)
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: LogActions: Leave stunnel (Started server_d.test.local)
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: LogActions: Leave my_app-daemon (Started server_d.test.local)
Apr 06 12:16:27 [17291] server_d.test.local pengine: notice: process_pe_message: Calculated transition 9, saving inputs in /var/lib/pacemaker/pengine/pe-input-19.bz2
Apr 06 12:16:27 [17292] server_d.test.local crmd: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE | input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response
Apr 06 12:16:27 [17292] server_d.test.local crmd: info: do_te_invoke: Processing graph 9 (ref=pe_calc-dc-1523016987-68) derived from /var/lib/pacemaker/pengine/pe-input-19.bz2
Apr 06 12:16:27 [17292] server_d.test.local crmd: notice: run_graph: Transition 9 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-19.bz2): Complete
Apr 06 12:16:27 [17292] server_d.test.local crmd: info: do_log: Input I_TE_SUCCESS received in state S_TRANSITION_ENGINE from notify_crmd
Apr 06 12:16:27 [17292] server_d.test.local crmd: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE | input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd
Apr 06 12:16:27 [17287] server_d.test.local cib: info: cib_file_backup: Archived previous version as /var/lib/pacemaker/cib/cib-34.raw
Apr 06 12:16:27 [17287] server_d.test.local cib: info: cib_file_write_with_digest: Wrote version 0.25.0 of the CIB to disk (digest: 7511cba55b6c2f2f481a51d5585b8d36)
Apr 06 12:16:27 [17287] server_d.test.local cib: info: cib_file_write_with_digest: Reading cluster configuration file /var/lib/pacemaker/cib/cib.tPIv7m (digest: /var/lib/pacemaker/cib/cib.OwHiKz)
Apr 06 12:16:27 [17287] server_d.test.local cib: info: cib_file_backup: Archived previous version as /var/lib/pacemaker/cib/cib-35.raw
Apr 06 12:16:27 [17287] server_d.test.local cib: info: cib_file_write_with_digest: Wrote version 0.26.0 of the CIB to disk (digest: 7f962ed676a49e84410eee2ee04bae8c)
Apr 06 12:16:27 [17287] server_d.test.local cib: info: cib_file_write_with_digest: Reading cluster configuration file /var/lib/pacemaker/cib/cib.MnRP4u (digest: /var/lib/pacemaker/cib/cib.B5sWNH)
Apr 06 12:16:31 [17287] server_d.test.local cib: info: cib_process_ping: Reporting our current digest to server_d.test.local: 8182592cb4922cbf007158ab0a277190 for 0.26.0 (0x5575234afde0 0)
It is worth noting that if I run pcs cluster stop server_b.test.local, all of the resources in the configured group do move to the other node.
What is going on here? As I said, this used to work, and no changes have been made since then.
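For reference, this is roughly the sequence of commands involved (output trimmed; run on server_d, where the DC logs above come from):

[root@server_d ~]# pcs resource move ClusterIP server_a.test.local   # creates the cli-prefer-ClusterIP constraint, but nothing moves
[root@server_d ~]# pcs status                                        # ClusterIP is still reported as Started on server_d.test.local
[root@server_d ~]# pcs constraint --full                             # the cli-prefer-* location constraints are all still there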
Thanks in advance!
Edit:
pcs config
[root@server_a ~]# pcs config
Cluster Name: my_app_cluster
Corosync Nodes:
server_a.test.local server_d.test.local
Pacemaker Nodes:
server_a.test.local server_d.test.local
Resources:
Group: my_app
Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2)
Attributes: cidr_netmask=24 ip=10.116.63.49
Operations: monitor interval=10s timeout=20s (ClusterIP-monitor-interval-10s)
start interval=0s timeout=20s (ClusterIP-start-interval-0s)
stop interval=0s timeout=20s (ClusterIP-stop-interval-0s)
Resource: Apache (class=systemd type=httpd)
Operations: monitor interval=60 timeout=100 (Apache-monitor-interval-60)
start interval=0s timeout=100 (Apache-start-interval-0s)
stop interval=0s timeout=100 (Apache-stop-interval-0s)
Resource: stunnel (class=systemd type=stunnel-my_app)
Operations: monitor interval=60 timeout=100 (stunnel-monitor-interval-60)
start interval=0s timeout=100 (stunnel-start-interval-0s)
stop interval=0s timeout=100 (stunnel-stop-interval-0s)
Resource: my_app-daemon (class=systemd type=my_app)
Operations: monitor interval=60 timeout=100 (my_app-daemon-monitor-interval-60)
start interval=0s timeout=100 (my_app-daemon-start-interval-0s)
stop interval=0s timeout=100 (my_app-daemon-stop-interval-0s)
Stonith Devices:
Fencing Levels:
Location Constraints:
Resource: Apache
Enabled on: server_d.test.local (score:INFINITY) (role: Started) (id:cli-prefer-Apache)
Resource: ClusterIP
Enabled on: server_a.test.local (score:INFINITY) (role: Started) (id:cli-prefer-ClusterIP)
Resource: my_app-daemon
Enabled on: server_a.test.local (score:INFINITY) (role: Started) (id:cli-prefer-my_app-daemon)
Resource: stunnel
Enabled on: server_a.test.local (score:INFINITY) (role: Started) (id:cli-prefer-stunnel)
Ordering Constraints:
Colocation Constraints:
Ticket Constraints:
Alerts:
No alerts defined
Resources Defaults:
No defaults set
Operations Defaults:
No defaults set
Cluster Properties:
cluster-infrastructure: corosync
cluster-name: my_app_cluster
dc-version: 1.1.16-12.el7_4.8-94ff4df
have-watchdog: false
stonith-enabled: false
Quorum:
Options:
Edit 2:
When I run crm_simulate -sL, I get the following output:
[root@server_a ~]# crm_simulate -sL
Current cluster status:
Online: [ server_a.test.local server_d.test.local ]
Resource Group: my_app
ClusterIP (ocf::heartbeat:IPaddr2): Started server_a.test.local
Apache (systemd:httpd): Started server_a.test.local
stunnel (systemd:stunnel-my_app): Started server_a.test.local
my_app-daemon (systemd:my_app): Started server_a.test.local
Allocation scores:
group_color: my_app allocation score on server_a.test.local: 0
group_color: my_app allocation score on server_d.test.local: 0
group_color: ClusterIP allocation score on server_a.test.local: 0
group_color: ClusterIP allocation score on server_d.test.local: INFINITY
group_color: Apache allocation score on server_a.test.local: 0
group_color: Apache allocation score on server_d.test.local: INFINITY
group_color: stunnel allocation score on server_a.test.local: INFINITY
group_color: stunnel allocation score on server_d.test.local: 0
group_color: my_app-daemon allocation score on server_a.test.local: INFINITY
group_color: my_app-daemon allocation score on server_d.test.local: 0
native_color: ClusterIP allocation score on server_a.test.local: INFINITY
native_color: ClusterIP allocation score on server_d.test.local: INFINITY
native_color: Apache allocation score on server_a.test.local: INFINITY
native_color: Apache allocation score on server_d.test.local: -INFINITY
native_color: stunnel allocation score on server_a.test.local: INFINITY
native_color: stunnel allocation score on server_d.test.local: -INFINITY
native_color: my_app-daemon allocation score on server_a.test.local: INFINITY
native_color: my_app-daemon allocation score on server_d.test.local: -INFINITY
Transition Summary:
Next, I deleted all of the resources and added them back (again, exactly as before - I had it documented), and when I run crm_simulate -sL I now get a different result:
[root@server_a ~]# crm_simulate -sL
Current cluster status:
Online: [ server_a.test.local server_d.test.local ]
Resource Group: my_app
ClusterIP (ocf::heartbeat:IPaddr2): Started server_a.test.local
Apache (systemd:httpd): Started server_a.test.local
stunnel (systemd:stunnel-my_app.service): Started server_a.test.local
my_app-daemon (systemd:my_app.service): Started server_a.test.local
Allocation scores:
group_color: my_app allocation score on server_a.test.local: 0
group_color: my_app allocation score on server_d.test.local: 0
group_color: ClusterIP allocation score on server_a.test.local: 0
group_color: ClusterIP allocation score on server_d.test.local: 0
group_color: Apache allocation score on server_a.test.local: 0
group_color: Apache allocation score on server_d.test.local: 0
group_color: stunnel allocation score on server_a.test.local: 0
group_color: stunnel allocation score on server_d.test.local: 0
group_color: my_app-daemon allocation score on server_a.test.local: 0
group_color: my_app-daemon allocation score on server_d.test.local: 0
native_color: ClusterIP allocation score on server_a.test.local: 0
native_color: ClusterIP allocation score on server_d.test.local: 0
native_color: Apache allocation score on server_a.test.local: 0
native_color: Apache allocation score on server_d.test.local: -INFINITY
native_color: stunnel allocation score on server_a.test.local: 0
native_color: stunnel allocation score on server_d.test.local: -INFINITY
native_color: my_app-daemon allocation score on server_a.test.local: 0
native_color: my_app-daemon allocation score on server_d.test.local: -INFINITY
I can now move the resources, but when I do so and run crm_simulate -sL again, the output is different from before!
[root@server_a ~]# crm_simulate -sL
Current cluster status:
Online: [ server_a.test.local server_d.test.local ]
Resource Group: my_app
ClusterIP (ocf::heartbeat:IPaddr2): Started server_d.test.local
Apache (systemd:httpd): Started server_d.test.local
stunnel (systemd:stunnel-my_app.service): Started server_d.test.local
my_app-daemon (systemd:my_app.service): Started server_d.test.local
Allocation scores:
group_color: my_app allocation score on server_a.test.local: 0
group_color: my_app allocation score on server_d.test.local: 0
group_color: ClusterIP allocation score on server_a.test.local: 0
group_color: ClusterIP allocation score on server_d.test.local: INFINITY
group_color: Apache allocation score on server_a.test.local: 0
group_color: Apache allocation score on server_d.test.local: 0
group_color: stunnel allocation score on server_a.test.local: 0
group_color: stunnel allocation score on server_d.test.local: 0
group_color: my_app-daemon allocation score on server_a.test.local: 0
group_color: my_app-daemon allocation score on server_d.test.local: 0
native_color: ClusterIP allocation score on server_a.test.local: 0
native_color: ClusterIP allocation score on server_d.test.local: INFINITY
native_color: Apache allocation score on server_a.test.local: -INFINITY
native_color: Apache allocation score on server_d.test.local: 0
native_color: stunnel allocation score on server_a.test.local: -INFINITY
native_color: stunnel allocation score on server_d.test.local: 0
native_color: my_app-daemon allocation score on server_a.test.local: -INFINITY
native_color: my_app-daemon allocation score on server_d.test.local: 0
Transition Summary:
I am a bit confused :/ Is this expected behaviour?
Answer 1
I am not sure this is the right answer in the end, but after a closer look at man pcs I found:

move <resource id> [destination node] [--master] [lifetime=<lifetime>] [--wait[=n]]
Move the resource off the node it is currently running on by creating a -INFINITY location constraint to ban the node. If destination node is specified the resource will be moved to that node by creating an INFINITY location constraint to prefer the destination node. If --master is used the scope of the command is limited to the master role and you must use the master id (instead of the resource id). If lifetime is specified then the constraint will expire after that time, otherwise it defaults to infinity and the constraint can be cleared manually with 'pcs resource clear' or 'pcs constraint delete'. If --wait is specified, pcs will wait up to 'n' seconds for the resource to move and then return 0 on success or 1 on error. If 'n' is not specified it defaults to 60 minutes. If you want the resource to preferably avoid running on some nodes but be able to failover to them use 'pcs constraint location avoids'.
Using pcs resource clear removed the constraints, and I was able to move the resources again.
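In case it helps anyone else, this is roughly what I ran to recover (a sketch only - the constraint IDs are the cli-prefer-* ones visible in the pcs config output above):

# list every constraint together with its ID, including the cli-prefer-* ones left behind by earlier moves
pcs constraint --full
# clear the leftover move/ban constraints for each resource in the group
pcs resource clear ClusterIP
pcs resource clear Apache
pcs resource clear stunnel
pcs resource clear my_app-daemon
# after that, the move works again
pcs resource move ClusterIP server_a.test.local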
Answer 2
The score:INFINITY preference constraints on all of the grouped resources are likely the problem. INFINITY is actually equal to 1,000,000 in Pacemaker, which is the highest value a score can be assigned. The following holds when using INFINITY (from the ClusterLabs documentation):

6.1.1. Infinity Math
Pacemaker implements INFINITY (or equivalently, +INFINITY) internally as a score of 1,000,000. Addition and subtraction with it follow these three basic rules:
Any value + INFINITY = INFINITY
Any value - INFINITY = -INFINITY
INFINITY - INFINITY = -INFINITY
Try changing your preference scores to something like 1,000 or 10,000 instead of INFINITY, and run your tests again.
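A minimal sketch of what that could look like, using the names from your pcs config (an assumption on my part - remove the existing cli-prefer-* constraints first, and treat the 10000 score purely as an example value):

# drop the stale INFINITY preferences created by earlier "pcs resource move" calls
pcs constraint remove cli-prefer-ClusterIP
pcs constraint remove cli-prefer-Apache
pcs constraint remove cli-prefer-stunnel
pcs constraint remove cli-prefer-my_app-daemon
# prefer server_a with a finite score that the cluster can still override when it has to
pcs constraint location ClusterIP prefers server_a.test.local=10000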