For some reason, I am no longer able to move resources with pcs
pacemaker-1.1.16-12.el7_4.8.x86_64
corosync-2.4.0-9.el7_4.2.x86_64
pcs-0.9.158-6.el7.centos.1.x86_64
Linux server_a.test.local 3.10.0-693.el7.x86_64
I have 4 resources configured as part of a resource group. Here is the log from when I tried to move the ClusterIP resource from server_d to server_a using:
pcs resource move ClusterIP server_a.test.local
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_process_request: Forwarding cib_delete operation for section constraints to all (origin=local/crm_resource/3)
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: Diff: --- 0.24.0 2
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: Diff: +++ 0.25.0 (null)
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: -- /cib/configuration/constraints/rsc_location[@id='cli-prefer-ClusterIP']
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: + /cib: @epoch=25
Apr 06 12:16:26 [17292] server_d.test.local crmd: info: abort_transition_graph: Transition aborted by deletion of rsc_location[@id='cli-prefer-ClusterIP']: Configuration change | cib=0.25.0 source=te_update_diff:456 path=/cib/configuration/constraints/rsc_location[@id='cli-prefer-ClusterIP'] complete=true
Apr 06 12:16:26 [17292] server_d.test.local crmd: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE | input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_process_request: Completed cib_delete operation for section constraints: OK (rc=0, origin=server_d.test.local/crm_resource/3, version=0.25.0)
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: determine_online_status: Node server_a.test.local is online
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: determine_online_status: Node server_d.test.local is online
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 1 is already processed
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 2 is already processed
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 1 is already processed
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 2 is already processed
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: group_print: Resource Group: my_app
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: common_print: ClusterIP (ocf::heartbeat:IPaddr2): Started server_d.test.local
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: common_print: Apache (systemd:httpd): Started server_d.test.local
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: common_print: stunnel (systemd:stunnel-my_app): Started server_d.test.local
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: common_print: my_app-daemon (systemd:my_app): Started server_d.test.local
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: LogActions: Leave ClusterIP (Started server_d.test.local)
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: LogActions: Leave Apache (Started server_d.test.local)
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: LogActions: Leave stunnel (Started server_d.test.local)
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: LogActions: Leave my_app-daemon (Started server_d.test.local)
Apr 06 12:16:26 [17291] server_d.test.local pengine: notice: process_pe_message: Calculated transition 8, saving inputs in /var/lib/pacemaker/pengine/pe-input-18.bz2
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_process_request: Forwarding cib_modify operation for section constraints to all (origin=local/crm_resource/4)
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: Diff: --- 0.25.0 2
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: Diff: +++ 0.26.0 (null)
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: + /cib: @epoch=26
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: ++ /cib/configuration/constraints: <rsc_location id="cli-prefer-ClusterIP" rsc="ClusterIP" role="Started" node="server_a.test.local" score="INFINITY"/>
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_process_request: Completed cib_modify operation for section constraints: OK (rc=0, origin=server_d.test.local/crm_resource/4, version=0.26.0)
Apr 06 12:16:26 [17292] server_d.test.local crmd: info: abort_transition_graph: Transition aborted by rsc_location.cli-prefer-ClusterIP 'create': Configuration change | cib=0.26.0 source=te_update_diff:456 path=/cib/configuration/constraints complete=true
Apr 06 12:16:26 [17292] server_d.test.local crmd: info: handle_response: pe_calc calculation pe_calc-dc-1523016986-67 is obsolete
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: determine_online_status: Node server_a.test.local is online
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: determine_online_status: Node server_d.test.local is online
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 1 is already processed
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 2 is already processed
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 1 is already processed
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 2 is already processed
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: group_print: Resource Group: my_app
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: common_print: ClusterIP (ocf::heartbeat:IPaddr2): Started server_d.test.local
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: common_print: Apache (systemd:httpd): Started server_d.test.local
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: common_print: stunnel (systemd:stunnel-my_app): Started server_d.test.local
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: common_print: my_app-daemon (systemd:my_app): Started server_d.test.local
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: LogActions: Leave ClusterIP (Started server_d.test.local)
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: LogActions: Leave Apache (Started server_d.test.local)
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: LogActions: Leave stunnel (Started server_d.test.local)
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: LogActions: Leave my_app-daemon (Started server_d.test.local)
Apr 06 12:16:27 [17291] server_d.test.local pengine: notice: process_pe_message: Calculated transition 9, saving inputs in /var/lib/pacemaker/pengine/pe-input-19.bz2
Apr 06 12:16:27 [17292] server_d.test.local crmd: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE | input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response
Apr 06 12:16:27 [17292] server_d.test.local crmd: info: do_te_invoke: Processing graph 9 (ref=pe_calc-dc-1523016987-68) derived from /var/lib/pacemaker/pengine/pe-input-19.bz2
Apr 06 12:16:27 [17292] server_d.test.local crmd: notice: run_graph: Transition 9 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-19.bz2): Complete
Apr 06 12:16:27 [17292] server_d.test.local crmd: info: do_log: Input I_TE_SUCCESS received in state S_TRANSITION_ENGINE from notify_crmd
Apr 06 12:16:27 [17292] server_d.test.local crmd: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE | input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd
Apr 06 12:16:27 [17287] server_d.test.local cib: info: cib_file_backup: Archived previous version as /var/lib/pacemaker/cib/cib-34.raw
Apr 06 12:16:27 [17287] server_d.test.local cib: info: cib_file_write_with_digest: Wrote version 0.25.0 of the CIB to disk (digest: 7511cba55b6c2f2f481a51d5585b8d36)
Apr 06 12:16:27 [17287] server_d.test.local cib: info: cib_file_write_with_digest: Reading cluster configuration file /var/lib/pacemaker/cib/cib.tPIv7m (digest: /var/lib/pacemaker/cib/cib.OwHiKz)
Apr 06 12:16:27 [17287] server_d.test.local cib: info: cib_file_backup: Archived previous version as /var/lib/pacemaker/cib/cib-35.raw
Apr 06 12:16:27 [17287] server_d.test.local cib: info: cib_file_write_with_digest: Wrote version 0.26.0 of the CIB to disk (digest: 7f962ed676a49e84410eee2ee04bae8c)
Apr 06 12:16:27 [17287] server_d.test.local cib: info: cib_file_write_with_digest: Reading cluster configuration file /var/lib/pacemaker/cib/cib.MnRP4u (digest: /var/lib/pacemaker/cib/cib.B5sWNH)
Apr 06 12:16:31 [17287] server_d.test.local cib: info: cib_process_ping: Reporting our current digest to server_d.test.local: 8182592cb4922cbf007158ab0a277190 for 0.26.0 (0x5575234afde0 0)
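The lines that matter most in the log above are the LogActions entries: in both policy-engine runs (transitions 8 and 9) every resource is marked "Leave", so the new cli-prefer-ClusterIP constraint did not result in any planned move. For anyone who wants to pull those decisions out of a longer log, here is a minimal sketch, assuming the log format shown above. The function name planned_actions and the embedded excerpt are illustrative only and not part of any pacemaker tooling.

```python
import re

# Excerpt copied from the log above (transition 9, after the constraint was created).
LOG_EXCERPT = """\
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: LogActions: Leave ClusterIP (Started server_d.test.local)
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: LogActions: Leave Apache (Started server_d.test.local)
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: LogActions: Leave stunnel (Started server_d.test.local)
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: LogActions: Leave my_app-daemon (Started server_d.test.local)
"""

# Matches e.g. "LogActions: Leave ClusterIP (Started server_d.test.local)".
ACTION_RE = re.compile(
    r"LogActions:\s+(?P<action>\w+)\s+(?P<rsc>\S+)\s+\((?P<state>[^)]*)\)"
)


def planned_actions(log_text):
    """Return (resource, action, state) tuples for every LogActions line."""
    return [
        (m.group("rsc"), m.group("action"), m.group("state"))
        for m in ACTION_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    for rsc, action, state in planned_actions(LOG_EXCERPT):
        print(f"{rsc}: {action} ({state})")
```

Any action other than "Leave" in that list would indicate that the policy engine actually scheduled something for the resource.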
Note that if I run pcs cluster stop server_b.test.local, all resources in the configured group are moved to the other node.
What is going on? As I said, this used to work, and no changes have been made since.
Thanks in advance!
EDIT:
Output of pcs config:
[root@server_a ~]# pcs config
Cluster Name: my_app_cluster
Corosync Nodes:
server_a.test.local server_d.test.local
Pacemaker Nodes:
server_a.test.local server_d.test.local
Resources:
Group: my_app
Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2)
Attributes: cidr_netmask=24 ip=10.116.63.49
Operations: monitor interval=10s timeout=20s (ClusterIP-monitor-interval-10s)
start interval=0s timeout=20s (ClusterIP-start-interval-0s)
stop interval=0s timeout=20s (ClusterIP-stop-interval-0s)
Resource: Apache (class=systemd type=httpd)
Operations: monitor interval=60 timeout=100 (Apache-monitor-interval-60)
start interval=0s timeout=100 (Apache-start-interval-0s)
stop interval=0s timeout=100 (Apache-stop-interval-0s)
Resource: stunnel (class=systemd type=stunnel-my_app)
Operations: monitor interval=60 timeout=100 (stunnel-monitor-interval-60)
start interval=0s timeout=100 (stunnel-start-interval-0s)
stop interval=0s timeout=100 (stunnel-stop-interval-0s)
Resource: my_app-daemon (class=systemd type=my_app)
Operations: monitor interval=60 timeout=100 (my_app-daemon-monitor-interval-60)
start interval=0s timeout=100 (my_app-daemon-start-interval-0s)
stop interval=0s timeout=100 (my_app-daemon-stop-interval-0s)
Stonith Devices:
Fencing Levels:
Location Constraints:
Resource: Apache
Enabled on: server_d.test.local (score:INFINITY) (role: Started) (id:cli-prefer-Apache)
Resource: ClusterIP
Enabled on: server_a.test.local (score:INFINITY) (role: Started) (id:cli-prefer-ClusterIP)
Resource: my_app-daemon
Enabled on: server_a.test.local (score:INFINITY) (role: Started) (id:cli-prefer-my_app-daemon)
Resource: stunnel
Enabled on: server_a.test.local (score:INFINITY) (role: Started) (id:cli-prefer-stunnel)
Ordering Constraints:
Colocation Constraints:
Ticket Constraints:
Alerts:
No alerts defined
Resources Defaults:
No defaults set
Operations Defaults:
No defaults set
Cluster Properties:
cluster-infrastructure: corosync
cluster-name: my_app_cluster
dc-version: 1.1.16-12.el7_4.8-94ff4df
have-watchdog: false
stonith-enabled: false
Quorum:
Options:
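The location constraints in this output are easier to compare when grouped by the node they point at. The log above shows that the cli-prefer-ClusterIP constraint was created by the move command, so constraints with the cli-prefer- prefix are the interesting ones here. Below is a minimal sketch, assuming the text layout printed by this pcs version. The helper name cli_prefer_by_node is hypothetical, and the excerpt is copied from the output above.

```python
import re
from collections import defaultdict

# "Location Constraints" section copied from the pcs config output above.
PCS_CONFIG_EXCERPT = """\
Location Constraints:
Resource: Apache
Enabled on: server_d.test.local (score:INFINITY) (role: Started) (id:cli-prefer-Apache)
Resource: ClusterIP
Enabled on: server_a.test.local (score:INFINITY) (role: Started) (id:cli-prefer-ClusterIP)
Resource: my_app-daemon
Enabled on: server_a.test.local (score:INFINITY) (role: Started) (id:cli-prefer-my_app-daemon)
Resource: stunnel
Enabled on: server_a.test.local (score:INFINITY) (role: Started) (id:cli-prefer-stunnel)
"""

# Captures the node, the score and the constraint id of an "Enabled on:" line.
ENABLED_RE = re.compile(
    r"Enabled on:\s+(?P<node>\S+)\s+\(score:(?P<score>[^)]+)\).*\(id:(?P<id>[^)]+)\)"
)


def cli_prefer_by_node(text):
    """Group cli-prefer-* constraint ids by the node they point at."""
    by_node = defaultdict(list)
    for line in text.splitlines():
        match = ENABLED_RE.search(line)
        if match and match.group("id").startswith("cli-prefer-"):
            by_node[match.group("node")].append(match.group("id"))
    return dict(by_node)


if __name__ == "__main__":
    for node, ids in sorted(cli_prefer_by_node(PCS_CONFIG_EXCERPT).items()):
        print(node, ids)
```

Run against the excerpt, this groups cli-prefer-Apache under server_d.test.local and the other three constraints under server_a.test.local, which makes it obvious that members of the same group are pinned to different nodes with an INFINITY score.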
EDIT 2:
When I run crm_simulate -sL, I get the following output:
[root@server_a ~]# crm_simulate -sL
Current cluster status:
Online: [ server_a.test.local server_d.test.local ]
Resource Group: my_app
ClusterIP (ocf::heartbeat:IPaddr2): Started server_a.test.local
Apache (systemd:httpd): Started server_a.test.local
stunnel (systemd:stunnel-my_app): Started server_a.test.local
my_app-daemon (systemd:my_app): Started server_a.test.local
Allocation scores:
group_color: my_app allocation score on server_a.test.local: 0
group_color: my_app allocation score on server_d.test.local: 0
group_color: ClusterIP allocation score on server_a.test.local: 0
group_color: ClusterIP allocation score on server_d.test.local: INFINITY
group_color: Apache allocation score on server_a.test.local: 0
group_color: Apache allocation score on server_d.test.local: INFINITY
group_color: stunnel allocation score on server_a.test.local: INFINITY
group_color: stunnel allocation score on server_d.test.local: 0
group_color: my_app-daemon allocation score on server_a.test.local: INFINITY
group_color: my_app-daemon allocation score on server_d.test.local: 0
native_color: ClusterIP allocation score on server_a.test.local: INFINITY
native_color: ClusterIP allocation score on server_d.test.local: INFINITY
native_color: Apache allocation score on server_a.test.local: INFINITY
native_color: Apache allocation score on server_d.test.local: -INFINITY
native_color: stunnel allocation score on server_a.test.local: INFINITY
native_color: stunnel allocation score on server_d.test.local: -INFINITY
native_color: my_app-daemon allocation score on server_a.test.local: INFINITY
native_color: my_app-daemon allocation score on server_d.test.local: -INFINITY
Transition Summary:
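To compare the three crm_simulate -sL outputs in this post side by side, it helps to turn the allocation scores into a small table per resource. This is a rough sketch that assumes the line format shown above; the function name native_scores is made up for illustration, and only the native_color lines from the first output are embedded.

```python
import re

# native_color lines copied from the first crm_simulate -sL output above.
SCORES_EXCERPT = """\
native_color: ClusterIP allocation score on server_a.test.local: INFINITY
native_color: ClusterIP allocation score on server_d.test.local: INFINITY
native_color: Apache allocation score on server_a.test.local: INFINITY
native_color: Apache allocation score on server_d.test.local: -INFINITY
native_color: stunnel allocation score on server_a.test.local: INFINITY
native_color: stunnel allocation score on server_d.test.local: -INFINITY
native_color: my_app-daemon allocation score on server_a.test.local: INFINITY
native_color: my_app-daemon allocation score on server_d.test.local: -INFINITY
"""

# Matches "<kind>: <resource> allocation score on <node>: <score>".
SCORE_RE = re.compile(
    r"^\s*(?P<kind>\w+): (?P<rsc>\S+) allocation score on (?P<node>\S+): (?P<score>\S+)\s*$"
)


def native_scores(text):
    """Build {resource: {node: score}} from the native_color lines only."""
    table = {}
    for line in text.splitlines():
        match = SCORE_RE.match(line)
        if match and match.group("kind") == "native_color":
            table.setdefault(match.group("rsc"), {})[match.group("node")] = match.group("score")
    return table


if __name__ == "__main__":
    for rsc, per_node in native_scores(SCORES_EXCERPT).items():
        print(rsc, per_node)
```

Feeding each of the outputs through the same function and diffing the resulting dictionaries shows exactly which per-node scores changed between runs.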
I then deleted all the resources and added them back (exactly as before; I have this documented), and running crm_simulate -sL now gives different results:
[root@server_a ~]# crm_simulate -sL
Current cluster status:
Online: [ server_a.test.local server_d.test.local ]
Resource Group: my_app
ClusterIP (ocf::heartbeat:IPaddr2): Started server_a.test.local
Apache (systemd:httpd): Started server_a.test.local
stunnel (systemd:stunnel-my_app.service): Started server_a.test.local
my_app-daemon (systemd:my_app.service): Started server_a.test.local
Allocation scores:
group_color: my_app allocation score on server_a.test.local: 0
group_color: my_app allocation score on server_d.test.local: 0
group_color: ClusterIP allocation score on server_a.test.local: 0
group_color: ClusterIP allocation score on server_d.test.local: 0
group_color: Apache allocation score on server_a.test.local: 0
group_color: Apache allocation score on server_d.test.local: 0
group_color: stunnel allocation score on server_a.test.local: 0
group_color: stunnel allocation score on server_d.test.local: 0
group_color: my_app-daemon allocation score on server_a.test.local: 0
group_color: my_app-daemon allocation score on server_d.test.local: 0
native_color: ClusterIP allocation score on server_a.test.local: 0
native_color: ClusterIP allocation score on server_d.test.local: 0
native_color: Apache allocation score on server_a.test.local: 0
native_color: Apache allocation score on server_d.test.local: -INFINITY
native_color: stunnel allocation score on server_a.test.local: 0
native_color: stunnel allocation score on server_d.test.local: -INFINITY
native_color: my_app-daemon allocation score on server_a.test.local: 0
native_color: my_app-daemon allocation score on server_d.test.local: -INFINITY
Now I am able to move resources, but when I do so and run crm_simulate -sL again, the output is different from the previous one:
[root@server_a ~]# crm_simulate -sL
Current cluster status:
Online: [ server_a.test.local server_d.test.local ]
Resource Group: my_app
ClusterIP (ocf::heartbeat:IPaddr2): Started server_d.test.local
Apache (systemd:httpd): Started server_d.test.local
stunnel (systemd:stunnel-my_app.service): Started server_d.test.local
my_app-daemon (systemd:my_app.service): Started server_d.test.local
Allocation scores:
group_color: my_app allocation score on server_a.test.local: 0
group_color: my_app allocation score on server_d.test.local: 0
group_color: ClusterIP allocation score on server_a.test.local: 0
group_color: ClusterIP allocation score on server_d.test.local: INFINITY
group_color: Apache allocation score on server_a.test.local: 0
group_color: Apache allocation score on server_d.test.local: 0
group_color: stunnel allocation score on server_a.test.local: 0
group_color: stunnel allocation score on server_d.test.local: 0
group_color: my_app-daemon allocation score on server_a.test.local: 0
group_color: my_app-daemon allocation score on server_d.test.local: 0
native_color: ClusterIP allocation score on server_a.test.local: 0
native_color: ClusterIP allocation score on server_d.test.local: INFINITY
native_color: Apache allocation score on server_a.test.local: -INFINITY
native_color: Apache allocation score on server_d.test.local: 0
native_color: stunnel allocation score on server_a.test.local: -INFINITY
native_color: stunnel allocation score on server_d.test.local: 0
native_color: my_app-daemon allocation score on server_a.test.local: -INFINITY
native_color: my_app-daemon allocation score on server_d.test.local: 0
Transition Summary:
I'm a bit confused :/ Is this expected behavior?