My question is about Pacemaker. Suppose a Pacemaker cluster has two resources, and one of them is starting, which takes, say, 3 minutes. If the other resource's monitor fails during those 3 minutes, Pacemaker does not immediately call stop/start to restart it; it waits until the first resource has finished starting. Only after the first resource has started completely does the second resource begin restarting. Does anyone know why? Thank you very much! My cluster versions: corosync 2.3.4, pacemaker 1.1.13.
-1
-
Please consider providing the configuration and referring to the resources specifically as named in the configuration. It is very difficult to infer reasons without configurations. – Dok Oct 24 '17 at 15:11
-
I have pasted the cluster and resource configuration in the answers. Do you know the reason? I have been confused by this issue for several days. Thank you very much! – James Oct 26 '17 at 09:29
2 Answers
0
My cluster configuration is as follows. For debugging, I have added "sleep 60" to the start function of the OCF resource agent.
crm configure show
node 168002177: 192.168.2.177
node 168002178: 192.168.2.178
node 168002179: 192.168.2.179
primitive fm_mgt fm_mgt \
    op monitor interval=20s timeout=120s \
    op stop interval=0 timeout=120s on-fail=restart \
    op start interval=0 timeout=120s on-fail=restart \
    meta target-role=Started
primitive logserver logserver \
    op monitor interval=20s timeout=120s \
    op stop interval=0 timeout=120s on-fail=restart \
    op start interval=0 timeout=120s on-fail=restart \
    meta target-role=Started
clone fm_mgt_replica fm_mgt
clone logserver_replica logserver
property cib-bootstrap-options: \
    have-watchdog=false \
    dc-version=1.1.13-10.el7-44eb2dd \
    cluster-infrastructure=corosync \
    stonith-enabled=false \
    start-failure-is-fatal=false
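
For reference, the debug delay in the start function looks roughly like the sketch below. This is only an illustration, not the actual agent: the function name fm_mgt_start and the START_DELAY variable are hypothetical, and OCF_SUCCESS is defined locally here so the snippet runs on its own (a real agent would normally get it from the OCF shell function library).

```shell
#!/bin/sh
# Minimal sketch of an OCF-style start action with an artificial delay.
# Assumption: names below are illustrative, not taken from the real agent.

OCF_SUCCESS=0                      # OCF return code for success

fm_mgt_start() {
    # Debug delay: makes the start action take a long time (default 60s),
    # so the behaviour of other resources can be observed meanwhile.
    sleep "${START_DELAY:-60}"

    # A real agent would launch the service here and verify it is running.

    return "$OCF_SUCCESS"
}
```

With the default of 60 seconds the start action stays well inside the 120s start timeout from the configuration above, so the delay itself should not be reported as a start failure.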

James
0
When I kill the fm_mgt service on node 177 and then kill the logserver service on node 177, fm_mgt needs at least one minute to start. During that minute, logserver is not restarted; it is restarted only after fm_mgt has recovered completely.
crm status
Last updated: Thu Oct 26 06:40:24 2017 Last change: Thu Oct 26 06:36:33 2017 by root via crm_resource on 192.168.2.177
Stack: corosync
Current DC: 192.168.2.179 (version 1.1.13-10.el7-44eb2dd) - partition with quorum
3 nodes and 6 resources configured
Online: [ 192.168.2.177 192.168.2.178 192.168.2.179 ]
Full list of resources:
 Clone Set: logserver_replica [logserver]
     logserver (ocf::heartbeat:logserver): FAILED 192.168.2.177
     Started: [ 192.168.2.178 192.168.2.179 ]
 Clone Set: fm_mgt_replica [fm_mgt]
     Started: [ 192.168.2.178 192.168.2.179 ]
     Stopped: [ 192.168.2.177 ]

James