pacemaker failover nginx only once

Question

I set up a two-node cluster with Pacemaker 1.1.10 on CentOS 7, then downloaded a resource agent for nginx from GitHub.

I tested my setup like this:

1. Node1 is started with nginx and the VIP; everything is OK.
2. I kill nginx on Node1 and wait a few seconds.
3. nginx and the VIP move to Node2; the failover succeeded, and Node1 no longer has any active resources.
4. I kill nginx on Node2, but nginx and the VIP don't come back to Node1.

I set no-quorum-policy="ignore" and stonith-enabled="false".

Why won't pacemaker let the resource come back to Node1? What did I miss here?

I tried putting the VIP and nginx into a resource group; after I killed nginx the second time, the entire cluster shut down. — jacob, Aug 08 '15 at 12:45
Note that after I killed node1 and then ran pcs status, the output listed a failed action: nginx_monitor_60000 on node1 'not running' (7): call=11, status=complete, last-rc-change='Sun Aug 9 12:34:47 2015', queued=0ms, exec=0ms — jacob, Aug 09 '15 at 05:10
I think I solved my problem with the commands below:
    pcs property --force set migration-threshold=1
    pcs property --force set failure-timeout=15s
    pcs property --force set cluster-recheck-interval=30s
— jacob, Aug 09 '15 at 07:18

score 1 · Answer 1 · answered Aug 11 '15 at 05:07

It doesn't move because recovering a resource does not necessarily mean moving it to another node. Apparently the cluster still considers node2 the best place for nginx and the VIP.
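
One way to see why the cluster prefers node2 is to look at the recorded failures and the placement scores. The following is only a sketch: it assumes the resource id is nginx (as the failed action nginx_monitor_60000 in the comments suggests) and the pcs 0.9 command syntax shipped with CentOS 7.

```shell
# Show the overall cluster state, including any failed actions.
pcs status

# Show the fail count recorded for the nginx resource (assumed resource id).
pcs resource failcount show nginx

# Print the allocation scores computed from the live cluster (-L),
# which indicate where each resource is allowed or preferred to run (-s).
crm_simulate -sL
```

A node whose fail count for a resource has reached migration-threshold stays off limits for that resource until the failure is cleared or expires, which would explain why nothing moves back.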

Use migration-threshold and failure-timeout to control when resources need to be moved away and when they can come back. Also note that a failed start operation is one case where we will definitely move the resource away.
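
As a rough sketch of how that might look with pcs: migration-threshold and failure-timeout are resource meta-attributes, while cluster-recheck-interval is a cluster property. The resource id nginx and the values 1, 15s and 30s are taken from the question's comments; treat them as assumptions to adapt, not as recommendations.

```shell
# Move nginx away after a single failure on a node, and let that failure
# expire after 15 seconds so the node becomes eligible again.
pcs resource meta nginx migration-threshold=1 failure-timeout=15s

# Expired failures are only acted on when the cluster re-evaluates its state,
# so shorten the recheck interval (the default is 15 minutes).
pcs property set cluster-recheck-interval=30s

# Alternatively, clear the failure history by hand to allow failback right away.
pcs resource cleanup nginx
```

To apply the same values to every resource rather than just one, pcs 0.9 also accepts them as resource defaults: pcs resource defaults migration-threshold=1 failure-timeout=15s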

Best to ask these sorts of questions on the upstream mailing list where we can ask for more information (like logs). See http://clusterlabs.org/help.html
