
I am encountering an issue with colocation constraints.

I have created a 4-node cluster (3 "main" nodes and 1 "spare") with 3 resources. Each resource should run only on its own node or on the spare, and the resources should never run together on the spare.

When the resources are created with adequate priorities, they indeed run on their respective "main" nodes as expected.

If I add one colocation constraint (resource3 cannot run with resource2), the resources correctly remain on their nodes.

But as soon as I add a second colocation constraint (resource2 cannot run with resource1), resource1 moves to the spare node, and I cannot understand why.

Can somebody explain this behavior?

Resource setup:

pcs property set symmetric-cluster=false

pcs resource create TestResourceNode1 ocf:pacemaker:Dummy op monitor interval=120s
pcs constraint location TestResourceNode1 prefers node1=100
pcs constraint location TestResourceNode1 prefers nodespare=80

pcs resource create TestResourceNode2 ocf:pacemaker:Dummy op monitor interval=120s
pcs constraint location TestResourceNode2 prefers node2=50
pcs constraint location TestResourceNode2 prefers nodespare=30

pcs resource create TestResourceNode3 ocf:pacemaker:Dummy op monitor interval=120s
pcs constraint location TestResourceNode3 prefers node3=10
pcs constraint location TestResourceNode3 prefers nodespare=1

Constraint setup:

pcs constraint colocation add TestResourceNode3 with TestResourceNode2 score=-INFINITY
# OK, resources are still running on node1, node2, node3
pcs constraint colocation add TestResourceNode2 with TestResourceNode1 score=-INFINITY
# KO, resource TestResourceNode1 has moved to nodespare, why ???
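
For reference, the configured constraints and the current placement can be listed with standard pcs subcommands (shown here only for completeness):

# List the configured location and colocation constraints
pcs constraint
# Show where each resource is currently running
pcs status resources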

2 Answers


If you look at the output of crm_mon, you'll notice that a single node is running as the cluster's DC. This is the node that is currently running Pacemaker's policy engine (pengine). In its logs (/var/log/messages or /var/log/syslog), you should see messages at the time the resource was moved that look something like this:

pengine[6132]:   notice: process_pe_message: Calculated Transition 7: /var/lib/pacemaker/pengine/pe-input-4424.bz2
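
If you are not sure which node is the DC or where these messages are, something like the following should get you there (a rough sketch; the exact log file and message wording depend on your distribution and Pacemaker version):

# Print the cluster status once; the "Current DC:" line names the node running pengine
crm_mon -1

# On that node, list the transitions the policy engine calculated
grep 'Calculated Transition' /var/log/messages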

You can use the crm_simulate utility to inspect these policy engine input files and see "what the cluster was thinking" when it performed those actions. The move likely has something to do with resource scores, so I would start by checking those:

$ crm_simulate -s -x /var/lib/pacemaker/pengine/pe-input-4424.bz2

Then inspect the surrounding pe-input files to understand the effect your resources' preference scores and constraints had on the policy engine.
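
For example, a minimal sketch of comparing a few consecutive inputs (the file numbers are placeholders for the transitions from your own logs, and the grep assumes the usual "allocation score" lines in the -s output):

for i in 4423 4424 4425; do
    echo "=== pe-input-$i ==="
    # Show only the per-node allocation scores for each resource
    crm_simulate -s -x /var/lib/pacemaker/pengine/pe-input-$i.bz2 | grep 'allocation score'
done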

Debugging Pacemaker's policy engine can be tricky, so I would recommend tuning and testing preference scores before spending too much time with crm_simulate. Maybe "heavier" resource scores like these will just work:

pcs resource create TestResourceNode1 ocf:pacemaker:Dummy op monitor interval=120s
pcs constraint location TestResourceNode1 prefers node1=10000
pcs constraint location TestResourceNode1 prefers nodespare=800

pcs resource create TestResourceNode2 ocf:pacemaker:Dummy op monitor interval=120s
pcs constraint location TestResourceNode2 prefers node2=5000
pcs constraint location TestResourceNode2 prefers nodespare=300

pcs resource create TestResourceNode3 ocf:pacemaker:Dummy op monitor interval=120s
pcs constraint location TestResourceNode3 prefers node3=1000
pcs constraint location TestResourceNode3 prefers nodespare=10
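
Either way, you can check the allocation scores the policy engine computes for the live cluster before and after changing the constraints (crm_simulate's -L option reads the live CIB, -s prints the scores):

# Show placement decisions and allocation scores for the current cluster state
crm_simulate -sL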

Hope that helps!

Thank you for your input. After contacting the Pacemaker developers, it appears that transitive colocation constraints with -INF scores are not well supported, resulting in invalid location scores. https://bugs.clusterlabs.org/show_bug.cgi?id=5320 – JohnLoopM Mar 29 '19 at 07:59

To anyone looking for a solution to a similar problem.

Transitive colocation constraints for resources with -INF scores (for example: r1 with r2 -INF and r2 with r3 -INF) result in invalid placement. See https://bugs.clusterlabs.org/show_bug.cgi?id=5320.

One workaround is to assign utilization attributes to the nodes and resources so that no two of these resources can be placed on the same node at once.

Example configuration:

# Opt-in cluster, resources will not run anywhere by default
pcs property set symmetric-cluster=false
# Set placement strategy to utilization
pcs property set placement-strategy=utilization

pcs resource create TestResourceNode1 ocf:pacemaker:Dummy op monitor interval=120s
pcs constraint location TestResourceNode1 prefers node1=100
pcs constraint location TestResourceNode1 prefers nodespare=80
crm_resource --meta --resource TestResourceNode1 --set-parameter priority --parameter-value 100

pcs resource create TestResourceNode2 ocf:pacemaker:Dummy op monitor interval=120s
pcs constraint location TestResourceNode2 prefers node2=50
pcs constraint location TestResourceNode2 prefers nodespare=30
crm_resource --meta --resource TestResourceNode2 --set-parameter priority --parameter-value 50

pcs resource create TestResourceNode3 ocf:pacemaker:Dummy op monitor interval=120s
pcs constraint location TestResourceNode3 prefers node3=10
pcs constraint location TestResourceNode3 prefers nodespare=3
crm_resource --meta --resource TestResourceNode3 --set-parameter priority --parameter-value 10

pcs node utilization node1 cpu=1 memory=1000
pcs node utilization node2 cpu=1 memory=1000
pcs node utilization node3 cpu=1 memory=1000
pcs node utilization nodespare cpu=1 memory=1000

pcs resource utilization TestResourceNode1 cpu=1 memory=1000
pcs resource utilization TestResourceNode2 cpu=1 memory=1000
pcs resource utilization TestResourceNode3 cpu=1 memory=1000
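
To sanity-check the workaround, you can simulate the loss of a main node and confirm that only its resource moves to the spare (a rough sketch; on older pcs versions the equivalent commands are pcs cluster standby / pcs cluster unstandby):

# Check initial placement
pcs status resources

# Put node1 in standby; TestResourceNode1 should move to nodespare,
# and the utilization limits should keep TestResourceNode2/3 where they are
pcs node standby node1
pcs status resources

# Bring node1 back
pcs node unstandby node1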