I have three nodes and three virtual IP addresses in the same subnet. It doesn't matter which address is assigned to which node. The IPs are public addresses. The nodes are load balancers that distribute web requests to several backend systems.
When all three nodes are online, the three addresses are spread across them so that every node has exactly one address.
If only two of the nodes are online, one node has one address and the other has two.
If only one node is online, it has all three addresses.
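To make the intended outcome concrete, here is a rough sketch in Python. It only models the result I expect and is not how Pacemaker actually decides placement. The function name `distribute` and the round-robin order are my own assumptions for illustration.

```python
def distribute(addresses, online_nodes):
    """Spread addresses round-robin over the online nodes (illustrative only)."""
    placement = {node: [] for node in online_nodes}
    if not online_nodes:
        # No node online: nothing can hold an address.
        return placement
    for i, addr in enumerate(addresses):
        node = online_nodes[i % len(online_nodes)]
        placement[node].append(addr)
    return placement


ADDRESSES = ["x.x.x.10", "x.x.x.11", "x.x.x.12"]

# Three, two, and one node online.
for nodes in (["Node1", "Node2", "Node3"], ["Node2", "Node3"], ["Node3"]):
    print(nodes, distribute(ADDRESSES, nodes))
```

Which address ends up on which node does not matter to me, only the count per node.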
Distributing the addresses this way works fine with the following resource configuration:
    primitive IP_10 ocf:heartbeat:IPaddr2 params ip="x.x.x.10" cidr_netmask="24" nic="eth1"
    primitive IP_11 ocf:heartbeat:IPaddr2 params ip="x.x.x.11" cidr_netmask="24" nic="eth1"
    primitive IP_12 ocf:heartbeat:IPaddr2 params ip="x.x.x.12" cidr_netmask="24" nic="eth1"
The catch is that there is one default gateway (x.x.x.1), which has to be set up on every node once one of the IP addresses has been assigned to it. Obviously, the default gateway cannot be set up before the address is assigned.
The first thing I tried was to add a second resource for every address, using ocf:heartbeat:Route:
    primitive default_gw_1 ocf:heartbeat:Route params destination="default" device="eth1" gateway="x.x.x.1"
    primitive default_gw_2 ocf:heartbeat:Route params destination="default" device="eth1" gateway="x.x.x.1"
    primitive default_gw_3 ocf:heartbeat:Route params destination="default" device="eth1" gateway="x.x.x.1"
and then combine these resources into groups:
    group net_10 IP_10 default_gw_1
    group net_11 IP_11 default_gw_2
    group net_12 IP_12 default_gw_3
That works so far: the default gateway is set correctly after the address has been assigned, and failover still works as expected. This example shows a possible distribution of resources after Node1 went offline:
    Node1: offline
    Node2: net_10, net_11
    Node3: net_12
The problems start when Node1 comes back online. One of the groups on Node2, say net_10, is then migrated to Node1. To do that, the ocf:heartbeat:Route resource in that group is stopped on Node2, and the resource agent deletes the default route. Node2 is left without a default gateway, so the address remaining on it (net_11) is no longer reachable.
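The sequence can be reproduced with a small toy model in Python. This is only a sketch of the behaviour I observe, not code from Pacemaker or the resource agents. The class `Node` and its methods are made-up names, and I am assuming that stopping a Route resource removes the node's single default route regardless of which group started it.

```python
class Node:
    """Toy model of one load balancer node (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        self.addresses = set()
        self.has_default_route = False

    def start_group(self, ip):
        # Group start order: IPaddr2 first, then Route.
        self.addresses.add(ip)
        self.has_default_route = True

    def stop_group(self, ip):
        # Group stop order is reversed: Route first, then IPaddr2.
        # Assumption: the route is deleted even if other groups still need it.
        self.has_default_route = False
        self.addresses.discard(ip)

    def reachable_addresses(self):
        # Without a default route, none of the node's addresses can answer.
        return set(self.addresses) if self.has_default_route else set()


node1, node2 = Node("Node1"), Node("Node2")

# Node1 is offline, so Node2 runs net_10 and net_11.
node2.start_group("x.x.x.10")
node2.start_group("x.x.x.11")

# Node1 comes back and net_10 is migrated to it.
node2.stop_group("x.x.x.10")
node1.start_group("x.x.x.10")

print(node2.addresses)              # Node2 still holds x.x.x.11
print(node2.reachable_addresses())  # but nothing on Node2 is reachable
```

In this model the last start and the first stop both flip the same flag, which is exactly the mismatch: the route is shared by all groups on the node, but each group treats it as its own.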
I then tried patching the ocf:heartbeat:Route resource agent so that it no longer deletes the route on stop. That seems to work, but it feels very ugly.
I assume there must be a better solution for this. How can I configure things so that the default gateway remains set on a node as long as at least one IP address is assigned to that node?
(Pacemaker 1.1.7 on Debian Wheezy)