I'm trying to find out whether there is any way to have more than one default gateway (in a fail-over strategy) on a single subnet. If one gateway goes down, I would like the servers to fail over to a backup default gateway. I've looked into using CARP, but it isn't compatible with cloud environments.
What do you mean by "isn't compatible with cloud environments"? There are plenty of redundancy and/or load sharing mechanisms (HSRP, VRRP, GLBP...), but if CARP won't work, then those might not either. Can you clarify what your needs are? – Shane Madden Sep 12 '12 at 02:26
I work in an environment where compute instances are spun up and spun down quite frequently. Instances may sometimes disappear as well. If a gateway instance dies I require servers inside the private cloud to be able to access the internet. – Justin Sep 12 '12 at 08:49
2 Answers
Gateway redundancy is usually done by a backup gateway doing an IP takeover when the first gateway goes down. This usually involves actively monitoring the gateway to ensure it is up. You will need redundancy on the other side of the gateway as well, or the far side of the gateway will remain a single point of failure.
Redundancy is still vulnerable to single points of failure if there is anything shared between the routes. This can be a cable, fiber optic bundle, power source, or something else. I've lost connectivity on redundant links when a fiber optic cable was cut miles from our site. It was cut a few feet from where the fiber routes split.
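As a rough illustration of the monitoring half of that approach, here is a minimal Python sketch of the decision a backup gateway might make. The function name `should_take_over` and the `max_failures` threshold are hypothetical, not part of any particular tool. The probe itself (a ping, a TCP connect, or similar) and the takeover action are deliberately left out, since they depend on the platform.

```python
from typing import Iterable


def should_take_over(probe_results: Iterable[bool], max_failures: int = 3) -> bool:
    """Return True once `max_failures` consecutive probes have failed.

    Each item in `probe_results` is the outcome of one health check against
    the primary gateway (True = reachable). Requiring several failures in a
    row avoids a takeover caused by a single dropped probe.
    """
    consecutive = 0
    for reachable in probe_results:
        # Any successful probe resets the failure streak.
        consecutive = 0 if reachable else consecutive + 1
        if consecutive >= max_failures:
            return True
    return False


# One missed probe followed by a recovery does not trigger a takeover.
print(should_take_over([True, False, True, True]))
# Three misses in a row do.
print(should_take_over([True, False, False, False]))
```

The threshold is a trade-off: a lower value shortens downtime but makes a false takeover more likely when the network is merely lossy.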
Given the reliability of current gateway equipment, I would rely on a single gateway in most cases.

The current gateway equipment is a linux instance in a private cloud which, unfortunately, is far from reliable. Instances can often die in cloud environments. – Justin Sep 12 '12 at 16:45
For a case like that, I would configure a second copy of the instance and configure it to do an IP takeover when the first dies. You may want to have the primary and secondary generate a notification if the other instance dies or disappears. – BillThor Sep 13 '12 at 01:49
All of the protocols mentioned above (HSRP, VRRP, CARP, GLBP) should fully support your needs.
They're all based on the concept of "membership in a group," which is normally managed via multicast advertisements. Generally, nodes can join and leave the group that provides the redundant IP address on the fly, and service may be provided by one member on behalf of the group or by dozens of members. The loss or disappearance of a node triggers a takeover of the shared address by the remaining nodes in the group.
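To make the group idea concrete, here is a small Python sketch of a VRRP-style election: among the nodes still advertising, the one with the highest priority owns the shared address, with the higher IP address breaking ties. The `Node` type, the `elect_master` function, and the addresses are made up for illustration. Real implementations also handle advertisement timers and preemption, and CARP ranks nodes by advertisement interval instead of a priority field.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address
from typing import Iterable, Optional


@dataclass(frozen=True)
class Node:
    name: str             # hypothetical label for the gateway instance
    priority: int         # higher priority wins the election
    address: IPv4Address  # the node's own address, used as a tie-breaker


def elect_master(alive: Iterable[Node]) -> Optional[Node]:
    """Pick the node that should hold the shared gateway address.

    `alive` is the set of nodes whose advertisements are still being heard.
    Returns None if no node is left in the group.
    """
    candidates = list(alive)
    if not candidates:
        return None
    return max(candidates, key=lambda n: (n.priority, n.address))


group = [
    Node("gw-a", 200, IPv4Address("10.0.0.11")),
    Node("gw-b", 100, IPv4Address("10.0.0.12")),
    Node("gw-c", 100, IPv4Address("10.0.0.13")),
]

# With every node advertising, the highest-priority node holds the address.
print(elect_master(group).name)
# If that node disappears, the remaining nodes re-elect among themselves.
print(elect_master(group[1:]).name)
```

Note that the takeover step still requires the winning node to claim the shared address itself, which is the part the platform has to permit.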

Instances do not and cannot control the private IP addresses assigned to them as they're managed by the software that runs the private cloud. Once I've provisioned a new instance with an IP in our subnet I must terminate that instance before that IP address becomes free. I can't reassign an unused IP address to an existing instance either. I apologize for my vague description of why CARP won't work; I was quite tired when I wrote it. – Justin Sep 12 '12 at 22:37
@Justin Would you be able to keep systems off of a pre-assigned gateway address (`x.x.x.1`)? – Shane Madden Sep 12 '12 at 22:53
@Justin How are you supposed to assign a gateway address at all, then? A network router couldn't safely sit on a gateway IP? – Shane Madden Sep 13 '12 at 19:03
I can't reserve an IP address for a gateway; I can request an instance which will be provisioned with an IP from the pool. If a gateway goes down and I spin up a replacement, configuration management will replace the current default route with the new default route. It takes time for this to happen though and I'm trying to minimize downtime. – Justin Sep 17 '12 at 23:10
@Justin That means you need a configuration management server (and all of the supporting infrastructure: DNS, version control, etc.) on the same subnet as these systems, and another copy of that infrastructure if you add another subnet? What cloud platform is this? I don't think there's any feasible way for any redundancy or clustering mechanism of any type to function under that structure. The point of those systems is that the application has the logic to manage the IP; if your platform makes it impossible for an application to do that, then you might be out of luck. – Shane Madden Sep 17 '12 at 23:43