4

I need a server / switch / something (which I'll refer to as a "failover switch") to do the following task.

  1. There are two HTTP servers behind the failover switch.
    • Normally, all requests are routed to server 1.
    • If server 1 dies (stops responding on HTTP requests in a timely fashion), all requests are routed to server 2.
    • When server 1 is restored (starts responding on HTTP requests in a timely fashion), all requests are routed back to server 1.
    • The failover switch has one external IP address (which the requests initially come to).
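
The routing rules in the list above can be sketched in a few lines of Python. This only illustrates the behaviour being asked for, not any product's implementation; `pick_backend`, the backend names, and the `is_healthy` callback are made-up names for this sketch, and the health table stands in for real HTTP checks.

```python
from typing import Callable, Optional, Sequence


def pick_backend(
    backends: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first healthy backend in priority order, or None if all are down."""
    for backend in backends:
        if is_healthy(backend):
            return backend
    return None


# Hypothetical health table standing in for real HTTP checks.
status = {"server1": True, "server2": True}
backends = ["server1", "server2"]  # priority order: server 1 is preferred

normal = pick_backend(backends, lambda name: status[name])       # both servers up
status["server1"] = False
failed_over = pick_backend(backends, lambda name: status[name])  # server 1 down
status["server1"] = True
failed_back = pick_backend(backends, lambda name: status[name])  # server 1 restored
```

Because the list is re-evaluated in priority order on every call, failing back to server 1 needs no extra logic: as soon as it is healthy again, it wins.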

This thing should be as dumb and simple as possible.

What is this properly called, and what would be an example of such a thing?

Update:

Ideally, this should not have any moving parts (such as a hard disk or a fan), so a dedicated hardware solution would be preferred.

Update 2:

This thing should be a switch, not a balancer. It should switch based on HTTP server availability, not ARP or ICMP.

Quassnoi

7 Answers

11

What you want is generally called a load balancer. Load balancers come either as an appliance (F5 BIG-IP boxes, for instance) or as a load-balancing application running on commodity hardware. One example of the latter is HAProxy, which, coincidentally, currently does the load balancing for all of the Stack Exchange sites.

EEAA
  • You can do this type of thing within software as well (though I much prefer a hardware solution as above) – AliGibbs Nov 09 '10 at 17:20
  • Yes, I mentioned a software solution in my answer. – EEAA Nov 09 '10 at 17:28
  • +1 Just wanted to add a note that for the relatively simple use case listed in the OP, there are a lot of cheaper hardware solutions than an F5, such as a Webmux or similar, as long as it meets the OP's needs. – phoebus Nov 09 '10 at 17:40
  • @phoebus - agreed. The F5 was just the first one that came to mind. – EEAA Nov 09 '10 at 17:41
1

If you don't want load balancing, all you need is a simple failover clustering solution that sends heartbeats between the cluster members and assigns ownership of the virtual IP that you'll be using to access the service. There are dozens of them out there.

On Windows, the de facto answer is Microsoft Clustering Services (MSCS), which is included with Enterprise and Datacenter editions. On Linux/Unix, you probably want to be looking into Heartbeat (simple, but not super-configurable), Pacemaker (robust, but a bit more complex), or another similar system.
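
To make the heartbeat-and-virtual-IP idea above concrete, here is a rough Python sketch of the ownership decision only. It is not how MSCS, Heartbeat, or Pacemaker are implemented; `vip_owner`, the node names, and the 3-second `dead_after` threshold are assumptions made for illustration. It also ignores split brain, which real cluster software handles with fencing.

```python
from typing import Mapping, Optional, Sequence


def vip_owner(
    priority: Sequence[str],
    last_heartbeat: Mapping[str, float],
    now: float,
    dead_after: float = 3.0,  # assumed threshold, in seconds
) -> Optional[str]:
    """Return the highest-priority node whose last heartbeat is recent enough."""
    for node in priority:
        seen = last_heartbeat.get(node)
        if seen is not None and now - seen <= dead_after:
            return node
    return None  # no live node: nobody should hold the virtual IP


nodes = ["node1", "node2"]  # hypothetical cluster members, preferred first

# Both nodes sent a heartbeat within the last 3 seconds: node1 keeps the IP.
owner_normal = vip_owner(nodes, {"node1": 10.0, "node2": 10.5}, now=11.0)

# node1 has been silent for 4 seconds: node2 takes over the IP.
owner_failover = vip_owner(nodes, {"node1": 10.0, "node2": 13.5}, now=14.0)
```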

jgoldschrafe
1

There are some open-source solutions that will help here.

The first one I would check out is pfSense.

pfSense will run on virtually any hardware, as well as in a VPS. The system allows for load balancing and clustering, as well as a variety of other services.

pfSense uses CARP (Common Address Redundancy Protocol) for load balancing and failover, and it is very easy to set up.

Glenn Kelley
0

Another way to accomplish this is with Alcatel-Lucent OmniSwitch switches, which have a feature called Server Load Balancing. It will do exactly what you are looking for, but at the MAC address layer instead of the IP layer.

Nick Zepp
  • Does it understand that the server stopped responding to `HTTP` requests (if it still responds to `ARP` or `ICMP`)? – Quassnoi Nov 09 '10 at 16:37
  • Yeah, you can set the SLB up to query for services (HTTP, HTTPS, FTP, DNS, or SMTP by default) and then fail over based on service availability. It can also be triggered by the physical link state on an interface of the switch or by an ICMP response code. – Nick Zepp Nov 09 '10 at 19:26
0

What you're looking at doing isn't necessarily simple to implement and takes some monitoring/maintenance.

I don't know of anything that's not a balancer as a hardware solution. You'll probably need to look at clustering, which means running a form of fencing and heartbeat so that the servers monitor each other.

If possible you'd probably want to use shared storage with web server front ends.

That said, if you're using Linux, you'll probably want to google "high availability Linux", which should point you to resources on configuring heartbeat software. It monitors services on the servers, and if one server goes down, it brings up the other to take over. Once the primary server comes back up, you can transition service requests back to it.

The reason I said this isn't simple is because you may have a split brain situation come up. Make sure you google that and plan accordingly.

You'll also want the servers to talk to each other over a dedicated high-speed connection, and you'll have to maintain shared storage of some kind so you don't need to sync files.

So if you don't want to take the balancer route, you'll have to dedicate some time and effort to running a cluster. I don't know much about Windows clustering, but if that's what you're running, you'll have some clustering options built into the higher-end versions of Windows; I think it's in the Enterprise and Datacenter editions. You'll have to browse Microsoft's marketing drones' site for more information on that.

Good luck!

Bart Silverstrim
0

I use pound (http://www.apsis.ch/pound) on Linux for this. You can easily install it into a small appliance-style chassis. From their website:

WHAT POUND IS:

  1. a reverse-proxy: it passes requests from client browsers to one or more back-end servers. [...]
  2. a fail over-server: should a back-end server fail, Pound will take note of the fact and stop passing requests to it until it recovers.

Its benefits include truly tiny, concise configuration files and small code and system footprints. You can have it up, running and verified in under 10 minutes.
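
For readers wondering what "take note of the fact" might look like at the HTTP level (as opposed to ARP or ICMP), here is a minimal Python sketch of an HTTP liveness probe with a timeout. It is not Pound's code or configuration; `http_alive`, the 2-second default timeout, and the rule "any 2xx/3xx answer counts as alive" are assumptions chosen for illustration.

```python
import http.client
import urllib.error
import urllib.request


def http_alive(url: str, timeout: float = 2.0) -> bool:
    """Return True if `url` answers with a 2xx/3xx status within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, DNS failure, 4xx/5xx (HTTPError is a
        # URLError subclass), or a malformed reply: treat all as "not alive".
        return False
```

A failover loop would call something like `http_alive("http://server1.example/")` (a placeholder URL) every few seconds and route to server 2 while it returns `False`. A host that still answers ping but whose web server hangs fails this check, which is the behaviour the question asks for.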

Ttamsen
0

Typically, that sort of thing is done with a load balancer in a larger environment, as the LB appliances tend towards being relatively stable and not prone to falling over. For the specific scenario you're looking at, you'd simply set one as "100%" and the other as "0%" of the load, with the second taking over if the first one goes away.
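
As a rough illustration of the "100% / 0%" arrangement, the Python sketch below picks among healthy servers by weight and falls back to a zero-weight server only when no weighted server is healthy. Real load balancers differ in how they treat a weight of 0 (some need an explicit "backup" or "standby" flag instead), so take this as one possible interpretation; `choose_server` and the server names are hypothetical.

```python
import random
from typing import Mapping, Optional, Set


def choose_server(weights: Mapping[str, int], healthy: Set[str]) -> Optional[str]:
    """Pick a healthy server by weight; use zero-weight servers only as a fallback."""
    active = {name: w for name, w in weights.items() if w > 0 and name in healthy}
    if active:
        names = list(active)
        return random.choices(names, weights=[active[n] for n in names])[0]
    # No weighted server is healthy: fall back to the first healthy standby.
    for name in weights:
        if name in healthy:
            return name
    return None


weights = {"server1": 100, "server2": 0}  # all traffic to server1 while it is up

primary_up = choose_server(weights, {"server1", "server2"})
primary_down = choose_server(weights, {"server2"})
```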

If you're looking at a software-based solution, it depends on what OS you're using. Back in the day, I built a solution around Ultramonkey and Zebra (now part of the HA-Linux suite, I believe), but I was doing failover between London and Palo Alto, so I needed a routing component as well.

Vatine