3

I have a website that we load-balance across a few machines. The load balancer (a Brocade ServerIron ADX) is on the local network. I know it can be configured with a "backup" IP address to use as the "real" server, but that address would need to be on a local network. How can I provide a "fail whale" and a status update message to my users when my entire environment may be offline due to a firewall, load balancer, or multiple-server failure? Since DNS can't provide a weighted backup for A records (the way SRV records can), what are my options?

Ideally, the solution should step in automatically as soon as my site becomes unreachable and inform my users that our response teams are working on the problem.

Eric Falsken

4 Answers

7

Twitter's Fail Whale is more complicated than it seems. The Twitter application stacks (plural: there are many layers in the Twitter infrastructure) are known to be quite complicated. Twitter has thousands of machines, years' worth of code written in various languages, glue in a dozen different flavors, and hundreds (if not thousands) of places where the application can break. Your requirements (two servers and a failure page) are much simpler.

I just happen to be looking at a similar feature: I want to use a backup web server if all of the primary web servers fail. Note that this only helps in some failure scenarios, and won't help if the public network path to the load balancer has problems.

The ServerIron ADX Server Load Balancing Guide, Chapter 2, section "Primary and backup servers", says:

Primary and backup servers

The real server is either a primary server or a backup server based on how you added it:

• A primary server is used by the ServerIron ADX when load balancing client requests for an application. It is a locally attached server added using the server real-name-or-ip command or Web equivalent.

• A backup server is used by the ServerIron ADX only if all the primary servers are unavailable for the requested application. It is a remotely attached server added using the server remote-name command or Web equivalent.
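
The selection rule quoted above can be sketched in a few lines of Python. This is only an illustration of the "backups are used only when every primary is down" logic, not the ServerIron's actual implementation; the `RealServer` class, the `eligible_servers` function, and the server names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RealServer:
    """Hypothetical model of a real server behind the load balancer."""
    name: str
    healthy: bool
    backup: bool = False  # True for a remotely attached backup server


def eligible_servers(servers):
    """Return healthy primaries; use healthy backups only if no primary is up."""
    primaries = [s for s in servers if not s.backup and s.healthy]
    if primaries:
        return primaries
    # Every primary is down, so the backup (e.g. a "sorry" page host) takes over.
    return [s for s in servers if s.backup and s.healthy]
```

With two primaries and one backup, traffic reaches the backup only after both primaries fail their health checks.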

Stefan Lasiewski
  • Will that work if the backup server is in the cloud or on a different network? We only have 2 servers, and I need a solution for when both of them are not working properly. – Eric Falsken Mar 03 '13 at 02:59
  • I believe so, and our network admin believes it will work, but we haven't tried it yet and don't know for sure. Best thing is to try and let us know. You should check with Brocade Tech Support. – Stefan Lasiewski Mar 03 '13 at 04:32
6

You get a web designer to produce a document which looks like you want it to look.

Then you configure the load balancer to serve this document for HTTP 500, 502, 503 and 504 errors. Exactly how you do that varies; check your documentation.

Some load balancers are just "dumb" TCP balancers which forward TCP connections and do nothing at layer 7. Others can act as full reverse proxies (e.g. nginx) and these are capable of doing what you need.

From a quick scan of the Brocade ServerIron ADX documentation, it doesn't seem capable of serving error documents on HTTP requests. While it's definitely smarter than your typical "dumb" TCP forwarder, it's probably not going to do what you want here.

A reverse proxy such as nginx would be capable of this, though if you set that up, you may as well just replace the load balancer (since nginx can do HTTP/HTTPS load balancing as well).
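
To make the idea concrete, here is a minimal Python sketch of what a layer-7 proxy does when it substitutes an error document. It is not nginx configuration and not anything the ServerIron is known to support; `proxy_once`, `sorry_response`, `SORRY_PAGE`, and the `fetch` callable are hypothetical names, and the 120-second `Retry-After` value is an arbitrary assumption.

```python
# Static page a designer would produce; the content here is a placeholder.
SORRY_PAGE = b"<html><body><h1>We'll be right back</h1></body></html>"

# Backend statuses that get replaced by the static page.
SUBSTITUTED_STATUSES = {500, 502, 503, 504}


def sorry_response():
    """Build the replacement response as (status, headers, body)."""
    headers = {
        "Content-Type": "text/html; charset=utf-8",
        "Retry-After": "120",         # assumption: ask clients to retry in 2 minutes
        "Cache-Control": "no-store",  # keep caches from storing the outage page
    }
    return 503, headers, SORRY_PAGE


def proxy_once(fetch, path):
    """Forward one request through `fetch`, swapping in the sorry page on failure.

    `fetch(path)` is a hypothetical callable that returns (status, headers, body)
    from the backend, or raises OSError if the backend cannot be reached.
    """
    try:
        status, headers, body = fetch(path)
    except OSError:
        # Connection refused, timeout, and similar network errors.
        return sorry_response()
    if status in SUBSTITUTED_STATUSES:
        return sorry_response()
    return status, headers, body
```

Note that this only works while the proxy itself is reachable, which is the same limitation any on-site load balancer has.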

Michael Hampton
  • Can load balancers serve pages/documents in this manner? Doesn't it require an HTTP server somewhere to service the request? – Eric Falsken Mar 01 '13 at 23:27
  • Since you didn't specify what you were using, it's impossible to be specific. Check your documentation. – Michael Hampton Mar 01 '13 at 23:29
  • It's a Brocade ServerIron ADX. (added to the OP) – Eric Falsken Mar 01 '13 at 23:37
  • Keep in mind that your load balancer *is* an HTTP server of sorts. It responds to HTTP requests, and might be able to serve custom pages itself. You might need a separate server (or a website hosted at another provider). But we probably can't help you here with such a general request. You will need to read the documentation (hint: you are probably looking for the word 'failover' or 'backup'). – Stefan Lasiewski Mar 01 '13 at 23:45
3

Twitter's 'fail whale' does not indicate any of these catastrophic failures, and a fallback that covers them would not be trivial to build. The best I can think of is a low DNS TTL for your domain, an extra internet connection with separate IP space that serves just the fail whale, and some kind of monitoring tool that updates your A records upon failure.

Now unless you plan to have catastrophic failures very frequently, this would be overkill. And if you do plan to have them frequently, you're probably doing it wrong :)
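
For illustration, the monitoring piece could look roughly like the Python sketch below. It would have to run somewhere outside the affected environment. Everything here is an assumption: `site_is_up`, `FailoverMonitor`, the threshold of three consecutive failures, and above all the `update_dns` callable, which stands in for whatever API your DNS provider actually offers.

```python
import urllib.error
import urllib.request


def site_is_up(url, timeout=5.0):
    """Return True if the URL answers with a non-5xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 500
    except urllib.error.HTTPError as exc:
        # urlopen raises HTTPError for 4xx/5xx; only 5xx counts as "down" here.
        return exc.code < 500
    except (urllib.error.URLError, OSError):
        # DNS failure, connection refused, timeout, and similar.
        return False


class FailoverMonitor:
    """Switch the A record to the fail-whale IP after repeated failed checks."""

    def __init__(self, primary_ip, sorry_ip, update_dns, threshold=3):
        self.primary_ip = primary_ip
        self.sorry_ip = sorry_ip
        self.update_dns = update_dns  # hypothetical hook into your DNS provider
        self.threshold = threshold
        self.failures = 0
        self.active_ip = primary_ip

    def record(self, is_up):
        """Feed in one health-check result; return the IP the A record should hold."""
        if is_up:
            self.failures = 0
            target = self.primary_ip
        else:
            self.failures += 1
            # Wait for several consecutive failures so one blip does not flip DNS.
            target = self.sorry_ip if self.failures >= self.threshold else self.active_ip
        if target != self.active_ip:
            self.update_dns(target)
            self.active_ip = target
        return self.active_ip
```

A loop that calls `monitor.record(site_is_up(url))` every 30 seconds or so would drive it. This sketch fails back on the first successful check, which may be too eager for a flapping site, and clients will only follow the change as fast as the record's TTL allows.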

Dennis Kaarsemaker
0

CDNs such as CloudFront or Akamai can substitute a fail whale for error responses, or you can run a lightweight on-premises proxy layer that does the same thing. The local solution won't help if your network connection dies; only a CDN or a remotely hosted DNS server plus health checker can handle that.

Jason Martin