4

I'm considering options for changing a 'high availability' website that provides a service via an https api.

The current setup is:

  • Two self-contained VMs, from different cloud providers (AWS and RackSpace)
  • One DNS load-balancer: this is where the HA also comes in, the service monitors the two VMs and if one appears to be unavailable it directs all DNS queries to the other

If load-balancing were not a requirement, could we do without the load-balancer by simply co-locating the DNS servers on the two machines, each replying only with its own address when queried? In this scenario, if one VM goes down, that removes both the service and the DNS server that points at it, so no clients will be directed to the server that is down. Is that correct?
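Concretely, what I have in mind is each VM serving an authoritative copy of the zone roughly like the sketch below (the domain, hostnames and addresses are placeholders; the copy served from the second VM would point the service name at its own address instead):

```
$TTL 300
$ORIGIN example.com.
@    IN SOA ns1.example.com. hostmaster.example.com. (
            2014070401 ; serial
            3600       ; refresh
            900        ; retry
            604800     ; expire
            300 )      ; negative-cache TTL

; both machines are listed as NS for the zone
; (glue A records live with the registrar)
@    IN NS  ns1.example.com.
@    IN NS  ns2.example.com.

; this copy, served from VM 1, answers with VM 1's address only
ns1  IN A   192.0.2.10
api  IN A   192.0.2.10
```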


edit for clarity:

we are happy with the less-than-perfect 'HA' we currently have, this question is specifically about whether the changes I'm thinking of will make things worse or not.

  • In theory this could work, but DNS caching would ruin it. – Nathan C Jul 04 '14 at 14:12
  • @Nathan DNS caching is an issue with the current setup, are you saying it would be worse? –  Jul 04 '14 at 14:12
  • What Nathan said; DNS load-balancing is essentially completely unusable on the modern internet. – MadHatter Jul 04 '14 at 14:12
  • @MadHatter is there an obvious alternative that doesn't involve IP sharing between locations (which is out of our budget range)? –  Jul 04 '14 at 14:13
  • I'm not sure, I can't quite tell what this "*DNS load balancer*" is. Are you happy to have all requests go initially to a single machine (or a pair of machines in failover) at a given location, then be fanned out to distributed servers from there? – MadHatter Jul 04 '14 at 14:15
  • @MadHatter that's not the current set-up (which distributes the load) but it is what I mean by "If load-balancing was not a requirement", yes –  Jul 04 '14 at 14:19
  • oh wait, sorry @MadHatter, no there can't be a single point of failure –  Jul 04 '14 at 14:20
  • Then no, there's no way to do this in a geographically-distributed way without spending real money. High availability costs; I used to tell customers that *you get 99% uptime for the cost of the basic server setup, but each successive 9 you want multiplies the budget by between two and ten times*. If you have a real business need for that kind of high availability, you should have the budget to match, because you'll need it. – MadHatter Jul 04 '14 at 14:22
  • @MadHatter I've tried to clarify the question: we are happy with the less-than-perfect nature of DNS load-balancing - the current setup is adequate for the service. But I don't want to make it worse and I'm asking if the change I'm suggesting would do that. –  Jul 04 '14 at 14:26

3 Answers

8

The direct answer to your question is yes, it will make it worse.

This is because one of your name servers not responding will cause resolution delays for the whole duration of the outage, for every client that happens to try the failed name server first, whereas the current technique only fails roughly half the clients, and only until you detect the VM is down plus the TTL.

Name server (NS) records are generally cached for 48 hours, so for the shorter of your downtime or your name-server update time plus 48 hours, your users will have a randomly slow experience.

Your current implementation is better unless your VM-down detection is slow. For the period between the VM going down and your detecting it, plus the TTL, the proposed solution would actually be better, but I am assuming that is so small a period of time as to be ignorable.
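As a rough illustration of the difference, here is a toy simulation (my own sketch, not a measurement: it assumes resolvers pick one of the two listed nameservers uniformly at random and fail over after a fixed timeout, which glosses over real resolver behaviour):

```python
import random

TIMEOUT_MS = 800     # assumed resolver failover timeout (BIND-like)
QUERIES = 100_000    # simulated cold-cache lookups while one VM is down

random.seed(42)

# Proposed setup: each VM is its own nameserver. For the WHOLE outage,
# any resolver that happens to try the dead nameserver first eats a
# timeout before failing over to the surviving one.
slow = sum(random.random() < 0.5 for _ in range(QUERIES))
share_slow = slow / QUERIES
mean_extra_ms = share_slow * TIMEOUT_MS

print(f"proposed setup: {share_slow:.1%} of lookups delayed by {TIMEOUT_MS} ms "
      f"(mean extra latency {mean_extra_ms:.0f} ms) for the entire outage")

# Current setup: the DNS load balancer stops handing out the dead VM's
# A record once the failure is detected, so only lookups made during
# detection time + TTL are affected; after that, none are.
print("current setup: lookups affected only during detection time + TTL")
```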

Ian Macintosh
  • Sorry, Ian, but would you care to expand on some of those assertions? Re para 2, ISPs are well-known to ignore TTLs on A records; I've seen **six week** delays in honouring a new A record, in the worst case. Re para 3, I'm not aware of any reason why an NS record would be magically cached for 48 hours regardless of its TTL. And the OP isn't proposing to change the NS records at any time, so the TTL of same strikes me as irrelevant. – MadHatter Jul 04 '14 at 15:08
  • Sure @MadHatter. The authoritative NS come from the root servers for the domain, and the root servers almost universally cache NS records for 172,800 seconds (48 hours). Which means that when somebody queries from an empty cache situation, they will query X.root-servers.net for example.com. The root servers will point them to X.gtld-servers.net (also TTL 172,800), who in turn will hand out the authoritative nameservers for example.com with TTL 172,800. These are the NS records I was speaking of in para 2, not the A records. The A records' TTL is under the control of Jack Daniels. – Ian Macintosh Jul 04 '14 at 15:25
  • I just did a `dig` on my own domain's nameservers, recursing properly off the root, and got back the TTLs I set myself in my own zonefile (3600s), so I don't think you're right about that. I agree that the A record TTL is under the OP's control, I just don't think most large ISPs pay any attention to what end-users set their TTLs to, any more. – MadHatter Jul 04 '14 at 15:28
  • ^^ ran out of space :-) The delay in updates is often just terrible performance by the registrar ISP who simply don't update the DNS records, and continue to hand out the old data for weeks until they finally commit the changes. Once the changes have been committed, they take max 172800 seconds to appear. Naturally during that time some will already have the new records with the final unfortunate sucker who cached the old data a fraction of a second before they were updated, and then held onto that for 48 hours. – Ian Macintosh Jul 04 '14 at 15:29
  • do this: `dig ns example.com @a.root-servers.net`. You'll get a response (TTL 172800) telling you to go query a.gtld-servers.net. So now `dig ns example.com @a.gtld-servers.net`. Again, they will reply (again TTL 172800) that the NS are a.iana-servers.net & b.iana-servers.net. Hence if you were a program you would cache for 48 hours the data pointing to the authoritative nameservers for example.com. If you were Jack Daniels above, with a failed VM (and NS, assuming he implemented his proposal), approx half the NS queries will take inordinately long because they tried the failed NS first. – Ian Macintosh Jul 04 '14 at 15:34
  • Fair enough, point taken and thank you for that. Nevertheless I repeat my comment above that **the OP's plan doesn't involve changing the NS records at any time**, so the TTL is immaterial. – MadHatter Jul 04 '14 at 15:37
  • As for the client delay, I just sat on one of my nameservers and watched a recursing query go through where one of the resolved party's nameservers didn't respond, and my BIND tried a second one after 800ms; I don't think that's an intolerable client delay. – MadHatter Jul 04 '14 at 15:43
  • http://www.sigcomm.org/node/3259 - Not all DNS resolvers are born equal. Some very sucky/popular systems will actually do much *much* worse than that, and then degrade even further :-) – Ian Macintosh Jul 04 '14 at 16:21
  • Ian, may I congratulate you on doing an excellent and well-referenced job of arguing your corner? I am unpersuaded at this time, but you've given me a lot of food for thought, and I've learnt something, and for both of those, many thanks! I shall go read your other answers, and upvote them if they're as good as this one. – MadHatter Jul 04 '14 at 21:08
  • I've read and re-read this and all the comments, and I think I now understand you. You know that my plan never involves changing the NS records, but you are cautioning that this will mean a "randomly slow experience" for (half of?) client DNS resolution during the downtime of one VM. Your mention of NS updates presumably refers to the theoretical case where we manually update the NS records during extended downtime to mitigate this? During downtime the current setup doesn't affect new connections after the VM is detected as down, ... –  Jul 05 '14 at 06:48
  • ... whereas my proposed setup would mean a negative impact for the whole duration of the downtime. This isn't acceptable, and it's exactly what I wanted to find out in advance by asking here, so thank you very much. Please tell me if I missed the point, though! –  Jul 05 '14 at 06:49
  • I took it that you *were* suggesting changing the NS records in your proposal, where each server would become an NS holding an A record for itself alone. Assuming the server fails totally, then the NS and the https site would go offline simultaneously. Because NS records take 48+ hours to change, it would be sub-optimal until you bring it back up again, or until you change the NS to some new working location. I do understand that you currently update the A records, and I am saying that this current A record method is better than the proposed "each server is its own NS" method. – Ian Macintosh Jul 05 '14 at 22:00
  • I further assume that your current method uses an appropriately short A record TTL so that the records expire quickly after you detect a failed server. If not, doing so will have a very positive effect on its operation. YMMV, but something like a 180 to 300 second TTL would probably work well; e.g., see Google's MX record TTLs, which are typically 300 seconds. – Ian Macintosh Jul 05 '14 at 22:08
4

If I understand you correctly, you propose to have both distributed servers be the listed nameservers for the domain in question, each with an authoritative zonefile that contains a single A record pointing to the local server for the hostname on which your HTTPS service runs.

If that's correct, then yes, I would expect this to work based on the round-robin / stochastic nature of requests to nameservers. If one server goes down, it won't be able to respond to queries with its own address, so clients should fairly quickly fail over to the other for the DNS lookup.

You say you understand and accept that DNS caching means a downed server may break clients who have cached the lookup pointing at that server, possibly for lengthy periods of time because of ISPs that don't honour short TTLs. If that's all so, then I don't see any obvious holes in your proposal.

I just wouldn't do it myself in a month of Sundays.

MadHatter
  • The new scheme breaks because of caching of where the name server is, and the old scheme breaks because of caching of where the web server is. Are these failure modes identical or (for example) are ISPs more likely to ignore TTLs on nameserver records? –  Jul 04 '14 at 14:52
  • I don't see why the new scheme would break that way. DNS is designed to failover quickly to a second nameserver for a zone if the first is unresponsive. – MadHatter Jul 04 '14 at 14:55
2

DNS load balancing used to be the "thing" to use, but with caching in common use today it's not as practical.

In Windows, you have the "Network Load Balancer" which can assist with what you want to do by using multicast over a private network. You can still do this across providers with a VPN between the two.

In Linux, you'd need to use something like haproxy or another package to accomplish load balancing. These bring challenges of their own, though.
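For example, a minimal haproxy configuration might look something like this (illustrative only; the backend addresses, the `/health` check endpoint and the certificate path are placeholders, and you'd still need a redundant way to get traffic to the haproxy box itself):

```
# /etc/haproxy/haproxy.cfg (illustrative fragment)
defaults
    mode    http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend https_in
    bind *:443 ssl crt /etc/haproxy/site.pem   # placeholder cert path
    default_backend api_servers

backend api_servers
    balance roundrobin
    option httpchk GET /health                 # assumed health endpoint
    server vm-aws       192.0.2.10:443 ssl verify none check
    server vm-rackspace 198.51.100.20:443 ssl verify none check
```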

Look at your options and decide which would be most cost effective for your needs. I can't speculate beyond the very popular options.

Nathan C
  • Thanks, I will look into haproxy. Just to note though, we are using DNS load balancing presently and it is fine for our needs - the question is really just about whether I'd be making things worse (in terms of HA, not load-balancing) with the change I'm suggesting –  Jul 04 '14 at 14:21