
Does anyone know of a service that will run a DNS heartbeat check, then turn on an EC2 machine and forward all requests to that machine if the original (non-EC2) hosted solution goes down?

If not, is there a tool I could run on the EC2 install to do something similar?

Thanks.

Shane
  • Is this for your business use internally or for customers on the Internet? Because if you're looking for "server A is dead, move entirely to server B" I don't know how this would work, since it could move to server B if *your* connection dies, not your client's, and you could end up with split brain. If the app is designed to support it you probably would just want to use a form of load balancing or geographic balancing. If it's internal that's a different story. Automating this could be asking for headaches. – Bart Silverstrim Aug 05 '10 at 11:47
  • Plus I think Amazon does do something like geographic balancing, but I'm not sure. You might want to research how reliable their service is before overcompensating for availability of your hosted server, then instead make sure you have a good recovery plan (backups, restoration) if something happens instead of maintaining a heartbeat/fencing solution, depending on how mission critical this application is for you and your business. – Bart Silverstrim Aug 05 '10 at 11:49
  • You might end up introducing a different point of failure if you accidentally automate a switchover to another server when there's not really an availability issue, or you'll have some customers who can't access the service because your side went down when the EC2 side is working fine. Defeats the purpose :-) – Bart Silverstrim Aug 05 '10 at 11:50
  • So what would the point of a DNS heartbeat check be in general, if I can't rely on the heartbeat check to represent the client's connection? Apologies for the ignorance, but if not a heartbeat check, how do we reliably detect that a site is down and take action? Load balancing is a more costly/time-consuming solution (we would need to sync data in real time) which isn't really needed; they just need to be up. – Shane Aug 05 '10 at 12:05
  • You have to consider what it means for the site to be down; is it your connection to it, or is it the actual system that's dead? What happens if you automatically cut over to another server if it was actually a glitch in your router or upstream provider? And it depends on the application. What are the ramifications of having a split brain event? Can you resync data properly? Is this overmanagement for your needs to have this in place instead of a good backup to bring it back up in a timely manner? – Bart Silverstrim Aug 05 '10 at 12:24
  • If the app is for internal use and you're just hosting it out there, then it's a little different than if you have users in PA able to reach your application but not users in Nevada, and your system cuts over because of some transient glitch and in the process leaves two now un-synced servers running independently. And if you're doing it with DNS, you're going to have an update delay as DNS propagates, by which time you might have it already fixed (and then you create two waves of potential outages and/or connection glitches). – Bart Silverstrim Aug 05 '10 at 12:27
  • Someone else out there might have better insight on managing something like this, but it seems that having multiple geographically spread servers running with balancing would be more reliable to me. But that depends on whether the complicated setup is worth the hassle, how business critical it is, and your available resources. – Bart Silverstrim Aug 05 '10 at 12:29
  • Cool, I understand that DNS might be an issue when a specific location goes down and another doesn't. But if the DNS failover solution verified that the site was down from 2+ locations (http://www.dnsmadeeasy.com/enterprisedns/dnsfailover.html), would this be less of an issue? Could you not assume that the failover server was then more available than the first? Or even get an outage count from several locations and fail over to whichever has fewer outages. Two servers unsyncing isn't actually a big deal. – Shane Aug 05 '10 at 12:57
  • Could you not just create a load-balanced solution within AWS? – daemonofchaos Sep 30 '11 at 17:30

1 Answer


This is really more of a programming question than a server admin question. You have an idea of the sort of failover you want to do, but it will most likely require a fair bit of custom coding to achieve your end result.

Short answer: you can usually get anything to do anything you want; it just depends on how much time you're willing to invest in it. For this, you would have to roll your own.
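
As a rough idea of what "roll your own" might look like, here is a minimal Python sketch of the decision logic only. It assumes you can collect an up/down result from several independent locations (the "2+ locations" idea from the comments) and that a failover should fire only after several bad rounds in a row, so a transient glitch does not trigger it. All names here (`site_is_up`, `should_fail_over`, `FailoverMonitor`, `on_fail_over`) are made up for illustration. Starting the EC2 machine and repointing DNS are deliberately left to a callback you would have to write yourself.

```python
import http.client
import urllib.request
from typing import Callable, Iterable


def site_is_up(url: str, timeout: float = 5.0) -> bool:
    """Probe `url` once from this machine; True on a 2xx/3xx response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def should_fail_over(probe_results: Iterable[bool], min_down: int = 2) -> bool:
    """True when at least `min_down` independent probes report the site down."""
    down = sum(1 for up in probe_results if not up)
    return down >= min_down


class FailoverMonitor:
    """Fires `on_fail_over` once, after `required_rounds` consecutive bad rounds."""

    def __init__(self, on_fail_over: Callable[[], None],
                 min_down: int = 2, required_rounds: int = 3) -> None:
        self.on_fail_over = on_fail_over  # hypothetical hook: start standby, update DNS
        self.min_down = min_down
        self.required_rounds = required_rounds
        self.bad_rounds = 0
        self.failed_over = False

    def record_round(self, probe_results: Iterable[bool]) -> bool:
        """Record one round of probes; return True if this round triggered failover."""
        if should_fail_over(probe_results, self.min_down):
            self.bad_rounds += 1
        else:
            self.bad_rounds = 0  # any healthy round resets the streak
        if self.bad_rounds >= self.required_rounds and not self.failed_over:
            self.failed_over = True
            self.on_fail_over()
            return True
        return False
```

Each probing location would call something like `site_is_up("http://example.com/")` (placeholder URL) and report the result to wherever `FailoverMonitor.record_round` runs. Note that this does nothing about the split-brain and DNS propagation concerns raised in the comments; it only reduces the chance of cutting over on a single bad reading.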

Aaron