I have run a fairly busy site for 2 years with this setup for the web server:

  • ec2/Amazon Linux AMI 2014.03.2 on a micro instance
  • lighttpd (latest)
  • godaddy dns

This morning, the site was down - DNS failure. I was able to access the site via the IP address and the ec2 public address, but not the domain name. I've had DNS outages from godaddy once or twice before, but this one was weird:

  • pingdom picked up on dns problems an hour earlier, but thought we were up.
  • downforeveryoneorjustme.com thought we were up
  • browserstack thought we were down.

I guess that's not too weird: I understand DNS problems aren't always global. Here's the weird part: restarting lighttpd fixed the problem.

I want to take action to help remedy the situation, but I'm so unclear about this problem, I don't even know next steps, or what to search for. How could a server reboot affect a DNS problem?

Update: The timing between the reboot and the problem being fixed was so perfect that I didn't anticipate people assuming coincidence. In fact, the problem returned roughly an hour later, and I fixed it again with a lighttpd bounce. I know this sounds crazy!!

  • You restarted lighttpd and the problems ended at the same time. That does not necessarily mean that one caused the other! Especially since there is no relation between them. – Michael Hampton Aug 30 '14 at 14:27
  • Updated: Not anticipating your comment, I neglected to mention I solved this problem twice with a reboot. – Baconserker Aug 30 '14 at 19:03

2 Answers

To me it sounds like a problem at the DNS provider. Maybe some router or DNS server held a wrong DNS entry, and it was updated at the same moment you restarted lighttpd... Or you misinterpreted the error message, but I don't think so ;)

Falcon Momot
sedrubal

DNS problems are hard to figure out because of caching. If the corresponding RR (resource record, see RFC 1035 section 3.2) was cached somewhere, it remains valid until it expires, even if the authoritative server for this RR fails.
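
To make the caching point concrete, here is a minimal sketch of why two observers can disagree during the same outage. It is a toy model, not how any particular resolver is implemented: `CachedRecord`, `lookup`, the TTL values, and the address `203.0.113.10` (a documentation-only address) are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class CachedRecord:
    name: str
    address: str
    ttl: int          # seconds the answer may be reused
    cached_at: float  # time at which the resolver stored the answer

    def is_valid(self, now: float) -> bool:
        # A cached RR may be served until cached_at + ttl.
        return now < self.cached_at + self.ttl


def lookup(cache, name, now, authoritative):
    """Return an address, or None if resolution fails.

    `authoritative` is a dict of name -> (address, ttl), or None to
    simulate an authoritative server that is not answering.
    """
    record = cache.get(name)
    if record is not None and record.is_valid(now):
        return record.address  # served from cache, upstream not consulted
    if authoritative is None:
        return None            # cache expired or empty, and upstream is down
    address, ttl = authoritative[name]
    cache[name] = CachedRecord(name, address, ttl, now)
    return address


# Two hypothetical resolvers that cached the same record at different times.
resolver_a = {"example.com": CachedRecord("example.com", "203.0.113.10", 3600, cached_at=0)}
resolver_b = {"example.com": CachedRecord("example.com", "203.0.113.10", 3600, cached_at=-3000)}

# The authoritative server is down at t=1000.
print(lookup(resolver_a, "example.com", now=1000, authoritative=None))  # 203.0.113.10
print(lookup(resolver_b, "example.com", now=1000, authoritative=None))  # None
```

In this model, a monitoring service behind something like `resolver_a` would still report the site as up, while one behind `resolver_b` would report it as down, which matches the mixed reports in the question.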

The restart of the web server first made me think of a virtual host problem, but you can access the website by using the IP.

The next steps for diagnosis should use DNS-specific tools (e.g. dig with the +trace option) and the usual routing diagnosis tools (ping, traceroute).
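
As a rough sketch of that first diagnostic step, the script below checks what the local resolver returns for a name, compares it with the address you expect, and optionally captures the output of `dig +trace`. The helper names (`resolve`, `diagnose`, `dig_trace`) and the wording of the messages are my own; it assumes `dig` is installed for the last function and only looks at IPv4 answers.

```python
import shutil
import socket
import subprocess


def resolve(hostname):
    """Return the sorted IPv4 addresses the local resolver gives, or [] on failure."""
    try:
        infos = socket.getaddrinfo(
            hostname, 80, family=socket.AF_INET, type=socket.SOCK_STREAM
        )
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


def diagnose(resolved, expected_ip):
    """Turn a resolution result into a hint about where to look next."""
    if not resolved:
        return "resolution failed: suspect DNS (resolver, cache, or authoritative server)"
    if expected_ip not in resolved:
        return "resolved to an unexpected address: suspect a stale or wrong record"
    return "DNS answer looks right: look at the web server or the network path"


def dig_trace(hostname):
    """Run `dig +trace` and return its output, or None if dig is not installed."""
    if shutil.which("dig") is None:
        return None
    result = subprocess.run(
        ["dig", "+trace", hostname],
        capture_output=True,
        text=True,
        timeout=60,
    )
    return result.stdout
```

Calling `diagnose(resolve("your-domain.example"), "your.server.ip")` from several networks while the problem is happening (with both placeholders replaced by your own values) would show whether the failure is really in name resolution. The output of `dig_trace` shows which server in the delegation chain stops answering.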

Manu H