I have two Windows domain controllers.
10.10.10.10 Primary ( win 2008 r2 )
10.10.10.20 Replica ( win 2012 r2 )
The second one is configured as a replica of the first.
About once per week, the primary DC will negatively cache most .io
domains.
As a result, no one in the company can access sites like:
chef.io
packer.io
yahoo.io
github.io
Strangely, I can still access some .io pages, like the ones at github.io.
The solution is to RDP into the DNS server and run dnscmd /clearcache. That fixes the problem for 7 to 10 days.
Further symptoms
- Only affects the primary domain controller (the secondary and other domain controllers can resolve these sites just fine)
- Google's DNS servers also resolve these sites
- Usually happens at about 11 AM on Wednesdays
I'm not very familiar with Windows, but here are the things I've tried:
- Look at the logs. I only see the following entries that look interesting:
8:15AM
The DNS server wrote version 4638 of zone 254.10.in-addr.arpa to file 254.10.in-addr.arpa.dns.
8:16AM
A more recent version, version 4639 of zone 254.10.in-addr.arpa was found at the DNS server at 10.254.40.51. Zone transfer is in progress.
- Verify there are no forward or reverse lookup zones for the .io domain
- Ensure there is nothing in the hosts file blocking the .io domain
- Compare the output of ipconfig /displaydns on all domain controllers
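One more check that might help narrow this down is to query each DNS server directly for the same names and compare the response codes, since a negatively cached name should come back as NXDOMAIN from the affected server only. Below is a minimal sketch in Python using only the standard library. The function names are hypothetical, the two 10.10.10.x addresses are the controllers listed above, and 8.8.8.8 is assumed as the Google resolver to compare against.

```python
import socket
import struct

# Response codes from RFC 1035; NXDOMAIN is what a negatively cached name returns.
RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query packet for an A record with recursion desired."""
    # Header: id, flags (RD bit set), 1 question, 0 answer/authority/additional
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The question name is a run of length-prefixed labels ending in a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN)
    return header + qname + struct.pack("!HH", 1, 1)


def parse_rcode(response):
    """Return the response code name held in the low 4 bits of the flags field."""
    flags = struct.unpack("!H", response[2:4])[0]
    rcode = flags & 0x000F
    return RCODES.get(rcode, "RCODE%d" % rcode)


def query_server(server, name, timeout=3.0):
    """Send one UDP query to the given server and return the response code name."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        response, _ = sock.recvfrom(512)
    return parse_rcode(response)


def compare_servers(servers, names, query=query_server):
    """Map each name to {server: rcode} so a disagreeing server stands out."""
    results = {}
    for name in names:
        results[name] = {}
        for server in servers:
            try:
                results[name][server] = query(server, name)
            except OSError as exc:  # also covers socket timeouts
                results[name][server] = "ERROR: %s" % exc
    return results


# Example usage (performs real network queries):
# servers = ["10.10.10.10", "10.10.10.20", "8.8.8.8"]
# for name, per_server in compare_servers(servers, ["chef.io", "packer.io"]).items():
#     print(name, per_server)
```

Running this on a schedule and logging the output could show exactly when the primary starts answering NXDOMAIN while the replica still answers NOERROR.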
Is there anything else I can investigate to find out why the DNS cache keeps getting corrupted so predictably? Is there a Windows DNS setting that can forcibly flush the cache when doing a zone transfer?
Update
I've narrowed this down to the fact that I often switch from wired to wireless right before the Wednesday meeting. The wireless network has one Windows 2008 DNS server and one Windows 2012 DNS server. When the 2008 server is selected as primary, the problem returns. The workaround is to run dnscmd /clearcache. Since the 2008 server is going away, I'm sure this problem will fix itself.