I would like to ask for help. I have a Windows Server 2016 machine with the DNS Server role installed; it is also a domain controller. It acts as a recursive DNS server for the network and has DNSSEC validation enabled. The server has a public IPv4 address and a publicly routable IPv6 address. The problem is that DNSSEC validation takes an incredibly long time. The server cannot resolve most websites on the first attempt. dig returns SERVFAIL, and nslookup gives me:

     *** UnKnown can not find website.com: Server failed

If I keep trying to resolve the hostname and open the website, it suddenly starts working after approx. 5 minutes. After that, the website is reachable.

I think DNSSEC validation is most likely taking too long. When I look into the server's cache, some records for that particular domain are there from the beginning of the lookup, but not all of them. I think the last RR signature (RRSIG) appears only after a really long time, and once it is finally there, the lookup finishes and I can view the website.

When I initiate a DNS lookup, I can see this in the DNS cache: [screenshot: starting DNS lookup]

When the resolution completes after a few minutes, I can see this: [screenshot: DNS lookup finished]

Could someone help me, please? Any help would be appreciated. Thank you.

EDIT:

I have problems especially with these websites: standardkonektivity.cz, dnssec.cz, nic.cz, mojeid.cz, turris.cz, jaknainternet.cz, domenovyprohlizec.cz, which are all served by some of these nameservers: a.ns.nic.cz, b.ns.nic.cz, c.ns.nic.cz, d.ns.nic.cz.

The issue appears on all installations of Windows Server 2016, so it looks like a Windows Server 2016 issue. I have no problems with the same config on Windows Server 2012 R2.

I tried multiple internet connections, so it shouldn't be a firewall/gateway issue.

I have no problems with domains without DNSSEC.

The problem persists when IPv6 is disabled.

The network configuration should be OK. I have tested this on multiple systems with different configurations.

The clock on the server is OK.

It is a DNSSEC issue: when I disable DNSSEC on the server, everything is OK. The strange thing is that when I use the `+cd` flag with DNSSEC enabled on the server, resolution fails too.
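For reference, here is a minimal sketch of what the `+cd` flag changes on the wire, since that is the part that puzzles me. It only builds the query bytes and sends nothing. The helper names (`encode_name`, `build_query`), the query ID, and the 4096-byte EDNS buffer size are my own assumptions for illustration, not values taken from dig or from the Windows DNS server.

```python
import struct

RD_FLAG = 0x0100  # header bit: recursion desired
CD_FLAG = 0x0010  # header bit: checking disabled (what +cd sets)
DO_FLAG = 0x8000  # EDNS0 flag: DNSSEC OK, asks for RRSIGs in the reply


def encode_name(name: str) -> bytes:
    """Encode a hostname as length-prefixed DNS labels (hypothetical helper)."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"  # root label terminates the name


def build_query(name, qtype=1, cd=False, do=True,
                query_id=0x1234, udp_size=4096):
    """Build a recursive query with an EDNS0 OPT record (hypothetical helper).

    query_id and udp_size are arbitrary illustration values.
    """
    flags = RD_FLAG | (CD_FLAG if cd else 0)
    # ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=1 (the OPT record)
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 1)
    question = encode_name(name) + struct.pack("!HH", qtype, 1)  # class IN
    # OPT: root name, type 41, class = UDP payload size,
    # TTL field = extended RCODE / version / flags, RDLENGTH = 0
    opt = b"\x00" + struct.pack("!HHIH", 41, udp_size, DO_FLAG if do else 0, 0)
    return header + question + opt


plain = build_query("nic.cz")
unchecked = build_query("nic.cz", cd=True)
print(plain[2:4].hex(), unchecked[2:4].hex())  # header flag bytes differ only in CD
```

The two packets differ only in the CD bit, so in principle a validating resolver should still hand back data for the second one even when validation fails. That is why the failure with `+cd` looks strange to me.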

By the way, there is a time gap between taking and posting those screenshots.

The dig `+trace` flag behaves strangely. Once it returned "dig: couldn't get address for 'a.ns.nic.cz': no more", and now it has stopped working and returns nothing.

devlin
  • Did you try temporarily to stop using the IPv6 address to narrow the problem? – Patrick Mevzek Jan 28 '18 at 15:36
  • Yes, now. To be sure... I have tried that and after a small fight with IPv6 root hints I found that it's almost the same. It seems that some of the websites are loaded immediately, but it could be only luck. There is still a problem with loading DNS records. I appended some screenshots. – devlin Jan 28 '18 at 18:19
  • What is the output of `ipconfig /all` on your DNS server? Are you using NTP? – sippybear Jan 30 '18 at 21:59
  • It seems unlikely to be DNSSEC based on performance data for verifies here: [https://www.ietf.org/proceedings/65/slides/dnsop-0.pdf] and signatures here [https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2012-R2-and-2012/dn593667(v=ws.11)]. Are the domains you're validating ones that you host on the same server? Separately, can you capture packets for these failing DNSSEC queries and compare them to the subsequent successful DNSSEC queries? – Slartibartfast Jan 31 '18 at 05:55
  • I can't open the links :( And no, I don't host any domains on that server. It's not authoritative server, only recursive with DNSSEC validation enabled. Ok, I'll try to capture those packets... – devlin Jan 31 '18 at 07:50
  • Do you have the problem with any DNSSEC enabled domain (in various TLDs) and in the opposite do all non DNSSEC enable domain work perfectly? You should try IPv4/IPv6 too and temporarily disable DNSSEC validation to make sure the problem comes really from there. Difficult to do a diff between your 2 images, it seems you get the NS records and the RRSIG on DNSKEY only after. Also the RRSIG inception dates seem after the date you posted your question. Are you sure your clock is OK on this server? BTW SERVFAIL is nominal answer when DNSSEC validation fails. – Patrick Mevzek Feb 04 '18 at 18:32
  • When you have the problem, can you try validating the same query with another recursive nameserver on your network, at the same moment? Try with `dig` for example with its `+trace` flag and try with and without the `+cd` flag to disable or not the DNSSEC validation. – Patrick Mevzek Feb 04 '18 at 18:34
  • Thank you for your response. I have edited my question and I put answers there... – devlin Feb 04 '18 at 22:34
  • If `dig +cd` fails too, then you do not have a problem just with DNSSEC as `+cd` disables DNSSEC handling. I suspect you have some elements on the network that drops some DNS packets, like big ones. Try to capture the DNS traffic as seen from the host when you do a query there, to see exactly what packets go out and what you receive back. Also try `dig +tcp` instead of just `dig` and see if the results are different. – Patrick Mevzek Feb 09 '18 at 05:56
  • I thought that too, but this happens on every installation of Windows Server 2016, and I have tried it in many locations. Windows Server 2012 R2 works well with the same config in the same network, and so does the Unbound DNS resolver with DNSSEC validation. It could be a bug in WS2016 or I don't know what :-/ – devlin Feb 12 '18 at 07:20
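
The triage suggested in the comments above (plain `dig`, `dig +cd`, `dig +tcp`) can be summarised as a small sketch. It is only an illustration of that reasoning, not a diagnostic tool: `parse_header` and `suggest_cause` are hypothetical names, and the three booleans are assumed to be recorded by hand from the three dig runs. The header parser could be applied to raw response bytes taken from a packet capture.

```python
import struct

RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def parse_header(response: bytes) -> dict:
    """Decode the fixed 12-byte DNS header of a captured response."""
    if len(response) < 12:
        raise ValueError("DNS header needs at least 12 bytes")
    _id, flags, _qd, an, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    rcode = flags & 0x000F
    return {
        "rcode": RCODES.get(rcode, str(rcode)),
        "tc": bool(flags & 0x0200),  # truncated: client should retry over TCP
        "ad": bool(flags & 0x0020),  # authenticated data: validation succeeded
        "answers": an,
    }


def suggest_cause(default_ok: bool, cd_ok: bool, tcp_ok: bool) -> str:
    """Map the outcome of the three dig runs to a suspicion (assumed labels)."""
    if default_ok:
        return "no failure observed"
    if tcp_ok:
        return "UDP-specific: large responses may be dropped on the path"
    if cd_ok:
        return "validation failure: data arrives but does not validate"
    return "not validation alone: the resolver fails to fetch the data"


# Synthetic header: response, RD and RA set, RCODE 2
sample = struct.pack("!HHHHHH", 1, 0x8182, 1, 0, 0, 0)
print(parse_header(sample)["rcode"])
print(suggest_cause(default_ok=False, cd_ok=False, tcp_ok=False))
```

In the situation described in the question (the default query fails and `+cd` fails too), the sketch lands on the last branch, which matches the suggestion to capture packets on the server itself.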
