I hope someone can help me out with this one.
I have a Windows Server 2016 VM running on Hyper-V; the host is also Windows Server 2016. At random times, connecting to network shares on other servers by name (\\SERVER) fails, while connecting with \\IPADDRESS always works.
The server is connected to a virtual switch with dedicated access to the host's Broadcom NetXtreme Gigabit adapter.
This is the only server at this site, and it is connected to our main site through an IPsec VPN.
The server functions as a domain controller, DHCP Server, DNS Server, and file server.
I first noticed the problem when AD replication was failing. Further investigation revealed that I could not connect to the domain controller at our main site over SMB using \\SERVER, but I could connect using \\IP. Pinging the server by name works, and DNS seems to be working.
The errors returned when connecting to \\SERVER are either "Windows can't find SERVER. Check spelling and try again" or "Network path not found".
I was able to connect to other servers by name at our main site.
A few hours later, while I was still troubleshooting, DFSR started to fail to a different server at our main site. The error in the log was "The remote procedure call failed and did not execute". I found the same symptoms when connecting to this server as with the first; the only difference was that the communication issues started later.
I rebooted the VM and everything was back to normal: AD was replicating and DFSR was connected and working.
The next day I logged on to the server and found that everything had failed again, with the same network issues.
Further troubleshooting revealed that disabling and re-enabling the Microsoft Hyper-V network adapter restores connectivity, but the problem comes back at random times.
The logs on the server do not show anything out of the ordinary, other than errors from AD replication, DFSR, and DNS. The DNS errors are:
The DNS server has encountered a critical error from the Active Directory. Check that the Active Directory is functioning properly.
The DNS server could not bind a User Datagram Protocol (UDP) socket to 172.18.0.10. The event data is the error code. Restart the DNS server or reboot your computer.
The DNS server could not open socket for address 172.18.0.10.
Verify that this is a valid IP address for the server computer. If it is NOT valid use the Interfaces dialog under Server Properties in the DNS Manager to remove it from the list of IP interfaces. Then stop and restart the DNS server. (If this was the only IP interface on this machine and the DNS server may not have started as a result of this error. In that case remove the DNS\Parameters\ ListenAddress value in the services section of the registry and restart.)
The DNS server could not bind a Transmission Control Protocol (TCP) socket to address 172.18.0.10. The event data is the error code. An IP address of 0.0.0.0 can indicate a valid "any address" configuration in which all configured IP addresses on the computer are available for use. Restart the DNS server or reboot the computer.
All errors in the logs disappear once the Hyper-V network adapter is restarted. I assume these errors are caused by the connectivity issues.
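The bind errors above could be checked independently of the DNS service by testing whether the OS will bind a socket to the address at all. The following is a minimal Python sketch with a hypothetical function name; it binds to an ephemeral port (port 0) so that it does not collide with a DNS server already listening on port 53.

```python
import socket

def can_bind(address, sock_type, port=0):
    """Try to bind a socket of sock_type to address; return (ok, error_text)."""
    family = socket.AF_INET6 if ":" in address else socket.AF_INET
    with socket.socket(family, sock_type) as sock:
        try:
            sock.bind((address, port))  # port 0 lets the OS pick a free port
            return True, ""
        except OSError as exc:
            return False, str(exc)
```

Running can_bind("172.18.0.10", socket.SOCK_DGRAM) and can_bind("172.18.0.10", socket.SOCK_STREAM) during an outage would show whether the address is still usable on the adapter. A failure here would suggest the interface temporarily lost the address, which would fit my assumption that the DNS errors are a symptom rather than the cause.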
I read somewhere that VMQ should be disabled on the host's NIC; I checked, and it already was. I have also tried uninstalling and re-installing the Microsoft Hyper-V network adapter in the VM, and re-installing the network card drivers on the host.
Does anyone have any idea what's going on? It seems like a DNS issue, but DNS resolves names correctly with nslookup.
Any help would be greatly appreciated.