
We're using PowerShell DSC to automate the deployment of a number of small, self-contained environments. Each environment has two domain controllers, and DSC sets up the domain and related configuration. This all works fine, except that at some point after deployment the SYSVOL replication between the two DCs stops working (or it never started working). We see this error in the log:

The DFS Replication service initialized SYSVOL at local path F:\SYSVOL\domain and is waiting to perform initial replication. The replicated folder will remain in the initial synchronization state until it has replicated with its partner . If the server was in the process of being promoted to a domain controller, the domain controller will not advertise and function as a domain controller until this issue is resolved. This can occur if the specified partner is also in the initial synchronization state, or if sharing violations are encountered on this server or the sync partner. If this event occurred during the migration of SYSVOL from File Replication service (FRS) to DFS Replication, changes will not replicate out until this issue is resolved. This can cause the SYSVOL folder on this server to become out of sync with other domain controllers.

Now, I know how to fix this using ADSIEdit; that's not the issue. We're automating the deployment of these environments because we need to deploy lots of them and configure them identically, so I don't want to have to go into each environment after deployment to fix this. We see this issue in every environment we deploy this way, so something is clearly amiss in how it's getting configured. What I am really asking is whether anyone has ideas about what could cause this, or where to start looking for the root cause.

The AD deployment is pretty straightforward: we configure DC1 first, add some DNS entries, some group policy items, and some users, groups and OUs, then add the second DC. The second DC does get all these objects, so the initial copy of the domain works, but after that nothing in SYSVOL gets replicated.

Edit

We also see a single instance of the error below, ID 1202, at deployment time, which is odd given that the DC promotion succeeds and the new DC is able to get the initial copy of the domain:

The DFS Replication service failed to contact domain controller to access configuration information. Replication is stopped. The service will try again during the next configuration polling cycle, which will occur in 60 minutes. This event can be caused by TCP/IP connectivity, firewall, Active Directory Domain Services, or DNS issues.
Additional Information: Error: 1355 (The specified domain either does not exist or could not be contacted.)

SamErde
Sam Cogan
  • So, how is DNS configured? What other Event IDs are suspicious? Do you for example see event ID 2212 or 4012 in the logs? These are Server 2008 R2 machines? – duenni Apr 03 '17 at 09:26
  • Are you backing up those machines and shutting them down at some point? – duenni Apr 03 '17 at 09:36
  • @duenni These are 2012 R2 machines, the machines are backed up but are not shut down. Both domain controllers are set up as DNS servers. I added some additional event data above, no instances of 2212 or 4012 that I can find though. – Sam Cogan Apr 03 '17 at 09:54
  • Any chance you are using the loopback as primary DNS on these machines? – duenni Apr 03 '17 at 10:37
  • No, these are actually running in Azure (not that this should make a difference) so they get their DNS server IPs by DHCP. When only the first DC exists, they are set to that machine only, once the second is added they get updated to both. – Sam Cogan Apr 03 '17 at 10:39
  • What about `dcdiag /c /v`. – duenni Apr 03 '17 at 10:46
  • So no failures on DC Diag, however I did discover that I was wrong in my answer about loopback. DC1 is using only 127.0.0.1 for DNS, DC2 is using 127.0.0.1 and the IP of DC1. I can look at changing this to use the actual IPs. – Sam Cogan Apr 03 '17 at 11:51
  • Yep, kick out 127.0.0.1 entirely and let the DCs point to each other with their actual IPs. – duenni Apr 03 '17 at 12:06

2 Answers


I think this is a DNS issue. You should not use 127.0.0.1 as the primary DNS server on these machines; instead, use each machine's real IP address as primary and set the IP of the replica DC as the secondary DNS server. This seems to be the configuration that the fewest people have problems with. The issue has been discussed for years with varying opinions, and even Microsoft gives no clear answer; see this: link

Question

What is Microsoft’s best practice for where and how many DNS servers exist? What about for configuring DNS client settings on DC’s and members?

Answer

It depends on who you ask. We in MS have been arguing this amongst ourselves for 11 years now.
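
As a rough illustration of the layout described above (own IP first, replica DC second, no loopback), the DNS client order could be set on DC1 with something like the following. This is a minimal sketch, not a tested part of the deployment: the two IP addresses are hypothetical placeholders, and picking the first adapter that is up is an assumption about the VM's network layout. Since the machines in the question receive their DNS servers from Azure via DHCP, the equivalent change may need to be made in the virtual network's DNS settings rather than inside the guest.

```powershell
# Hypothetical addresses; substitute the real static IPs of the two DCs.
$dc1Ip = '10.0.0.4'
$dc2Ip = '10.0.0.5'

# Assumption: the first adapter that is up is the one carrying domain traffic.
$adapter = Get-NetAdapter | Where-Object Status -eq 'Up' | Select-Object -First 1

# Run on DC1: own address first, replica DC second, no 127.0.0.1.
Set-DnsClientServerAddress -InterfaceIndex $adapter.InterfaceIndex -ServerAddresses $dc1Ip, $dc2Ip

# Show the resulting order so it can be checked by eye.
Get-DnsClientServerAddress -InterfaceIndex $adapter.InterfaceIndex -AddressFamily IPv4
```

On DC2 the same commands would apply with the two addresses swapped.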

duenni
  • Thanks @duenni. I'm having some issues getting our automation to configure it in this manner. I'll award you the answer now as I won't get this resolved too quickly, but will update with results later. – Sam Cogan Apr 06 '17 at 13:43
  • Woo, nice. Thanks. I'd love to see some updates on this. – duenni Apr 06 '17 at 14:39

When the first domain controller is promoted, use its IP address (not loopback) as the primary DNS server, and put the loopback as its secondary DNS server.

When the second domain controller is promoted, you'll want the DNS client settings on both DCs to look like this:

DC1

  • Primary DNS: DC2
  • Secondary DNS: Loopback

DC2

  • Primary DNS: DC1
  • Secondary DNS: Loopback
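
Because the environments in the question are built with DSC, the settings above could be expressed as a small configuration. The sketch below assumes the NetworkingDsc module (its DnsServerAddress resource) is installed on the node; the configuration name, the parameter names, and the 'Ethernet' interface alias are made up for illustration and would need to match the real deployment.

```powershell
# Sketch only: assumes the NetworkingDsc module is available on the target node.
Configuration DcDnsClient
{
    param (
        # IP address of the other domain controller (hypothetical parameter name).
        [Parameter(Mandatory)]
        [string] $PartnerDcIp,

        # Assumption: the adapter is called 'Ethernet'; adjust to the actual alias.
        [string] $InterfaceAlias = 'Ethernet'
    )

    Import-DscResource -ModuleName NetworkingDsc

    Node 'localhost'
    {
        DnsServerAddress DnsClientOrder
        {
            InterfaceAlias = $InterfaceAlias
            AddressFamily  = 'IPv4'
            # Partner DC first, loopback second, as in the list above.
            Address        = $PartnerDcIp, '127.0.0.1'
            # Skip reachability validation, since the partner may not be serving DNS yet.
            Validate       = $false
        }
    }
}
```

Compiling it with, for example, `DcDnsClient -PartnerDcIp '10.0.0.5' -OutputPath .\DcDnsClient` (address hypothetical) would produce a MOF per node. For the first DC before its partner exists, the same resource could be given the DC's own IP followed by the loopback address.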
SamErde