
For the last couple of weeks my server has been crashing about once a week, usually during the weekend when nobody in the office is using it. Before that, the server was very stable. I've been trying to find the cause but haven't been able to.

Checking the Event Viewer, I see the same errors again and again. They cause the services on the server to stop working, and eventually a reboot is necessary because almost nothing works.

I always get the following errors in the System log:

  • Event ID 2019 (srv)
  • Event ID 1059 (DHCP-Server)
  • Event ID 1054 (GroupPolicy)
  • Event ID 1058 (GroupPolicy)
  • Event ID 7031 (Service Control Manager Eventlog Provider)
  • Event ID 7000 (Service Control Manager Eventlog Provider)
  • Event ID 10000 (DistributedCOM)
  • Event ID 5719 (NETLOGON)

** A few other events also appear occasionally; they seem related to the ones above and show up when the server is almost dead.

Taken together, it looks like something is preventing DHCP, DNS, Group Policy and other services from contacting the DC. The first three events appear again and again every few minutes once the server starts failing.

The first event logged when the server starts failing is not always the same, but it is always one of the top three.

I ran sfc /scannow today because there is a registry-related error in the log from the day before these problems started. It is Event ID 5:
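To check which event really comes first in each outage, the log entries can be grouped into bursts separated by a long quiet period. The following is only a minimal sketch: the `(timestamp, event_id)` tuples are a hypothetical input format (for example, transcribed or exported from Event Viewer), the six-hour quiet gap is an arbitrary assumption, and the sample timestamps are made up for illustration.

```python
from datetime import datetime, timedelta

# Event IDs listed in the question.
WATCHED_IDS = {2019, 1059, 1054, 1058, 7031, 7000, 10000, 5719}


def first_event_per_incident(events, quiet_gap=timedelta(hours=6)):
    """Return the first watched event of each burst of errors.

    `events` is an iterable of (timestamp, event_id) tuples (assumed format).
    A new incident starts when no watched event was seen for `quiet_gap`.
    """
    firsts = []
    last_seen = None
    for ts, event_id in sorted(events):
        if event_id not in WATCHED_IDS:
            continue  # ignore events that are not on the list
        if last_seen is None or ts - last_seen > quiet_gap:
            firsts.append((ts, event_id))
        last_seen = ts
    return firsts


if __name__ == "__main__":
    # Made-up sample data, not taken from the real server log.
    sample = [
        (datetime(2013, 11, 2, 3, 0), 1059),
        (datetime(2013, 11, 2, 3, 5), 2019),
        (datetime(2013, 11, 9, 4, 0), 2019),
        (datetime(2013, 11, 9, 4, 2), 1054),
    ]
    for ts, event_id in first_event_per_incident(sample):
        print(ts.isoformat(), event_id)
```

Comparing the first event of each incident across several weekends may show whether one service consistently fails before the others.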

{Registry Hive Recovered} Registry hive (file): '\SystemRoot\System32\Config\RegBack\SOFTWARE' was corrupted and it has been recovered. Some data might have been lost.

That event has only occurred this one time, though.

I installed WSUS recently and have removed it just in case, but I don't think it is the cause of the problem, as it was working fine with my test computers.

Anyway... any ideas on what I can try in order to find the source of the problems?

Santi
  • What is a crash? Blue screen? Random reboot? Hard locking? – DanBig Nov 04 '13 at 16:09
  • Is this server the Domain Controller? Is it also the DNS server? How are the DNS client settings configured on this server? – joeqwerty Nov 04 '13 at 16:10
  • The server rebooted itself only once; the rest of the time it was still running but none of the services were working, so we had to reboot it. The server is a child DC, DNS, DHCP and DFS server. – Santi Nov 04 '13 at 16:16
  • What do you mean it is a child DC? – joeqwerty Nov 04 '13 at 16:22
  • I have another DC, let's call it mydomain.com, and the server with problems is child.mydomain.com – Santi Nov 04 '13 at 16:41
  • @Santi Is this the only domain controller in the `child.mydomain.com` domain? – Shane Madden Nov 04 '13 at 17:14
  • Yes, this is the only DC in child.mydomain.com and there is also only one DC in mydomain.com – Santi Nov 05 '13 at 08:40
  • The problem was a corrupt entry in the registry; after fixing it, the server started to work fine again – Santi Jan 10 '14 at 11:35

2 Answers


I managed to fix it after a couple of days of checking everything... It was a registry problem, but not the entry described by Sing-Cheong Chen. After using tools to check and fix the registry, the server stopped crashing.

I don't remember the exact key, but I think it was one related to MySQL or Backup Exec.

Thank you for the suggestions anyway.

Santi

This is a known bug in MS Windows when using a static IP address and a gigabit NIC:

http://support.microsoft.com/kb/938449

Most people work around it by creating the following new registry value:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\ArpRetryCount Value: 0

This disables ARP retries.

Detailed steps:

  1. Click Start, type regedit in the Start Search box, and then press ENTER
  2. Locate the following registry key:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters

  3. On the Edit menu, point to New, and then click DWORD Value
  4. Type ArpRetryCount
  5. Right-click the ArpRetryCount registry entry, and then click Modify
  6. In the Value data box, type 0 and then click OK
  7. Reboot Windows
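For anyone who would rather script the manual steps above than click through regedit, here is a minimal sketch that builds the text of a .reg file for the same value. The helper name `build_reg_text` is made up for illustration, and saving the text to a file and importing it on the server is left out; back up the registry before changing anything.

```python
def build_reg_text(key_path, value_name, dword_value):
    """Return .reg file text that sets one DWORD value under key_path.

    build_reg_text is a hypothetical helper, not part of any library.
    """
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{key_path}]\n"
        f'"{value_name}"=dword:{dword_value:08x}\n'
    )


if __name__ == "__main__":
    # Key and value name as given in the steps above.
    key = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
    print(build_reg_text(key, "ArpRetryCount", 0))
```

A reboot is still needed afterwards, as in the last manual step.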

Sing-Cheong

  • Other than the Event ID, that KB article appears to have absolutely nothing to do with the problem as described. – Chris McKeown Jan 10 '14 at 23:54
  • It is also a bullshit answer in general. I tell you, there are millions of windows servers with static IP and gigabit NIC's that do not crash. – TomTom Jan 13 '14 at 08:59