We have a Windows 2003 Server that crashes fairly regularly (about once a week on average). When the server crashes, nothing is recorded in the event logs: not in Application, not in Security, and not in System. The monitoring system (Nagios) does not report any abnormal conditions on the machine near the time of the crash. And when I say crash, I mean the server powers off completely, not merely a blue screen.

I have made some guesses about the cause of the crashes, but so far nothing has proved to be correct. Does anyone know where I can find further information (e.g. logs) or provide any helpful guesses as to the cause?

rpkelly
  • Any more patterns? Like the same day every week, roughly the same time, etc.? Otherwise I would look at replacing the hardware - as in: the hardware is done. – TomTom Sep 29 '10 at 05:20
  • No, seemingly no pattern. Might go for a few weeks, then crash in the middle of the night; might crash twice in one week, in the middle of the day. – rpkelly Oct 01 '10 at 16:19

2 Answers

How is the power situation where the server is located?

At a former job site, the building would test the backup generators regularly, which might cause systems to power off if not hooked into a working UPS. Granted, the building management did tell everyone that this was a standard policy.

Is there any pattern when the servers crash? Or is it random throughout the week?
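One rough way to check for a pattern is to tally the crash times by weekday and by hour of day. The sketch below is only an illustration: the timestamps in `CRASH_TIMES` are made-up placeholders (not data from this server), and `tally` is a hypothetical helper name. It assumes you have noted the crash times yourself, for example from monitoring alerts.

```python
from collections import Counter
from datetime import datetime

# Hypothetical crash times; replace with the times you actually observed.
CRASH_TIMES = [
    "2010-09-06 02:15",
    "2010-09-14 13:40",
    "2010-09-16 11:05",
    "2010-09-27 02:50",
]

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def tally(timestamps):
    """Count crashes per weekday name and per hour of day."""
    parsed = [datetime.strptime(t, "%Y-%m-%d %H:%M") for t in timestamps]
    by_weekday = Counter(WEEKDAYS[p.weekday()] for p in parsed)
    by_hour = Counter(p.hour for p in parsed)
    return by_weekday, by_hour


if __name__ == "__main__":
    by_weekday, by_hour = tally(CRASH_TIMES)
    # Most frequent buckets first; a strong cluster hints at a schedule.
    for day, count in by_weekday.most_common():
        print(f"{day}: {count}")
    for hour, count in by_hour.most_common():
        print(f"{hour:02d}:00 hour: {count}")
```

If the crashes cluster on one weekday or hour, a scheduled event (a generator test, a backup job) becomes a more likely suspect; an even spread points back toward failing hardware.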

I've also encountered sites where the power lines running into the server closet were themselves poor, causing the UPS devices to kick in every few minutes due to voltage drops or spikes.

Celebros
  • If it was power related, I think the other servers we have of the same model would also go down, but this isn't the case. And all those machines are connected to the same UPS. When the power does go out, we do have other machines not attached to the UPS which will go down, so I don't think it is a "global" power issue. – rpkelly Sep 30 '10 at 13:48

If there is no blue screen, the OS wasn't in control when the server went down. I would think that means you won't find any logs either. To me it sounds like a hardware issue, maybe heat or power related?

DougN