My file server just went belly up and I can't seem to figure out why. Perhaps I'm naive but when these things happen I can typically look through my System, Application or Security Event Viewer log and find the culprit - but no luck this time.

While I was out of the office I received an Icinga notification warning that no information was available for the E:\ drive on my server.

I logged on to the server and saw that the E:\ drive was there, but the "storage graph" that normally appears under it was missing, and clicking on the drive hung the OS. I then tried to reboot the server and the hanging continued. I issued a Stop-Computer server -Force command, which seemed to start working, but the screen hung at "Please wait for the System Event Notification Service". I had to do a hard shutdown of the server, which is never a good thing.

My question: if there are no diagnostics in the Event Viewer, is there anywhere I can look post-incident that would show me what caused the crash? I've never had a server lock up on me the way this one did, so I'd like to know what the root problem was.

DKNUCKLES

2 Answers

FYI - for any VMware guest, if you want to get a memory dump, you can take a snapshot, then use vmss2core.exe to extract the memory into a traditional Windows memory dump file that can be read with WinDbg, and therefore by MS support or other qualified people.

Converting a snapshot file to memory dump using the vmss2core tool (2003941)
http://kb.vmware.com/kb/2003941

You should remove the snapshot after the dump has been created, copied, and converted. This is usually preferable to the environmental 1/0 switch if you actually want to investigate the state of the system at the time of a hang. It is also simpler and less intrusive if you just want a memory dump of a running system without using the Windows keyboard sequence to force a blue screen, which works well only if both the desired memory dump type and the keyboard sequence were enabled beforehand.
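For anyone scripting this, here is a minimal Python sketch of the conversion step. It only assembles and runs the vmss2core command line; the file names in the usage note are hypothetical, and the `-W` flag is the Windows-dump option as I understand it from the KB article above, so check the KB for the flag that matches your guest OS version before relying on it.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_vmss2core_cmd(state_file: str,
                        vmem_file: Optional[str] = None,
                        tool: str = "vmss2core.exe",
                        flag: str = "-W") -> List[str]:
    """Assemble the vmss2core argument list.

    state_file: the snapshot (.vmsn) or suspend (.vmss) file copied off the host.
    vmem_file:  the matching .vmem file, if the snapshot has one.
    flag:       assumed to be the Windows dump option; verify against the KB.
    """
    cmd = [tool, flag, state_file]
    if vmem_file:
        cmd.append(vmem_file)
    return cmd


def convert_snapshot(state_file: str,
                     vmem_file: Optional[str] = None,
                     workdir: str = ".") -> int:
    """Run the conversion in workdir and return the tool's exit code."""
    for name in (state_file, vmem_file):
        # Fail early with a clear message if an input file is missing.
        if name and not (Path(workdir) / name).is_file():
            raise FileNotFoundError(name)
    cmd = build_vmss2core_cmd(state_file, vmem_file)
    return subprocess.run(cmd, cwd=workdir, check=False).returncode
```

Usage would look something like `convert_snapshot("fileserver-Snapshot1.vmsn", "fileserver-Snapshot1.vmem", workdir=r"D:\dumps")`, run on a workstation that has the tool on its PATH and a copy of the snapshot files, not on the ESXi host itself.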

Greg Askew
  • Whoa That's awesome. – mfinni Aug 22 '13 at 18:41
  • Mind you, if he really has lost access to an in-use VMDK, trying to take a snapshot might not really be the best next step to take. – mfinni Aug 22 '13 at 18:44
  • @mfinni: Yeah, especially if they don't have the disk space ;-) – Greg Askew Aug 22 '13 at 18:52
  • dumb question, but I've no experience with memory dumps - can these be done after the incident or would they need to be done "during" the incident? – DKNUCKLES Aug 22 '13 at 19:00
  • During. A "memory dump" is literally that - it's writing the contents of memory out to a file for review. After an OS crash, the problems that led to it are no longer in memory. – mfinni Aug 22 '13 at 19:02
2

Without a memory.dmp (which wouldn't have been generated because you initiated the shutdown), I don't think there's anything definitive that you can do post-mortem, unless you were running perfmon or similar and could find a metric pointing to a problem.
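If you want to double-check that no dump was written, a quick sketch like the following lists whatever dump files exist. It assumes the Windows default locations (MEMORY.DMP and a Minidump folder under the system root); the dump path is configurable under Startup and Recovery, so adjust it if yours was changed.

```python
from datetime import datetime
from pathlib import Path
from typing import List, Tuple


def find_crash_dumps(system_root: str) -> List[Tuple[str, int, datetime]]:
    """Return (path, size in bytes, last-modified time) for each dump found.

    Looks only in the default locations; a custom dump path configured
    under Startup and Recovery would not be picked up by this sketch.
    """
    root = Path(system_root)
    candidates = []

    # Kernel or complete dumps are written to a single file by default.
    full_dump = root / "MEMORY.DMP"
    if full_dump.is_file():
        candidates.append(full_dump)

    # Small dumps accumulate in the Minidump folder, one file per crash.
    minidump_dir = root / "Minidump"
    if minidump_dir.is_dir():
        candidates.extend(sorted(minidump_dir.glob("*.dmp")))

    results = []
    for path in candidates:
        info = path.stat()
        results.append((str(path), info.st_size,
                        datetime.fromtimestamp(info.st_mtime)))
    return results
```

Calling `find_crash_dumps(r"C:\Windows")` and comparing the timestamps against the time of the hang tells you whether any dump relates to this incident. After a hard power-off, an empty list is the expected result.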

What is the E:\ drive on it?

mfinni
  • E:\ drive is the mail file storage area where users store documents – DKNUCKLES Aug 22 '13 at 18:10
  • Not "what do you use it for" ; what *is* it? single partition on a RAID volume, one of many partitions on a RAID volume, plain local disk, SAN LUN, etc? Could the *hardware* for that drive have failed or become unavailable in some way? – mfinni Aug 22 '13 at 18:12
  • Apologies for the misinterpretation - it's a VMDK which is DAS to the VM host. The physical hardware is on a RAID 6 - the same RAID 6 houses the C:\ drive (though that is a different VMDK) as well as another VM. – DKNUCKLES Aug 22 '13 at 18:21
  • OK - that probably rules out the physical hardware, although there may have been a problem in the ESXi handling of the E:\ drive's underlying VMDK? Anything in the ESXi logs from that time? – mfinni Aug 22 '13 at 18:22
  • The only thing that is not "informational" is an "insufficient video RAM" error – DKNUCKLES Aug 22 '13 at 18:59