Our server keeps crashing with a purple screen of death (PSOD). I have tried interpreting the crash dumps, but to no avail. The server is an HP ProLiant DL360 G7 running ESXi 6.0. I have other identical servers running the same software with no issues, but this one crashes about once a week with the following error:

Machine Check Exception: Fatal (unrecoverable) MCE on PCPU5 in world 32801:coalesceWorl
System has encountered a Hardware Error - Please contact the hardware vendor

I have all of the crash dump logs if someone is willing to go over them and help me figure out the exact cause of the problem. I can't keep this server in production while it keeps crashing.

You can view the latest crash dump here: http://pastebin.com/JvziBPtA

Any help would be greatly appreciated.

Kyle Vaughan

1 Answer

I suggest troubleshooting the hardware, or following the guidance in the error message and contacting the hardware vendor.

Things you can do:

  • You're running ESXi build #3620759 from March 2016. UPDATE YOUR ESXi INSTALLATION!!
  • HP servers have comprehensive diagnostic messages and logging.
  • Look at your iLO 3 interface and open the HP Integrated Management Log (IML). That should tell you what is wrong.
  • If you're using the HP-specific ESXi install, look at your hardware status interface.
  • Run diagnostics from the HP Service Pack for ProLiant bootable DVD.
  • Call HP support.
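
Since you have several crash dumps, one quick check before calling support is whether the MCE always lands on the same PCPU. Below is a minimal sketch, assuming each dump contains a line in the same format as the one quoted in the question. The pattern and the helper name `summarize_mce` are my own illustration, not part of any VMware or HP tool, and other ESXi builds may word the line differently.

```python
import re
from collections import Counter

# Pattern derived only from the single PSOD line quoted in the question;
# this is an assumption, not a documented log format.
MCE_RE = re.compile(r"MCE on PCPU(\d+) in world (\d+):(\S+)")


def summarize_mce(lines):
    """Count how often each PCPU number appears in MCE lines (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = MCE_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# The sample is the line from the question; in practice, feed in the
# matching line from each saved crash dump.
sample = [
    "Machine Check Exception: Fatal (unrecoverable) MCE on PCPU5 "
    "in world 32801:coalesceWorl",
]
print(summarize_mce(sample))
```

If the same PCPU shows up in every dump, that is worth passing on to HP support. If it moves around between crashes, the cause may lie elsewhere.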
ewwhite
  • I have tried a newer installation, and it gives the PSOD even more frequently: every few hours instead of once a week or so. iLO contains no useful information. The HP-specific hardware status interface shows that everything is normal. Running the hardware diagnostics seems to be my only option, but that will be difficult as I am a few hours away from the server. – Kyle Vaughan Jan 16 '17 at 23:16