
As the title says, my Ubuntu 14.04 server becomes unresponsive after sitting idle for a while. It is going to be an NGINX proxy box, but it is not taking production traffic yet, for obvious reasons. It ran for about a month without issue while I was waiting for the network gear to be upgraded before sending traffic to it.

But then a couple of weeks ago it started becoming unresponsive, and I had to restart the box via IPMI (I don't have physical access to it). After the reboot I investigated the logs and noticed several "HANDLING MCE MEMORY ERROR" entries in the kernel logs. This cycle kept repeating for several days. I had one of the server guys replace the DIMMs and that error went away, but the original problem remained.
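For anyone who wants to check their own logs for the same thing, here is a rough sketch of how the entries could be tallied per day. It assumes the message text matches the string quoted above, that the kernel log lives at /var/log/kern.log, and that lines use the traditional syslog timestamp prefix; the function names are just illustrative.

```python
import re
from collections import Counter

# Assumption: the log wording matches the string quoted in this post.
MCE_PATTERN = re.compile(r"HANDLING MCE MEMORY ERROR", re.IGNORECASE)


def count_mce_by_day(lines):
    """Count matching lines, keyed by the syslog date prefix (e.g. 'Mar  3')."""
    counts = Counter()
    for line in lines:
        if MCE_PATTERN.search(line):
            # Traditional syslog lines start with 'Mon DD HH:MM:SS';
            # the first six characters are the month and day.
            counts[line[:6]] += 1
    return counts


def report(path="/var/log/kern.log"):
    """Print a per-day tally for one log file (path is an assumption)."""
    with open(path, errors="replace") as fh:
        for day, n in count_mce_by_day(fh).items():
            print(day, n)
```

Calling `report()` on the box prints one line per day that had matching entries, which makes it easy to see whether the errors actually stop after a hardware swap.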

Next, I ran Memtest for about 60 hours without error. Then I stress-tested the CPU for 24 hours with mprime; the server stayed up for the entire test and reported no errors.

So it appears the memory and CPU are functioning correctly, but when the machine goes idle for some time it becomes unresponsive and I have to reboot it. I don't think it's a power setting issue, because it stayed up for about a month prior to this.

Any ideas?

EDIT: I ended up not being able to figure this issue out, so I just stuck the HDDs in an identical system.
