
I was running simulations in an Ubuntu virtual machine on a Windows 7 host and nothing was wrong. I decided to wipe Windows 7 and install Linux (Ubuntu 14.04) directly on the machine. Since then, I get this error after a simulation has been running for some time:

 Node0: DRAM uncorrectable ECC Error

 Node1: HT Link SYNC Error

I just ran Memtest86 v4.20 on all 32 GB of memory, and it passed 100%. Why would I still see an error? I should note that Memtest reports "ECC off" while it runs.

Has anyone fixed this issue?

1 Answer


The error message is almost certainly due to faulty hardware. It is quite understandable that Ubuntu did not report the error while it was running inside a virtual machine rather than on the real hardware. It is possible you have been having this problem for a while without realizing it. I don't know whether Windows would have logged this kind of event, or where it would be logged.
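Now that Linux runs on the real hardware, you may be able to watch the error counters directly. The sketch below assumes the kernel's EDAC driver for your memory controller is loaded, in which case per-controller counters normally appear under `/sys/devices/system/edac/mc/`. The function name `read_edac_counts` is my own, and I have not tested this on your machine, so treat it as a starting point rather than a definitive tool.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce_count": int, "ue_count": int}}.

    ce_count = corrected errors, ue_count = uncorrected errors.
    An empty dict means no EDAC data was found under `root`
    (for example, because no EDAC driver is loaded).
    """
    counts = {}
    base = Path(root)
    if not base.is_dir():
        return counts
    # Each memory controller shows up as a directory such as mc0, mc1, ...
    for mc in sorted(base.glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            counter = mc / name
            if counter.is_file():
                entry[name] = int(counter.read_text().strip())
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    found = read_edac_counts()
    if not found:
        print("No EDAC counters found (is an EDAC driver loaded?)")
    for mc, entry in found.items():
        print(mc, entry)
```

If the counters rise while the simulation runs, that supports the hardware explanation. A corrected-error count that keeps climbing is a warning sign even when nothing has crashed yet.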

Passing memtest doesn't prove the hardware is good. Faulty hardware may sometimes work and sometimes fail. It may be a coincidence that you saw the error while running your simulation and not in memtest; it could just as well have been the other way around.
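To put rough numbers on that, here is a small sketch. It assumes, purely for illustration, that each test pass independently has the same small chance of triggering the fault. The 2% figure and the function name `chance_of_clean_run` are made up; I have no measurement for your hardware.

```python
def chance_of_clean_run(p_fault_per_pass, passes):
    """Probability that an intermittent fault stays hidden for `passes`
    consecutive test passes, assuming each pass is independent and
    triggers the fault with probability `p_fault_per_pass`."""
    return (1.0 - p_fault_per_pass) ** passes


if __name__ == "__main__":
    # Hypothetical 2% chance per pass of the fault showing up.
    for passes in (1, 5, 20):
        print(passes, "passes:", round(chance_of_clean_run(0.02, passes), 3))
```

Under that assumption, a handful of clean passes still leaves a large chance that a real fault simply did not show up during the test.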

It is also possible that your simulation stresses the hardware more than memtest does. So even if the simulation fails every time and memtest passes every time, the root cause may still be faulty hardware.

kasperd