2

We have a few Dell 1950 servers. One of them runs CentOS 6.3 and reboots randomly. No log is generated, so I suspect a hardware problem. The other four servers do not reboot randomly.

We ran memtest86+ on all five servers, and on three of them memtest86+ crashes (displaying an odd, colorful screen, as if a video card had failed).

I also tried an old memtest86 (not +) and it did not crash on any of the servers. I tested other RAM testing utilities as well, and none of them failed.

Have any of you experienced this?

Chetan Bhargava
user148723

3 Answers

3

If memtest crashes, there is a high chance that your memory is bad. Try swapping in memory from the non-crashing servers and re-running memtest. You can also cut the installed memory in half (if the system allows it; mind the minimum memory requirements) and run memtest again. Once it passes, swap in the other half and test that.
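The halving approach is essentially a binary search over the installed modules. Below is a minimal sketch in Python, assuming exactly one faulty module, a reproducible crash, and that any subset of modules is a valid configuration (real servers may require modules in pairs). The names `find_bad_dimm`, `passes_memtest`, and the slot labels are hypothetical; `passes_memtest` stands in for physically installing the given modules and running memtest by hand.

```python
from typing import Callable, List, Sequence


def find_bad_dimm(
    dimms: Sequence[str],
    passes_memtest: Callable[[Sequence[str]], bool],
) -> str:
    """Narrow down one faulty module by testing half of the candidates at a time."""
    if not dimms:
        raise ValueError("need at least one module to test")
    candidates: List[str] = list(dimms)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if passes_memtest(half):
            # This half is clean, so the fault must be in the other half.
            candidates = candidates[len(half):]
        else:
            # This half fails, so keep searching inside it.
            candidates = half
    return candidates[0]


# Simulated run: pretend the (hypothetical) module in slot DIMM6 is faulty.
slots = [f"DIMM{i}" for i in range(1, 9)]
faulty = "DIMM6"
tested: List[List[str]] = []


def simulated_memtest(installed: Sequence[str]) -> bool:
    # Hypothetical stand-in for a manual memtest run with these modules installed.
    tested.append(list(installed))
    return faulty not in installed


result = find_bad_dimm(slots, simulated_memtest)
print(result, "found after", len(tested), "test runs")
```

With eight modules this takes three memtest runs instead of up to eight single-module runs. If more than one module is bad, or the crash is not caused by memory at all, the search will still return a module, so the result should be confirmed by testing that module on its own.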

Chetan Bhargava
1

If you have a Linux server that is rebooting on its own, this usually means that it is a hardware problem. Check the hardware logs in Dell OMSA (Dell OpenManage Server Administrator, Managed Node) or via the DRAC (Dell Remote Access Controller).

Contact Dell technical support to help you investigate the problem.

Mircea Vutcovici
1

Another extremely useful tool to keep on hand for testing and diagnosis is the UBCD (Ultimate Boot CD). It includes both memtest86 and memtest86+. The newest version even has a memory tester for GPUs, which will come in handy if you also suspect the video hardware.

Oxymoron