0

We have a new Dell server with a pair of 6-core Xeon 2620s, for a total of 12 cores. Using Speedfan/HW Monitor, I notice that the cores on one CPU run about 10 degrees hotter under load than those on the other (mid-60s centigrade vs. mid-50s). At idle the numbers are a little closer: the hottest core idles at around 45 and the coolest at about 35, but the two groups of traces are not as distinct as under load, where there are clearly two groups of temperatures.

It's been a while since I built and ran my own servers, but if this were a home computer I'd built myself, I'd probably pop the heat sink off and reapply some Arctic Silver or something, make sure all the fasteners are nice and tight, that kind of thing.

Note the max temp for these chips is 77 degrees so we're not in danger or anything, but I'm wondering if this is something I should address.

Thanks.

  • Well, that is a Dell ;) Call support, have them send a technician and reseat the CPUs with proper cooling ;) – TomTom Jul 08 '13 at 16:52
  • What does OMSA say? Are there any warnings/errors such as a failed fan? – Zypher Jul 08 '13 at 16:55

3 Answers

3

My guess is that it's related to the processor affinity of the processes and applications that are running. If more processes/applications are running on one CPU than the other, then it's naturally going to run hotter. If the temps of both CPUs are within "safe" ranges, then this isn't something I'd be particularly worried about.

joeqwerty
3

It's quite possible that this is just completely normal for that model. In some case designs one CPU gets a bit more airflow than the other - but they both get at least enough to keep them cool, which seems to be the case. With 12 cores, it's also possible one set is being used more heavily than the other, especially when the server isn't running flat out - this is a good thing, it lets one entire CPU be put into low power mode when it's not needed to save electricity.

Other things that would affect it would be one CPU being closer to things that are hotter - hard drives, power supply, RAID controller chips, RAM, etc.

If this is new enough that you're just setting it up (not in production yet), I'd run the onboard diagnostics from the boot menu and make sure all the fans are working normally. If your server comes with iDRAC, you can also connect to the BMC and check the status there. Or install the actual Dell management software to check.

It's probably not an issue. Personally, I would run a burn-in test for a day or two and see just how hot it gets. If it never gets anywhere near the max for the CPU, I wouldn't worry about it.
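For anyone who wants to put numbers on a burn-in run, here is a minimal Python sketch of that check. It assumes you have exported per-core readings from your monitoring tool into rows of temperatures, with the first six values belonging to one socket and the next six to the other. That ordering, the `summarize` helper, and the sample values are all made up for illustration; only the 77-degree limit comes from the question.

```python
from statistics import mean

# Maximum temperature quoted in the question, in degrees C.
MAX_TEMP_C = 77.0


def summarize(samples, cores_per_socket=6, limit_c=MAX_TEMP_C):
    """Summarize per-socket temperatures from a burn-in log.

    samples: list of rows, each row a list of per-core temperatures (C).
    Assumption: cores are ordered so that each consecutive block of
    cores_per_socket values belongs to one physical CPU.
    """
    n_sockets = len(samples[0]) // cores_per_socket
    summary = {}
    for socket in range(n_sockets):
        start = socket * cores_per_socket
        end = start + cores_per_socket
        readings = [t for row in samples for t in row[start:end]]
        peak = max(readings)
        summary[socket] = {
            "mean": mean(readings),
            "peak": peak,
            "headroom": limit_c - peak,  # margin left below the limit
        }
    return summary


# Hypothetical readings, not real measurements from this server.
samples = [
    [64, 66, 65, 67, 63, 65, 54, 56, 55, 57, 53, 55],
    [65, 67, 66, 68, 64, 66, 55, 57, 56, 58, 54, 56],
]

report = summarize(samples)
for socket, stats in report.items():
    print(
        f"CPU {socket}: mean {stats['mean']:.1f} C, "
        f"peak {stats['peak']} C, headroom {stats['headroom']:.1f} C"
    )
```

If the headroom on the hotter CPU stays comfortably positive over the whole run, the gap between the two sockets is probably just airflow or placement rather than a fault.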

Grant
  • Thanks. Actually I am running Folding at Home using all 12 cores and as far as I can tell, it's using all the CPU resources equally and keeping all 24 virtual cores pegged at 100%. Because the six that are hotter are all associated with one CPU, I'm thinking it's a fan/airflow/heatsink seating thing. Thanks. – TerminalDilettante Jul 08 '13 at 17:22
  • If it's within support, I'd call Dell and run Grant's theory past them; either they'll agree or they'll be straight out to fix a fault and either way the issue will be off your list of worries for the day. – Rob Moir Jul 08 '13 at 19:36
2

This kind of thing is quite normal. Airflow, the arrangement of heat piping, and imperfectly balanced load can cause this. Even different types of loads can cause different power usage and thus heat dissipation between different processors. Also, it's totally inconsequential as all your processors should be running well below their operational maximum temperatures anyway.

Annoy your vendor about it if the hot one is actually getting close to dangerous temperatures, or if you see a dead fan or something. Otherwise, ignore it.

Falcon Momot