I have an Ubuntu server that I suspect failed due to overheating. I'm doing a post-mortem, and I'm not sure what to look for to confirm my hunch.
Any thoughts on what logged information would indicate a failure from overheating?
If you did not have sensord (from the lm-sensors package) running, you will probably never know for sure. You could try looking at side-channel data such as the SMART attribute logging done by smartd (smartmontools): it logs attribute changes to syslog, and those entries may include disk temperature, which in turn would allow an educated guess about the system's temperature.
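If smartd was running, something like the following rough Python sketch could pull the temperature-related attribute changes out of a log. It assumes the usual smartd message shape ("SMART Usage Attribute: 194 Temperature_Celsius changed from X to Y") and that the drive reports temperature as attribute 190 or 194; both are assumptions, so check them against your actual log lines. The function name and the sample line are made up for illustration. Also note that smartd normally logs the normalized attribute value, which is not necessarily degrees Celsius, so look at the trend rather than the absolute number.

```python
import re

# Assumed smartd message shape; adjust the pattern if your log lines differ.
# Attributes 190 and 194 are the IDs commonly used for drive temperature.
PATTERN = re.compile(
    r"Device: (?P<device>[^\s,]+).*?SMART Usage Attribute: "
    r"(?P<id>190|194) (?P<name>\S+) changed from "
    r"(?P<old>\d+)(?: \[Raw \d+\])? to (?P<new>\d+)"
)


def temperature_changes(lines):
    """Yield (device, attribute name, old value, new value) for each match."""
    for line in lines:
        m = PATTERN.search(line)
        if m:
            yield m["device"], m["name"], int(m["old"]), int(m["new"])


# Hypothetical sample line, only to show the expected input shape.
sample = [
    "Jun  1 12:00:00 host smartd[1234]: Device: /dev/sda [SAT], "
    "SMART Usage Attribute: 194 Temperature_Celsius changed from 112 to 108",
    "Jun  1 12:00:05 host kernel: unrelated message",
]

for device, name, old, new in temperature_changes(sample):
    print(f"{device} {name}: {old} -> {new}")
```

To run it against the real log, pass an open file instead of the sample list, for example `temperature_changes(open("/var/log/syslog", errors="replace"))`, and remember that older entries may sit in rotated files such as /var/log/syslog.1. A steady drift in the values in the hours before the crash would support the overheating theory; no matching lines at all just means smartd was not tracking that attribute.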