I got a server from UbiquityServers about a week ago and installed a simple Apache setup on it that just serves images. The server is under very little load because it is only an origin server behind Amazon's CloudFront, but yesterday it suddenly stopped responding to SSH, to the point that I had to power-cycle it to get back in. I'm trying to find out what caused this, and I would appreciate any input from the community.
Here are some findings.
I noticed a spike in received multicast packets (rxmcst/s, the last column) right around the time the server became unresponsive. Here is the log:
sar -n DEV -f sa29 | less
08:30:01 PM eth1 66.96 63.34 19.54 62.51 0.00 0.00 0.05
08:40:01 PM lo 0.07 0.07 0.01 0.01 0.00 0.00 0.00
08:40:01 PM eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
08:40:01 PM eth1 65.05 70.51 5.63 84.70 0.00 0.00 0.02
08:50:01 PM lo 0.04 0.04 0.00 0.00 0.00 0.00 0.00
08:50:01 PM eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
08:50:01 PM eth1 57.84 59.48 6.71 67.85 0.00 0.00 0.04
09:00:01 PM lo 0.03 0.03 0.00 0.00 0.00 0.00 0.00
09:00:01 PM eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
09:00:01 PM eth1 48.55 47.35 4.30 53.78 0.00 0.00 0.03
09:10:01 PM lo 0.01 0.01 0.00 0.00 0.00 0.00 0.00
09:10:01 PM eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
09:10:01 PM eth1 53.16 51.88 5.61 58.48 0.00 0.00 0.02
09:20:01 PM lo 0.04 0.04 0.00 0.00 0.00 0.00 0.00
09:20:01 PM eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
09:20:01 PM eth1 61.80 63.91 7.75 73.46 0.00 0.00 0.05
09:30:01 PM lo 0.03 0.03 0.00 0.00 0.00 0.00 0.00
09:30:01 PM eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
09:30:01 PM eth1 54.74 55.70 5.79 63.43 0.00 0.00 0.02
09:40:01 PM lo 0.01 0.01 0.00 0.00 0.00 0.00 0.00
09:40:01 PM eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
09:40:01 PM eth1 27.83 28.57 3.17 32.59 0.00 0.00 1058754721.47
09:50:01 PM lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00
09:50:01 PM eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
09:50:01 PM eth1 0.00 0.00 0.00 0.00 0.00 0.00 2142789576.69
10:00:01 PM lo 0.05 0.05 0.01 0.01 0.00 0.00 0.00
10:00:01 PM eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
10:00:01 PM eth1 0.00 0.00 0.00 0.00 0.00 0.00 2152346090.50
10:10:01 PM lo 0.01 0.01 0.00 0.00 0.00 0.00 0.00
10:10:01 PM eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
10:10:01 PM eth1 0.00 0.00 0.00 0.00 0.00 0.00 2142038999.87
10:20:01 PM lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00
10:20:01 PM eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
10:20:01 PM eth1 0.00 0.00 0.00 0.00 0.00 0.00 2153457524.69
10:30:01 PM lo 0.01 0.01 0.00 0.00 0.00 0.00 0.00
10:30:01 PM eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
10:30:01 PM eth1 0.00 0.00 0.00 0.00 0.00 0.00 2142646569.12
Average: lo 0.03 0.03 0.00 0.00 0.00 0.00 0.00
Average: eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: eth1 91.61 90.43 21.05 59.33 0.00 0.00 87333330.59
10:42:20 PM LINUX RESTART
10:50:01 PM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s
11:00:01 PM lo 0.03 0.03 0.00 0.00 0.00 0.00 0.00
11:00:01 PM eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
11:00:01 PM eth1 31.57 28.14 2.54 30.25 0.00 0.00 0.05
11:10:01 PM lo 0.11 0.11 0.01 0.01 0.00 0.00 0.00
11:10:01 PM eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
The server is running CentOS 6. I'm not sure what else I should be checking.
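For anyone who wants to isolate the suspicious samples, here is a minimal sketch that filters `sar -n DEV` output down to the rows whose last column (rxmcst/s) exceeds a threshold. The function name `flag_mcast` and the default threshold of 1000 are my own assumptions, not anything sar provides, and it assumes rxmcst/s is the last field on each line, as in the output above.

```shell
# flag_mcast: print sar -n DEV rows whose last field (rxmcst/s) is above a limit.
# Hypothetical helper; the default limit of 1000 packets/s is an arbitrary choice.
# Usage: sar -n DEV -f sa29 | flag_mcast 1000
flag_mcast() {
  awk -v limit="${1:-1000}" '
    # Skip the "Average:" summary rows, and skip header or restart lines,
    # whose last field is not a plain number.
    $1 != "Average:" && $NF ~ /^[0-9]+(\.[0-9]+)?$/ && $NF + 0 > limit { print }
  '
}
```

Run against the log above, this should leave only the eth1 rows from 09:40:01 PM through 10:30:01 PM, which makes it easier to line the start of the anomaly up against /var/log/messages.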