I have a CentOS 5 instance running on Amazon EC2. Normal CPU usage hovers around 10-20%. About four times in the past week, however, CPU usage has suddenly shot up to 100% and stayed pinned there until I rebooted the instance.
I'm sure this is a bug or a misconfiguration with something on the server, but when the instance gets into this state, I can't log in via SSH to do any investigating. Unfortunately, Amazon doesn't provide a way for you to access the instance via a console.
So, I guess my question is -- is there a way to configure the machine such that in any 100% CPU situation, we give priority to SSH to allow root to log in and investigate?
Or at least, is there any easy way to automatically kill any process/processes when this sort of situation occurs?
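To make that second idea concrete, here is a minimal sketch of the kind of watchdog I have in mind, run from cron every minute or so. Everything in it is an assumption for illustration: the thresholds, the protected process list, and the function names are hypothetical, and it assumes a Python 2.7+ interpreter is available. Note that `ps` reports `pcpu` as an average over the process lifetime, so this is only a rough signal, and by default the script just reports rather than kills.

```python
#!/usr/bin/env python
"""Watchdog sketch: when the load average is too high, report (or kill) the busiest processes."""
import os
import signal
import subprocess

LOAD_LIMIT = 8.0               # assumption: 8 cores, so a 1-minute load above 8 means saturated
CPU_LIMIT = 90.0               # assumption: %CPU at which a process counts as runaway
PROTECTED = ("sshd", "init")   # assumption: commands that must never be killed


def parse_ps(output):
    """Parse `ps -eo pid,pcpu,comm` output into (pid, cpu, command) tuples."""
    rows = []
    for line in output.splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)
        if len(parts) == 3:
            rows.append((int(parts[0]), float(parts[1]), parts[2].strip()))
    return rows


def pick_offenders(rows, cpu_limit=CPU_LIMIT, protected=PROTECTED):
    """Return the rows at or above cpu_limit whose command is not protected."""
    return [r for r in rows if r[1] >= cpu_limit and r[2] not in protected]


def main(kill=False):
    load = os.getloadavg()[0]  # 1-minute load average
    if load < LOAD_LIMIT:
        return
    out = subprocess.check_output(["ps", "-eo", "pid,pcpu,comm"]).decode()
    for pid, cpu, comm in pick_offenders(parse_ps(out)):
        print("load %.2f: pid %d (%s) at %.1f%% CPU" % (load, pid, comm, cpu))
        if kill:
            os.kill(pid, signal.SIGTERM)  # polite termination, not SIGKILL


if __name__ == "__main__":
    main(kill=False)  # report only; pass kill=True to actually send SIGTERM
```

Even in report-only mode, the output would at least tell me which process to blame after the next reboot. I don't know whether a script like this would still get scheduled once the machine is pegged, which is part of what I'm asking.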
By the way, this is a "c1.xlarge" instance on Amazon, which means it has 8 cores.
Also, if it helps, the machine is set up as a web server running Plesk. And don't tell me that Plesk can't be run within EC2, because I've been doing it just fine for months ... until recently. The machine is already running Plesk's version of monit, so I'd rather not set up a second monit.