I find it pretty common for a Linux server to slow down to the point of complete unresponsiveness (load average of 150+, etc.). When I look at it later using sar, munin, or similar, the graphs show a sudden, rapid increase in the number of processes. I generally need to reboot the machine at that point, but it always leaves me wondering what caused the problem in the first place.
I'm assuming a rogue process is going into some kind of loop and creating loads of new processes, which then eat up the RAM and cause the lockup. But how, after the event, can I determine which application or process was the offender?
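To make it concrete, this is the kind of record I imagine would answer the question after a reboot: a periodic snapshot of how many processes exist per parent PID and command name. It is only a minimal sketch, assuming Python 3 and the standard Linux /proc/<pid>/stat layout. The function names and the log path are made up for illustration, and a machine that is already locked up might never get to run it.

```python
import os
import time
from collections import Counter


def parse_stat(text):
    """Return (pid, comm, ppid) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left, _, rest = text.partition("(")
    comm, _, tail = rest.rpartition(")")
    fields = tail.split()
    # After comm, fields[0] is the state and fields[1] is the parent PID.
    return int(left), comm, int(fields[1])


def snapshot(proc_root="/proc"):
    """Count live processes per (ppid, comm) pair."""
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            stat_path = os.path.join(proc_root, entry, "stat")
            with open(stat_path, errors="replace") as fh:
                _, comm, ppid = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited while we were reading, or odd content
        counts[(ppid, comm)] += 1
    return counts


def append_snapshot(log_path, top=10, proc_root="/proc"):
    """Append one timestamped line listing the largest (ppid, comm) groups.

    log_path is a hypothetical location chosen by the caller.
    """
    counts = snapshot(proc_root)
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
    groups = " ".join(
        f"{comm}[ppid={ppid}]x{n}" for (ppid, comm), n in counts.most_common(top)
    )
    with open(log_path, "a") as fh:
        fh.write(f"{stamp} total={sum(counts.values())} {groups}\n")
```

Calling append_snapshot("/var/log/proc-snapshot.log") from cron every minute (the path is just an example) would leave a trail in which the last few lines before the lockup show which command and parent were multiplying.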
Thanks