I have a cron job (a shell script) that greps (ps aux | grep) for a certain process every 2 minutes.
It usually takes less than 5 seconds to run.
But some time back, my server load went up to 2400 and I had to restart the server manually to bring everything back to normal.
When I checked the logs, I found hundreds of instances of the shell script and its grep processes running at the same time. Every 2 minutes cron started a new shell process while the earlier ones were still hung in grep, so the running count and the load kept climbing. I couldn't figure out exactly why grep was taking that long or getting hung.
To avoid the same thing in the future, I am thinking of putting a cap on the number of these shell or grep processes that can be active on the server at any one time, similar to how I can limit the number of PHP sessions in php-fpm.
Is there any way to do this on a Linux server, or is there a better solution to prevent the same thing from happening again?
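One common way to guarantee that cron runs never pile up is to take a lock with flock(1): a new run exits immediately if the previous one is still alive, and timeout(1) kills a hung check so the lock is always released. This is a minimal sketch, not my actual setup; the lock path and the process name "myprocess" are placeholders.

```shell
#!/bin/sh
# Placeholder paths/names -- adjust to your setup.
LOCKFILE=/tmp/check_proc.lock

(
    # flock -n: fail immediately instead of waiting if another
    # instance of this script still holds the lock.
    flock -n 9 || exit 1

    # timeout 60: kill the check if it hangs, well before the
    # next 2-minute cron run. The [m] bracket trick stops grep
    # from matching its own command line.
    timeout 60 sh -c 'ps aux | grep "[m]yprocess"'
) 9>"$LOCKFILE"
```

With this wrapper the worst case is one hung check per minute of timeout, not an unbounded pile-up.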
P.S.: While adding tags, I just came across 'ulimit'. Maybe that is what I am searching for, so I added it as a tag too; I am now reading up on whether it could help.
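For reference, the ulimit knob that caps process count is the per-user process limit (RLIMIT_NPROC). Note that once the cap is hit, new forks simply fail rather than queueing, so it is a blunt safety net rather than a fix for the hang itself. The value below is only an example, not a recommendation:

```shell
# Print the current max-user-processes limit for this shell.
ulimit -u

# Cap this shell and its children at 200 processes. Beyond this,
# fork() fails with EAGAIN instead of letting processes accumulate.
ulimit -u 200
```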