I have noticed that on some of my Linux servers a service will occasionally hang. The only ways I can tell it is hung are that operations relying on the service fail, and that when I restart the service the stop step fails, although the start succeeds.
If I run service <servicename> status
it reports that the service is running. If I run ps -ef | grep <servicename>
it shows only one process for that service, which is correct.
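For clarity, the two checks I run amount to roughly the following. This is only a sketch: the function names count_procs and check_service are made up for illustration, and the service name is passed as the first argument.

```shell
# Hypothetical helper names; the service name is passed as the first argument.

# Count lines of ps output (read from stdin) that mention the service,
# ignoring the grep command itself.
count_procs() {
    grep -- "$1" | grep -vc 'grep'
}

# Run both checks for one service and print what they report.
check_service() {
    svc="$1"
    service "$svc" status
    echo "matching processes: $(ps -ef | count_procs "$svc")"
}
```

Both checks only show that a process exists, not that it is doing any work, which is why they still look healthy when the service is hung. The count can also include the calling script if the service name appears on its own command line.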
Is there anything else I can check to tell whether a service is hung? I am trying to be proactive about bringing these services back up and about determining why they hang.
For reference, the services are mostly openstack-nova-compute and openstack-cinder-volume. I can detect a hung cinder-volume service because messages start to build up in RabbitMQ, but the same does not happen for nova-compute.
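The RabbitMQ check for cinder-volume can be sketched as a filter over the output of rabbitmqctl list_queues name messages. The function name backlog_over and the threshold of 10 are assumptions for illustration; the actual queue names and a sensible threshold depend on the deployment.

```shell
# Print "queue depth" for every queue whose message count exceeds the
# threshold given as the first argument. Reads the output of
# `rabbitmqctl list_queues name messages` on stdin; banner and header
# lines are skipped because their last field is not a number.
backlog_over() {
    awk -v max="$1" '$NF ~ /^[0-9]+$/ && $NF + 0 > max + 0 { print $1, $NF }'
}
```

Run from cron as rabbitmqctl list_queues name messages | backlog_over 10, any output would mean some queue is backing up. This catches the cinder-volume case only, since nova-compute shows no such buildup.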
This is very hard to test because, as I said, the only way I find out is when I try to do something on that node in OpenStack and it fails or hangs, and then I restart the service. I have a script running that tests some OpenStack services, but the nova scheduler might take a while to place an instance on that host, or the host may be full, in which case it will never place another instance there.