We are currently using the latest daemontools (http://cr.yp.to/daemontools.html) to manage our background application servers on Linux (Amazon Linux). The application servers run in JVMs:
[ec2-user@ip-10-0-1-220 local]$ java -version
java version "1.7.0_75"
OpenJDK Runtime Environment (amzn-2.5.4.0.53.amzn1-x86_64 u75-b13)
OpenJDK 64-Bit Server VM (build 24.75-b04, mixed mode)
Everything works as expected until we restart the server:
sudo shutdown -r now
When the server restarts, the configured daemontools services start and run fine for roughly 10-20 minutes. After that, however, threads within the application servers begin to hang until the entire process is frozen. The only fix we have found so far is to recreate the service directory under /service/...
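For illustration, recreating a service directory could look roughly like the sketch below. It assumes the entry under /service is a symlink to a real directory elsewhere. The paths /service/myapp and /var/services/myapp are hypothetical placeholders, not our actual layout, and the DRY_RUN switch and helper function names are made up for this example. By default it only prints the commands it would run.

```shell
# Sketch only. SERVICE_LINK and SERVICE_SRC are hypothetical placeholder paths.
# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
SERVICE_LINK="${SERVICE_LINK:-/service/myapp}"
SERVICE_SRC="${SERVICE_SRC:-/var/services/myapp}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recreate_service() {
    # Show the current state as seen by supervise.
    run svstat "$SERVICE_LINK"
    # Remove the symlink so svscan no longer restarts supervise for it.
    run rm "$SERVICE_LINK" &&
    # Bring the service down and tell its supervise process to exit.
    # (A log/ sub-service, if present, would need the same treatment.)
    run svc -dx "$SERVICE_SRC" &&
    # Discard the old supervise state; supervise may need a moment to exit first.
    run rm -rf "$SERVICE_SRC/supervise" &&
    # Re-link the service; svscan should pick it up within a few seconds.
    run ln -s "$SERVICE_SRC" "$SERVICE_LINK"
}
```

Calling recreate_service prints the five commands in order. Setting DRY_RUN=0 would actually execute them, which requires root and a running svscan.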
The symptoms appear to indicate corrupted data in the /service/.../supervise/ directory. We have not found any previous discussion of this issue.
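One rough way to test the corruption theory is to check whether the supervise directory still contains the entries supervise normally maintains. The sketch below assumes the usual control, lock, ok and status entries. The function name is hypothetical, and the check only detects missing entries, not bad contents.

```shell
# Sketch only: report entries that supervise normally keeps but that are absent.
# Assumption: a healthy supervise directory holds control, lock, ok and status.
check_supervise_dir() {
    dir="$1"
    missing=0
    for name in control lock ok status; do
        if [ ! -e "$dir/$name" ]; then
            echo "missing: $dir/$name"
            missing=1
        fi
    done
    # Exit status 0 means all four entries exist, 1 means at least one is absent.
    return "$missing"
}
```

It would be called with the path of a supervise directory as its only argument, ideally once right after boot and again once the threads have started to hang, so the two results can be compared.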
Any suggestions or advice on how we can restart our servers without this problem would be greatly appreciated.