I have an Ubuntu 10.04 LTS box set up as a Chef server. This was all working fine until the first time the box was rebooted, after which the following three (possibly unrelated) things happened:
- chef-client attempted to install updates via apt, which failed
- The Chef webui stopped working (connection refused/timeout)
- CouchDB and the xulrunner library it depends on stopped responding to commands. Running
service couchdb stop/start/status
or
xulrunner -v
simply hangs: nothing is output or added to any logs
I believe the update problem was caused by this bug: https://bugs.launchpad.net/ubuntu/+source/xulrunner-1.9.2/+bug/680570, where updating xulrunner causes a hang. I was able to get around this by restoring the box from an earlier backup (which we'll call backup A), stopping all the Chef processes and CouchDB, installing xulrunner-dev, installing all remaining updates, and then starting everything up again. At this point Chef and Couch both appeared to be working fine. I took a backup of the box in this 'working' state, which we'll call backup B.
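For reference, the workaround order described above can be sketched roughly as follows. This is only an outline under assumptions: the Chef service names in CHEF_SERVICES are hypothetical and depend on which Chef packages are installed, and "apt-get upgrade" stands in for "installing all remaining updates". The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of the workaround order; prints commands by default (DRY_RUN=1).
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical list of Chef services; adjust to what is actually installed.
CHEF_SERVICES="chef-client chef-server-webui chef-server chef-solr"

# run CMD...: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

workaround() {
    # 1. Stop the Chef processes, then CouchDB.
    for svc in $CHEF_SERVICES couchdb; do
        run service "$svc" stop
    done
    # 2. Install xulrunner-dev, then the remaining updates (assumed: apt-get upgrade).
    run apt-get install -y xulrunner-dev
    run apt-get upgrade -y
    # 3. Start CouchDB first, then the Chef processes.
    for svc in couchdb $CHEF_SERVICES; do
        run service "$svc" start
    done
}

workaround
```

Running it with DRY_RUN=0 as root would execute the same sequence for real.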
However, although the box appeared to be working, attempting to run status/restart/stop via service couchdb caused a hang again, with no output. When I rebooted the box, CouchDB didn't start, and again, service couchdb start
just hangs. I then restored the box from backup B, but when it boots CouchDB does not start, with the same symptoms. Nothing is added to the couchdb log file, or output if I run the command manually.
In its current state I have:
- CouchDB: 0.10.0-1ubuntu2
- xulrunner: 1.9.2.24+build2+nobinonly-0ubuntu0.10.04.1
If I run strace /usr/bin/couchdb
the last few lines of output are:
stat("/var/lib/couchdb", {st_mode=S_IFDIR|0770, st_size=4096, ...}) = 0
stat(".", {st_mode=S_IFDIR|0770, st_size=4096, ...}) = 0
open("/usr/bin/couchdb", O_RDONLY) = 3
fcntl(3, F_DUPFD, 10) = 10
close(3) = 0
fcntl(10, F_SETFD, FD_CLOEXEC) = 0
rt_sigaction(SIGINT, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGINT, {0x408189, ~[RTMIN RT_1], SA_RESTORER, 0x7f2a7ba7caf0}, NULL, 8) = 0
rt_sigaction(SIGQUIT, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGQUIT, {SIG_DFL, ~[RTMIN RT_1], SA_RESTORER, 0x7f2a7ba7caf0}, NULL, 8) = 0
rt_sigaction(SIGTERM, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGTERM, {SIG_DFL, ~[RTMIN RT_1], SA_RESTORER, 0x7f2a7ba7caf0}, NULL, 8) = 0
read(10, "#! /bin/sh -e\n\n# Licensed under "..., 8192) = 8192
pipe([3, 4]) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f2a7c2069d0) = 1463
close(4) = 0
read(3,
...and then it hangs.
If I run strace xulrunner --gre-version
the last few lines of output are:
open("/proc/cpuinfo", O_RDONLY) = 3
mmap(NULL, 16384, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7feeee879000
open("/etc/ld.so.cache", O_RDONLY) = 4
fstat(4, {st_mode=S_IFREG|0644, st_size=33168, ...}) = 0
mmap(NULL, 33168, PROT_READ, MAP_PRIVATE, 4, 0) = 0x7feeee84b000
munmap(0x7feeee84b000, 33168) = 0
close(4) = 0
futex(0x7feeec0760ec, FUTEX_WAKE_PRIVATE, 2147483647) = 0
futex(0x7feeeea980a0, FUTEX_WAIT_PRIVATE, 2, NULL
...and then it hangs.
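In the couchdb trace the parent is blocked reading from a pipe after clone() created child 1463, so the plain strace runs above do not show what the child itself is waiting on. A possible next step, sketched here with hypothetical helper names (capture_trace, last_call_per_pid) and an arbitrary 30 second limit, is to trace with -f so children are followed, then print the last recorded call for each pid:

```shell
#!/bin/sh
# capture_trace OUTFILE CMD...: trace CMD and its children for at most 30s.
# -f follows forked children, -tt adds timestamps, -o writes to OUTFILE
# (with -f and -o, every line is prefixed with the pid that made the call).
capture_trace() {
    out="$1"
    shift
    timeout 30 strace -f -tt -o "$out" "$@"
}

# last_call_per_pid TRACEFILE: print the last line recorded for each pid,
# i.e. the call each process was in when the trace stopped.
last_call_per_pid() {
    awk '{ last[$1] = $0 } END { for (pid in last) print last[pid] }' "$1"
}
```

Usage would be along the lines of: capture_trace /tmp/couchdb.trace /usr/bin/couchdb, followed by last_call_per_pid /tmp/couchdb.trace. The output path is just an example.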
I have also tried:
- Setting up an ldconfig file as described here: http://wiki.apache.org/couchdb/Installing_on_Ubuntu
- Adding the backports repos and attempting to install the later version of CouchDB (fails as the update process tries to restart couchdb, which hangs)
- Restoring from backup A, preventing xulrunner from updating by putting a 'hold' on the package
- Reinstalling xulrunner via apt (fails because the reinstall process hangs)
- Changing the couch config files to increase log level to 'debug' - still no output
- Ensuring all the permissions and ownerships for all of the couch directories are set appropriately
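For the 'hold' step in the list above, this is roughly what was done, as a sketch. The package name xulrunner-1.9.2 is an assumption taken from the source package named in the bug URL, and selection_line is a hypothetical helper; the block itself only prints the selection line, and the commands that change dpkg state are left as comments.

```shell
#!/bin/sh
# Assumed package name (from the bug URL); override with PKG=... if different.
PKG="${PKG:-xulrunner-1.9.2}"

# selection_line PACKAGE STATE: print a line in the "package state" format
# that dpkg --set-selections reads on stdin.
selection_line() {
    printf '%s %s\n' "$1" "$2"
}

# To apply the hold (as root):
#   selection_line "$PKG" hold | dpkg --set-selections
# To check the current state:
#   dpkg --get-selections "$PKG"
# To release the hold again:
#   selection_line "$PKG" install | dpkg --set-selections
selection_line "$PKG" hold
```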
Any help appreciated.