Apache crashing at random intervals. Can not find a reason in log files

Question

We are having an issue with a VPS running Plesk 9.5 on Ubuntu 8.04. At seemingly random intervals Apache disappears and needs to be started manually. I have checked the Apache error log, /var/log/messages, and the individual virtual hosts' Apache error logs, and cannot find anything that coincides with the time of the failure. dmesg is empty, which is a bit odd.

We have also had the psa service go down for no apparent reason while Apache stayed up.

I'm at a loss to diagnose this, because none of the log files I can find point to any issues. Are there any others I can look at?

Memory usage sits at about 55% (out of 400 MB) and it isn't a particularly high-traffic server.

Any pointers as to where else I can find out what is going on would be very much appreciated.

Nick

Update:

I have been running Watchdog for a while now and it restarts processes when they go down. Unfortunately, it is quite often more than Apache that goes down (although sometimes it is just Apache). There seems to be no pattern to it. We also get courier and qmail going down. Anyway, I have raised the logging level for Apache and noticed the following:

[Mon Mar 07 16:46:14 2011] [info] server seems busy, (you may need to increase StartServers, or Min/MaxSpareServers), spawning 8 children, there are 0 idle, and 21 total children
[Mon Mar 07 16:49:56 2011] [info] server seems busy, (you may need to increase StartServers, or Min/MaxSpareServers), spawning 8 children, there are 0 idle, and 12 total children
[Mon Mar 07 16:50:08 2011] [info] server seems busy, (you may need to increase StartServers, or Min/MaxSpareServers), spawning 8 children, there are 0 idle, and 28 total children
[Mon Mar 07 16:50:09 2011] [info] server seems busy, (you may need to increase StartServers, or Min/MaxSpareServers), spawning 16 children, there are 0 idle, and 36 total children
[Mon Mar 07 16:50:14 2011] [info] [client ipaddressofserver] (32)Broken pipe: core_output_filter: writing data to the network
[Mon Mar 07 16:50:14 2011] [info] removed PID file /var/run/apache2.pid (pid=9556)
[Mon Mar 07 16:50:14 2011] [notice] caught SIGWINCH, shutting down gracefully
[Mon Mar 07 16:50:18 2011] [emerg] (22)Invalid argument: mod_fcgid: can't get lock, pid: 9557
[Mon Mar 07 16:50:24 2011] [info] Init: Seeding PRNG with 0 bytes of entropy
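An excerpt like the one above can be condensed with a short script that pulls out the child-process counts and any "caught SIG..." entries, which makes it easier to line the shutdown up with other events. This is only a rough sketch: the function name `parse_error_log` and the sample constant are made up for illustration, and the regular expressions assume the Apache 2.2 error log format shown here. For what it's worth, Apache httpd uses SIGWINCH as its graceful-stop signal, so that entry suggests something asked Apache to stop rather than Apache crashing by itself.

```python
import re
from datetime import datetime

# Assumed layout: "[timestamp] [level] message", as in the excerpt above.
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] \[(?P<level>\w+)\] (?P<msg>.*)$")
BUSY_RE = re.compile(
    r"spawning (?P<spawn>\d+) children, there are (?P<idle>\d+) idle, "
    r"and (?P<total>\d+) total children"
)
TS_FORMAT = "%a %b %d %H:%M:%S %Y"

# A few lines copied from the log excerpt, used as sample input.
SAMPLE_LOG = """\
[Mon Mar 07 16:46:14 2011] [info] server seems busy, (you may need to increase StartServers, or Min/MaxSpareServers), spawning 8 children, there are 0 idle, and 21 total children
[Mon Mar 07 16:50:09 2011] [info] server seems busy, (you may need to increase StartServers, or Min/MaxSpareServers), spawning 16 children, there are 0 idle, and 36 total children
[Mon Mar 07 16:50:14 2011] [info] removed PID file /var/run/apache2.pid (pid=9556)
[Mon Mar 07 16:50:14 2011] [notice] caught SIGWINCH, shutting down gracefully
"""


def parse_error_log(lines):
    """Return (busy, signals) extracted from Apache error log lines.

    busy    -- list of (timestamp, total_children) from "server seems busy" entries
    signals -- list of (timestamp, message) for entries mentioning a caught signal
    """
    busy, signals = [], []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the assumed layout
        ts = datetime.strptime(match.group("ts"), TS_FORMAT)
        msg = match.group("msg")
        busy_match = BUSY_RE.search(msg)
        if busy_match:
            busy.append((ts, int(busy_match.group("total"))))
        elif "caught SIG" in msg:
            signals.append((ts, msg))
    return busy, signals


if __name__ == "__main__":
    busy, signals = parse_error_log(SAMPLE_LOG.splitlines())
    for ts, total in busy:
        print(ts.isoformat(), "total children:", total)
    for ts, msg in signals:
        print(ts.isoformat(), "signal:", msg)
```

To run it against a real log, pass an open file instead of the sample, for example `parse_error_log(open(path))`, where `path` is wherever the error log lives on the server.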

I have already been increasing Min/MaxSpareServers, but slowly, while keeping an eye on memory usage. Surely that can't be causing Apache, courier and qmail to fail?

Any help on the log entries and what they indicate would be appreciated.

Cheers Nick

Have you checked your kernel message buffer for any exceptions? `dmesg` for most Linux OS's. — Jeff Stice-Hall, Feb 18 '11 at 04:48
Oddly it is empty. Ubuntu does use dmesg though. Any ideas why it might be empty? — Nick Downton, Feb 18 '11 at 17:35

score 3 · Answer 1 · answered Feb 17 '11 at 21:12

I suggest you log on to the server via a terminal, run `screen`, and run httpd (packaged as apache2 on Ubuntu) in foreground mode. If and when it crashes, there should be some kind of clue in the console messages.

Of course, read `man screen` and `man httpd` first for best results.

HTH

ztron

score 1 · Answer 2 · answered Feb 18 '11 at 03:20

You can also try raising `LogLevel` to capture more detail in the error log. If you don't have much traffic, you can raise it straight to `debug`.

uesp

score 1 · Answer 3 · answered Feb 18 '11 at 14:12

Maybe your VPS hit a resource limit at peak load? If you are running under Virtuozzo with UBC limits, you can check the /proc/user_beancounters file: the failcnt column should not have any non-zero values. Also, you can set up the Watchdog system monitoring module in Plesk to automatically restart services that go down.
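The failcnt check can be scripted so it is easy to repeat after each outage. The following is a minimal sketch, assuming the usual column layout of /proc/user_beancounters (resource, held, maxheld, barrier, limit, failcnt, with the first row of a container prefixed by its uid). The function name `failed_resources` is hypothetical and the sample numbers are invented purely for illustration.

```python
# Invented sample in the assumed /proc/user_beancounters layout.
SAMPLE_BEANCOUNTERS = """\
Version: 2.5
       uid  resource           held    maxheld    barrier      limit    failcnt
      101:  kmemsize        2845621    3104221   14372700   14790164          0
            privvmpages       51234      98304      98304     104857         12
            numproc              38         61        240        240          0
"""


def failed_resources(text):
    """Return {resource: failcnt} for every resource with a non-zero failcnt."""
    failures = {}
    for line in text.splitlines():
        fields = line.split()
        # The first resource row of a container starts with "<uid>:"; drop it.
        if fields and fields[0].endswith(":"):
            fields = fields[1:]
        # Data rows have six fields: a name followed by five integers.
        if len(fields) != 6 or not all(f.isdigit() for f in fields[1:]):
            continue
        failcnt = int(fields[5])
        if failcnt:
            failures[fields[0]] = failcnt
    return failures


if __name__ == "__main__":
    # On a real VPS, read the file instead (this typically requires root):
    #   text = open("/proc/user_beancounters").read()
    for name, count in failed_resources(SAMPLE_BEANCOUNTERS).items():
        print(name, "failcnt =", count)
```

Since failcnt only ever increases, saving the output after each check and comparing it with the previous run shows whether a limit was hit around the time services went down. Inside a single VPS there is only one uid, so keying the result by resource name is enough.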

Michael Zarubin

The failcnt column looks OK (all zero). I will have a play with Watchdog. Cheers – Nick Downton Feb 18 '11 at 17:33
I am having the same problem but Watchdog saves me. It checks the process state every 5 minutes; in case Apache is down it starts it over. However, I'm still stuck with why the error happens over and over... – Danijel Jul 01 '14 at 09:28
