9

I'm trying to diagnose a bizarre server crash: the server still responds to pings but won't accept SSH connections, CPU usage sits at 0%, and only a reboot gets everything back to normal. I'd like my Apache access logs (or some other log) to include all of the requests that were in flight right as the crash happened, but unfortunately Apache doesn't log a request until after it completes. That means that if a request is crashing the server, it never finishes and thus never shows up in the logs.

Is there some way to configure Apache to create a log file that gets written to as the requests arrive?

Ben Dilts
  • Take a look at your DNS config as well; a broken or hanging DNS server can cause all kinds of issues with SSH, including not being able to log in. [Who knows what it can do to Apache; make sure hostname lookups are off.] – Sean Kimball Aug 09 '11 at 17:45

2 Answers

15

I think what you need is forensic logging; see this link: http://httpd.apache.org/docs/current/mod/mod_log_forensic.html

snippet:

Forensic Log Format:

Each request is logged two times. The first time is before it's processed further (that is, after receiving the headers). The second log entry is written after the request processing at the same time where normal logging occurs.

In order to identify each request, a unique request ID is assigned. This forensic ID can be cross logged in the normal transfer log using the %{forensic-id}n format string. If you're using mod_unique_id, its generated ID will be used.

The first line logs the forensic ID, the request line and all received headers, separated by pipe characters (|). A sample line looks like the following (all on one line):

+yQtJf8CoAB4AAFNXBIEAAAAA|GET /manual/de/images/down.gif HTTP/1.1|Host:localhost%3a8080|User-Agent:Mozilla/5.0 (X11; U; Linux i686; en-US; rv%3a1.6) Gecko/20040216 Firefox/0.8|Accept:image/png, etc...

The plus character at the beginning indicates that this is the first log line of this request. The second line just contains a minus character and the ID again:

-yQtJf8CoAB4AAFNXBIEAAAAA

The check_forensic script takes as its argument the name of the logfile. It looks for those +/- ID pairs and complains if a request was not completed.
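To illustrate the pairing idea described above, here is a minimal Python sketch that reports requests with a `+` line but no matching `-` line. It is not the `check_forensic` script itself, only an approximation of what it looks for; the function name `unfinished_requests` is made up for this example, and the log format is assumed to be exactly as quoted above.

```python
from typing import Iterable, List


def unfinished_requests(lines: Iterable[str]) -> List[str]:
    """Return the '+' lines whose forensic ID never got a matching '-' line.

    Assumes the format quoted above: '+ID|request line|headers...' when a
    request starts and '-ID' when it finishes.
    """
    pending = {}  # forensic ID -> full '+' line, kept in arrival order
    for raw in lines:
        line = raw.strip()
        if line.startswith("+"):
            # The ID is everything between the '+' and the first pipe.
            forensic_id = line[1:].split("|", 1)[0]
            pending[forensic_id] = line
        elif line.startswith("-"):
            # Request completed: forget it (ignore IDs we never saw start).
            pending.pop(line[1:], None)
    return list(pending.values())
```

After a crash and reboot, passing the open forensic log file to this function (for example the `/var/log/apache2/forensic.log` path mentioned in the comments below) should list the requests that were still being processed when the server went down.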

sandroid
  • This is what I was looking for, but it doesn't look like I can just wipe out this log periodically (rm /var/log/apache2/forensic.log every hour) or forensic logging just breaks until I restart apache. Ideas? – Ben Dilts Aug 12 '11 at 20:00
  • Certainly, if you rm any Apache log, that'll happen. Try using logrotate or whatever equivalent exists in your environment to roll logs periodically (by time or by size). If that's not an option, rather than deleting the file, try moving it and then running touch /var/log/apache2/forensic.log (check permissions though) – sandroid Aug 12 '11 at 21:54
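
Apache keeps its log file open, which is why removing the file breaks logging until a restart. One alternative, offered here as an assumption rather than something this thread confirms, is to copy the log and then truncate it in place, the same idea as logrotate's `copytruncate` option. The sketch below relies on Apache having opened the log in append mode, and any lines written between the copy and the truncate are lost. The function name and the paths are hypothetical.

```python
import os
import shutil


def copy_and_truncate(log_path: str, archive_path: str) -> None:
    """Archive a live log without replacing the file the server has open."""
    # Copy the current contents to the archive location.
    shutil.copyfile(log_path, archive_path)
    # Empty the original in place; the file itself (and the server's open
    # handle to it) stays the same, so logging can continue.
    os.truncate(log_path, 0)
```

Run hourly from cron, for instance as `copy_and_truncate("/var/log/apache2/forensic.log", "/var/log/apache2/forensic.log.1")`, this would keep the live file small without restarting Apache.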
2

Sandroid's answer is right for your question, but you should really consider running a network packet capture as well. If you have the resources, mirroring the port within the switch and using a second machine running Wireshark to record all IP traffic might really help. If something is taking out the server to the point where non-HTTP services aren't responding, then it might be taking out the IP stack before the bad data ever makes it to Apache.

If you can, make sure the machine running the capture has different network hardware and/or drivers than the server. It would suck to crash two for the price of one. ;-)

Mark