
I have an application that must log each transaction. Every log message is flushed, because we need a record of what occurred leading up to a crash. My colleagues and I were curious how to achieve the performance benefits of buffering while guaranteeing that each log message has left the process.

What we came up with is:

  • make a FIFO that the application can write to, and
  • redirect the contents of that FIFO to a regular file via cat.

That is, what was ordinarily:

app --logfile logfile.txt

is now:

mkfifo logfifo
cat logfifo &> logfile.txt &
app --logfile logfifo

Are there any gotchas to this approach? It worked when we tested it, but we want to make absolutely sure that the messages will find their way to the redirected file even if the original application crashes.

(We don't have the source code to the application, so programming solutions are out of the question. Also, the application won't write to stdout, so piping directly to a different command is out of the question. So syslog is not a possibility.)


Update: I've added a bounty. The accepted answer will not involve logger for the simple reason that logger is not what I've asked about. As the original question states, I am only looking for gotchas in using a FIFO.

chrisaycock
  • There seems to be some immense confusion in a few of the answers so far. As I originally highlighted in my question, I am looking for **gotchas to the approach of using a FIFO as a logging intermediary**. I am not interested in alternative suggestions that don't at least explore the **gotchas** of my original question. – chrisaycock Feb 20 '12 at 17:55
  • If my answer about logging to stdout is not a direction you're interested in, could you perhaps remove "Also, the application won't write to stdout, so piping directly to a different command is out of the question" from the question? – nickgrim Feb 21 '12 at 09:34

6 Answers

6

Note that a FIFO is typically needed in programs where the amount written in can surpass the amount read out.

As such, a FIFO won't work quite as smoothly as you anticipate: it will solve your main problem while introducing another.

There are three possible caveats.

  1. Opening the FIFO for writing blocks indefinitely if nothing has the read end open (see the sketch after this list).
  2. The FIFO has a fixed capacity of 64 KB (the Linux default); once the buffer fills, further writes block until the reader has caught up.
  3. The writer will be killed with SIGPIPE if the reader dies or exits.
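
To make caveats 1 and 3 concrete, here is a minimal interactive sketch you can try (the FIFO path and the fd number are arbitrary):

# caveat 1: opening a FIFO for writing blocks until a reader attaches
mkfifo /tmp/testfifo
echo hello > /tmp/testfifo &   # this job stays blocked in open(2)
sleep 1; jobs                  # still running: no reader yet
cat /tmp/testfifo              # attaching a reader releases the writer

# caveat 3: a writer holding the FIFO open gets SIGPIPE when the
# last reader goes away
cat /tmp/testfifo > /dev/null &
exec 3> /tmp/testfifo          # this open succeeds: a reader exists
kill %1; wait %1 2>/dev/null   # the reader dies
/bin/echo hello >&3            # killed by SIGPIPE ("Broken pipe")
exec 3>&-                      # close the write end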

This means that your stated problem (emulating buffered I/O on top of unbuffered writes) will be resolved: the effective 'limit' on your FIFO becomes the speed of whatever utility is writing the pipe's contents to disk (which presumably will use buffered I/O).

Nevertheless, the writer becomes dependent on your log reader to function. If the reader suddenly stops reading, the writer will block. If the reader suddenly exits (let's say you run out of disk space on your target), the writer will receive SIGPIPE and probably exit.

Another point to mention: if the server panics and the kernel stops responding, you may lose up to 64 KB of data that was still sitting in that buffer.

Another way to fix this would be to write the logs to tmpfs (/dev/shm on Linux) and tail the output to a fixed disk location, as sketched below. The limits on memory allocation there are much less restrictive (not 64 KB; typically 2 GB!), but it might not work for you if the writer has no way to reopen its log file dynamically (you would have to clean the logs out of tmpfs periodically). If the server panics with this method, you could lose a LOT more data.
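
A minimal sketch of that variant (the paths here are placeholders, not a recommendation):

# write to tmpfs, copy to disk; -n +1 starts from the first line,
# -F keeps following even if the file is removed and recreated
app --logfile /dev/shm/app.log &
tail -n +1 -F /dev/shm/app.log >> /var/log/app.log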

Matthew Ife
  • That's some good info regarding `/dev/shm`. We had considered that too, though it had slipped my mind to use `tail -f`. – chrisaycock Feb 23 '12 at 15:52
4
mkfifo logfifo
cat logfifo &> logfile.txt &
app --logfile logfifo

What happens when your cat logfifo process dies, someone kills it by accident, or someone accidentally points it to the wrong location?

My experience is that the app will quickly block and hang. I've tried this with Tomcat, Apache, and a few small home-built applications, and ran into the same problem each time. I never investigated very far, because logger or simple I/O redirection did what I wanted. I usually don't need the logging completeness that you are pursuing. And as you say, you don't want logger.

There is some discussion on this problem at Linux non-blocking fifo (on demand logging).
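
If you do stick with the FIFO, a common mitigation is to respawn the reader whenever it exits. This is only a sketch: it narrows the window during which no reader is attached, but it cannot close that window entirely.

while true; do
    cat logfifo >> logfile.txt   # reopen the FIFO and keep draining it
done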

Stefan Lasiewski
  • Interesting. So the app could block anyway? We're thinking of throwing in the towel and ordering Fusion IO at this point. – chrisaycock Feb 22 '12 at 00:41
  • Give it a try. For me, this result happened quickly (in a test environment on low spec hardware). This was on Ubuntu and RedHat 5. Not sure if FreeBSD, etc. have similar behavior. – Stefan Lasiewski Feb 22 '12 at 01:23
2

The environment is CentOS and the application writes to a file...

Instead of sending to a regular file, I'd send the output to syslog and make sure that syslog messages are being sent to a central server as well as locally.

You should be able to use a command like this:

logger -p daemon.notice -t app < fifo

You could also feed logger from cat or from tail -f through a pipe:

tail -f fifo | logger ...
cat fifo | logger ...

The only problem is that it doesn't differentiate based on the importance of the log message (everything is NOTICE) but at least it gets logged and also sent off-host to a central server.

Configuring syslog depends on which syslog server you are running. Both rsyslog and syslog-ng are very capable.
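
For example, with rsyslog, forwarding everything to a central host is a one-line addition to /etc/rsyslog.conf (the hostname below is a placeholder):

# forward all messages to a central host over TCP (@@); a single @ would use UDP
*.*  @@loghost.example.com:514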

EDIT: Revised after getting more information from poster.

Mei
  • The log messages are written to a file that we can select, but the application won't write to `stdout`. The OS is CentOS (I had tagged the question as *linux*, but I guess that's easy to miss). All messages are of equal importance, so there is no difference between a notice and an error in this application. – chrisaycock Feb 15 '12 at 20:42
  • No difference between a notice and an error? Oh _really_? No difference between `Slow query detected` and `Database halted`? – Mei Feb 15 '12 at 22:36
  • (I updated the post.) – Mei Feb 15 '12 at 22:40
2

Your options are fairly limited by the app, but what you've tested will work.

We do something similar with Varnish and varnishncsa to get the logs somewhere useful to us. We have a FIFO and just read from it with syslog-ng, sending the output wherever we need it. We handle about 50 GB of logs and haven't come across a problem with this approach so far.
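
For reference, a minimal syslog-ng configuration along those lines might look like the following (the names and paths are assumptions, not our production setup):

# read from a FIFO and write to a file
source s_fifo { pipe("/var/run/app.fifo"); };
destination d_file { file("/var/log/app.log"); };
log { source(s_fifo); destination(d_file); };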

goo
1

You write:

Also, the application won't write to stdout...

Did you try logging to the "file" /dev/stdout? That might enable you to do something like:

app --logfile /dev/stdout | logger -t app
nickgrim
  • Is that different from a FIFO? What are the gotchas? I'm not looking for an alternative, *per se*; I'm looking for insight into whether the approach I've listed is robust. That's all I'm interested in. – chrisaycock Feb 20 '12 at 17:46
  • 1
    Yes, it's different to a FIFO; it's essentially a file that's connected to STDOUT. So you can use that to log to STDOUT, and from there to syslog or similar. It's intended as a solution to your original problem, but hopefully more robust than a FIFO due to having fewer moving parts. – nickgrim Feb 21 '12 at 08:56
1

There is no point in writing to a FIFO and then having another process write it to a file. Just write directly to the file normally; once the write() returns, it's in the kernel's hands, and the kernel will still write it to the disk even if the application crashes.

psusi
  • *Every log message is flushed*: these are blocking writes. We don't have any control over that since we don't have the application's source code. – chrisaycock Feb 22 '12 at 00:30
  • @chrisaycock, right... so where's the problem? – psusi Feb 22 '12 at 02:23
  • *My colleagues and I were curious about how to achieve the performance effects of buffering while guaranteeing that the log message had left the process.* – chrisaycock Feb 22 '12 at 03:20
  • @chrisaycock, ohh, so the application currently uses O_SYNC or `fsync()` to wait for each write to hit the disk, and you DON'T want that to happen (risking loss in the event of a system crash)? – psusi Feb 22 '12 at 03:40
  • `write()` doesn't ensure that the data is on the disk; it merely ensures that the data has been acknowledged by the drive's interface. See [here](http://en.wikipedia.org/wiki/Disk_buffer#Write_acceleration), for example. My goal is to merely to get the data out of the application in case the application crashes, since I can't really guarantee safety in a system crash. Plus, I need to reuse the logs in the future anyway, so `/dev/shm` isn't really appropriate for my needs, though it is something we considered. – chrisaycock Feb 22 '12 at 03:49
  • 1
    @chrisaycock, it has nothing to do with the drive's interface, it normally just means the kernel has copied it to the filesystem cache ( and out of the application ). As I said in my answer originally, once the `write()` has returned, the app can crash all it likes -- the data will make it to the disk as long as the kernel doesn't crash. – psusi Feb 22 '12 at 03:57
  • @psusi - I was just about to post this exact same answer when I looked through the comments thinking, "someone must have explained this already..." – Kyle Smith Feb 22 '12 at 14:26