10

Is it all right for multiple processes to write to the same file at the same time? With the following code it seems to work, but I have my doubts.

The use case in this instance is an executable that is invoked every time an email is received and logs its output to a central file.

if (freopen(console_logfile, "a+", stdout) == NULL || freopen(error_logfile, "a+", stderr) == NULL) {
    perror("freopen");
}
printf("Hello World!");

This is running on CentOS and compiled as C.

David Beck
  • 10,099
  • 5
  • 51
  • 88
  • Possible (better) duplicate of [fopen two processes](http://stackoverflow.com/questions/1842909/fopen-two-processes). – blahdiblah Mar 26 '12 at 22:45
  • See also [Can multiple processes append to a file using fopen without any concurrency problems?](http://stackoverflow.com/questions/7552451/can-multiple-processes-append-to-a-file-using-fopen-without-any-concurrency-prob). – blahdiblah Mar 26 '12 at 22:47
  • 2
    I don't know the context of your logs' usage, but I would recommend to take a look at `syslog`. Maybe it suits you. Working with it is really simple. http://www.gnu.org/software/libc/manual/html_node/Submitting-Syslog-Messages.html – Stanislav Yaglo Mar 26 '12 at 22:57

2 Answers

8

Using the C standard IO facility introduces a new layer of complexity on top of the write(2) family of system calls (or memory mappings, but those aren't used in this case): the C standard IO wrappers may postpone writing to the file for a while, and may not submit a complete request in one system call.

The write(2) call itself should behave well:

   [...] If the file was
   open(2)ed with O_APPEND, the file offset is first set to the
   end of the file before writing.  The adjustment of the file
   offset and the write operation are performed as an atomic
   step.

   POSIX requires that a read(2) which can be proved to occur
   after a write() has returned returns the new data.  Note that
   not all file systems are POSIX conforming.

Thus your underlying write(2) calls will behave properly.

For the higher-level C standard IO streams, you'll also need to take care of the buffering. The setvbuf(3) function can be used to request unbuffered output, line-buffered output, or block-buffered output. The default behavior varies from stream to stream: if standard output and standard error are attached to a terminal, they are line-buffered and unbuffered by default, respectively. Otherwise, block-buffering is the default.

You might wish to manually select line-buffering if your data is naturally line-oriented, to prevent interleaved data. If your data is not line-oriented, you might wish to use unbuffered output, or leave it block-buffered but manually flush the data whenever you've accumulated a single "unit" of output.

If you are writing more than BUFSIZ bytes at a time, your writes might become interleaved. The setvbuf(3) function can help prevent the interleaving.

It might be premature to talk about performance, but line-buffering is going to be slower than block buffering. If you're logging near the speed of the disk, you might wish to take another approach entirely to ensure your writes aren't interleaved.

sarnold
  • 102,305
  • 22
  • 181
  • 238
  • Great tip about `setvbuf()` and its variants `setbuf()`, `setbuffer()`, and `setlinebuf()`. They were just what I needed. Thanks, @sarnold. – Randall Cook Mar 29 '12 at 17:15
  • Thanks for pointing out that O_APPEND is needed to guarantee atomic adjustment of the file offset. I omitted it from the first process that open the file, because it also creates it (so 'append' didn't seem appropriate..) – RobM Oct 26 '12 at 20:12
1

Edit: this answer was incorrect; appending does work. My original reasoning follows.

So the race condition would be:

  1. process 1 opens it for append, then
  2. later process 2 opens it for append, then
  3. later still 1 writes and closes, then
  4. finally 2 writes and closes.

I'd be impressed if that 'worked', because it isn't clear to me what 'working' should mean. I assume 'working' means all of the bytes written by the two processes end up in the log file? I'd expect that they both write starting at the same byte offset, so one will overwrite the other's bytes. It will all be okay up to and including step 3, and only show up as a problem at step 4. Seems like an easy test to write: open, getchar, ..., write, close.

Is it critical that they can have the file open simultaneously? A more obvious solution if the write is quick, is to open exclusive.

For a quick check on your system, try:

/* write the first command line argument to a file called foo
 * stackoverflow topic 9880935
 */

#include <stdio.h>
#include <stdlib.h>

int main (int argc, const char * argv[]) {
    if (argc < 2) {
        fprintf(stderr, "Error: need some text to write to the file foo\n");
        exit(1);
    }

    FILE* fp = freopen("foo", "a+", stdout);

    if (fp == NULL) {
        perror("Error: failed to open file");
        exit(1);
    }

    fprintf(stderr, "Press a key to continue\n");
    (void) getchar();       /* Yes, I really mean to ignore the character */

    if (printf("%s\n", argv[1]) < 0) {
        perror("Error: failed to write to file");
        exit(1);
    }

    fclose(fp);

    return 0;
}
gbulmer
  • 4,210
  • 18
  • 20
  • They don't overwrite each other, from `man freopen`: `a+ ... Subsequent writes to the file will always end up at the then current end of file`. – blahdiblah Mar 26 '12 at 22:42
  • @blahdiblah - maybe I am missing something, but how can they **not** overwrite in my example? Both processes open for append, but neither writes any bytes at that stage, and so the file is the same length for both opens. Then they both write. Isn't the file offset an attribute of the fd, and not the file? – gbulmer Mar 26 '12 at 22:49
  • I report only the information in the man page, and results of my own testing. I can't speak to the underlying implementation details. – blahdiblah Mar 26 '12 at 22:53