I wrote this code in C:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>   /* gettimeofday */
#include <unistd.h>     /* getpid */

/* Seed rand() from the current time, including the microseconds. */
void random_seed(void){
    struct timeval tim;
    gettimeofday(&tim, NULL);
    srand((unsigned)tim.tv_sec ^ (unsigned)tim.tv_usec);
}

int main(void){
    FILE *f;
    int i;
    int size=100;
    char *buf=malloc(size);

    f = fopen("output.txt", "a");
    if (buf == NULL || f == NULL)
        return 1;
    setvbuf (f, buf, _IOFBF, size);
    random_seed();

    for(i=0; i<200; i++){
          fprintf(f, "[ xx - %d - 012345678901234567890123456789 - %d]\n", rand()%10, (int)getpid());
          fflush(f);
    }

    fclose(f);
    free(buf);
    return 0;
}

This code opens a file in append mode and appends a string to it 200 times. I set the buffer size to 100, which is large enough to hold the full string. Then I ran multiple processes executing this code with this bash script:

#!/bin/bash

gcc source.c
rm -f output.txt

for i in $(seq 1 100);
do
    ./a.out &
done
wait

I expected the strings in the output never to be mixed up, since I read that when a file is opened with the O_APPEND flag the file offset is set to the end of the file prior to each write, and I'm using a fully buffered stream. But the first line of each process is mixed up like this:

[ xx - [ xx - 7 - 012345678901234567890123456789 - 22545]

and some lines later

2 - 012345678901234567890123456789 - 22589]

It looks like the write is interrupted by the call to the rand function.

So... why do these lines appear? Is using file locks the only way to prevent this, even though I'm only using append mode?

Thanks in advance!

2 Answers


You will need to implement some form of concurrency control yourself: POSIX makes no guarantees with respect to concurrent writes from multiple processes. You get some guarantees for pipes, but not for regular files written to from different processes.

Quoting POSIX write():

This volume of POSIX.1-2008 does not specify behavior of concurrent writes to a file from multiple processes. Applications should use some form of concurrency control.

(At the end of the Rationale section.)
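One possible form of that concurrency control is an advisory lock taken around each write. The following is only a minimal sketch under some assumptions: `write_locked` is a hypothetical helper name, the descriptor is assumed to have been opened with `O_APPEND`, every cooperating process is assumed to use the same helper (advisory locks do nothing against writers that ignore them), and `EINTR` handling is left out for brevity.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: write `len` bytes to `fd` while holding an
 * exclusive advisory lock on the whole file.
 * Returns 0 on success, -1 on failure. */
int write_locked(int fd, const char *data, size_t len)
{
    assert(data != NULL);

    struct flock lk = {0};
    lk.l_type = F_WRLCK;     /* exclusive (write) lock */
    lk.l_whence = SEEK_SET;
    lk.l_start = 0;
    lk.l_len = 0;            /* 0 means "to end of file", i.e. the whole file */

    /* F_SETLKW blocks until the lock can be acquired. */
    if (fcntl(fd, F_SETLKW, &lk) == -1)
        return -1;

    int rc = 0;
    while (len > 0) {        /* retry on short writes */
        ssize_t n = write(fd, data, len);
        if (n == -1) {
            rc = -1;
            break;
        }
        data += n;
        len -= (size_t)n;
    }

    lk.l_type = F_UNLCK;     /* release the lock */
    if (fcntl(fd, F_SETLK, &lk) == -1)
        rc = -1;
    return rc;
}
```

With something like this, each process would format its line into a local buffer (for example with `snprintf`) and pass it to `write_locked(fd, line, strlen(line))` instead of calling `fprintf` on a shared stdio stream.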

Mat
  • Thanks for the link! The same page says that "If the O_APPEND flag of the file status flags is set, the file offset shall be set to the end of the file prior to each write and no intervening file modification operation shall occur between changing the file offset and the write operation." – Kevin Peron Jan 22 '12 at 16:31
  • I have just seen that when I set the buffer size to a value greater than 128 it works as I expected... when I use less than 128 it looks like setvbuf gives an incorrect result – Kevin Peron Jan 22 '12 at 16:33
  • You should read that as `no intervening file modification operation shall occur between changing the file offset and the write operation` **from within that process**. There is/can be locking within the process to prevent multiple threads from writing at the same time. There is no such guarantee between processes. Another process could do things between the file offset change and the write. – Mat Jan 22 '12 at 16:33
  • Re your second comment: it does not work as you expect. It _appears to work that time you started it_. There is no guarantee it works with any buffer size. (But it should work pretty often if you disable buffering altogether.) – Mat Jan 22 '12 at 16:35
  • #include #include int main(void){ int size=128; char* buf=malloc(size); setvbuf (stdout, buf, _IOFBF, size); printf("are 5 seconds past?\n"); sleep(5); } – Kevin Peron Jan 22 '12 at 16:40
  • You are right about guarantees. Anyway, setvbuf on my system does not work correctly when the buffer size is less than 128 bytes (look at the example). – Kevin Peron Jan 22 '12 at 16:42

You put the stream in fully buffered mode. That means that every line of output first goes into the buffer, and when the buffer overflows it gets flushed to the file regardless of whether it contains incomplete lines. That causes chunks of output from different processes writing to the same file concurrently to be interleaved.

An easy fix would be to put the stream in line-buffered mode, _IOLBF, so that the buffer gets flushed on each complete line. Just make sure that the buffer size is at least as big as your longest line, otherwise it will end up writing incomplete lines. The buffer is normally flushed with a single write() system call, so that lines from different processes won't interleave with each other.
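As a rough sketch of that suggestion: the helper below opens a file for appending and switches the stream to line-buffered mode before any output is written. The name `open_line_buffered` is made up for illustration, and the caller is assumed to supply a buffer that is larger than the longest line and that stays valid until `fclose`. Whether this actually prevents interleaving depends on the C library and the buffer size; one commenter below reports that it did not help with a 100-byte buffer.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: open `path` for appending with a line-buffered
 * stream backed by the caller's buffer. `buf` must outlive the stream.
 * Returns NULL on failure. */
FILE *open_line_buffered(const char *path, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);

    FILE *f = fopen(path, "a");
    if (f == NULL)
        return NULL;

    /* setvbuf must be called before any other operation on the stream. */
    if (setvbuf(f, buf, _IOLBF, size) != 0) {
        fclose(f);
        return NULL;
    }
    return f;
}
```

In the program from the question this would replace the `fopen`/`setvbuf` pair, for example `f = open_line_buffered("output.txt", buf, size)` with `size` comfortably above the roughly 60-byte line length.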

There is no guarantee that the write() system call is atomic on every filesystem, though. It normally works as expected because write() normally locks the file descriptor in the kernel with a mutex before proceeding.
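To take stdio buffering out of the picture entirely, one option is to format each record into a local buffer and hand it to the kernel with exactly one write() call on a descriptor opened with O_APPEND. This is only a sketch: `append_record` and the 256-byte limit are illustrative assumptions, and as noted above it relies on the usual kernel behaviour rather than on a POSIX guarantee.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: format one record and pass it to the kernel in a
 * single write() call. `fd` is assumed to be opened with O_APPEND.
 * Returns 0 on success, -1 on error, truncation, or a short write. */
int append_record(int fd, const char *fmt, ...)
{
    char line[256];          /* assumed upper bound on record length */
    va_list ap;

    assert(fmt != NULL);

    va_start(ap, fmt);
    int n = vsnprintf(line, sizeof line, fmt, ap);
    va_end(ap);

    if (n < 0 || (size_t)n >= sizeof line)
        return -1;           /* formatting error or record too long */

    /* One system call per record; a short write is reported as failure
     * rather than retried, since a retry would split the record. */
    ssize_t w = write(fd, line, (size_t)n);
    return (w == (ssize_t)n) ? 0 : -1;
}
```

The loop from the question would then open the file once with `open("output.txt", O_WRONLY | O_APPEND | O_CREAT, 0644)` and call `append_record(fd, "[ xx - %d - ... - %d]\n", rand() % 10, (int)getpid())` for each line.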

Maxim Egorushkin
  • Luckily, there is `_IOLBF` that makes `fflush()` unnecessary. The fewer lines in code, the harder for an error to hide. – Maxim Egorushkin Jan 22 '12 at 20:16
  • `_IOLBF` doesn't fix it for me, only `_IONBF` /* No buffering. */ does. (note: using the same program as OP with 100 bytes buffer) –  May 20 '19 at 11:35
  • @howaboutsynergy You may like to post another question with your code. – Maxim Egorushkin May 20 '19 at 11:37
  • @MaximEgorushkin the code is already available in OP(ie. in the question), I simply replaced `_IOFBF` with `_IOLBF` as per your answer: `An easy fix would be to open the file in line buffered mode _IOLBF, so that the buffer gets flushed on each complete line. Just make sure that the buffer size is at least as big as your longest line, otherwise it will end up writing incomplete lines.`. I was merely pointing out that this is, apparently, not true, for me. I notice `libio/fileops.c` has a comment stating: `We used to flush all line-buffered stream. This really isn't required by any standard.` –  May 20 '19 at 12:56
  • I get no interleaves with any of the 3 modes if buffer size >=[128](https://stackoverflow.com/questions/8960611/c-multi-processes-stdio-append-mode/8961581?noredirect=1#comment11223841_8960709). I'll look into why, but at first sight `libio/fileops.c` shows only one occurrence of `128` for writes, but it'll take time for me to understand the logic. `/* Try to maintain alignment: write a whole number of blocks. */` –  May 20 '19 at 14:19
  • @howaboutsynergy That ancient comment by Drepper refers to flushing all of the streams vs one stream. You shouldn't cut quotes out of context because the meaning gets lost. Broken line-buffering would certainly break many applications. – Maxim Egorushkin May 20 '19 at 14:40