
I'd like to know whether writes to a single file are done atomically, such that write("bla bla") and a subsequent write("herp derp") to the same file never result in interleaved output, e.g. "bla herp bla derp". Assuming these writes happen in different processes or threads, what governs which gets done first?

Also, does a read() always return data reflecting a state of the file in which all previous writes have fully completed (whether or not the data has actually reached the disk)? For example, after write("herp derp"), will all subsequent reads always reflect the full data written to the file, or will a subsequent read sometimes reflect only "herp" but not "derp" (or sometimes none of the data at all)? What if the reads and writes occur in different processes/threads?

I'm not interested in concurrent file access strategies. I just want to know what read and write do exactly.
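
To make the scenario concrete, here is a minimal sketch of the situation I have in mind (the file name is made up, and the calls are plain POSIX open()/write()/fork()):

```c
/* Minimal sketch of the scenario in question: two processes writing to
 * the same open file.  The file name is hypothetical. */
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fd = open("out.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return 1;

    if (fork() == 0) {
        const char *msg = "bla bla";     /* writer A (child) */
        write(fd, msg, strlen(msg));
    } else {
        const char *msg = "herp derp";   /* writer B (parent) */
        write(fd, msg, strlen(msg));
    }
    close(fd);
    return 0;
}
```

Here the two writers share one file description (and thus one file offset) because the descriptor was inherited across fork(); the same question applies if each process open()s the file separately.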

Jegschemesch
  • Your question is hilarious and exactly what I was going to ask. lol... – Anthony Dec 23 '13 at 14:35
  • OP posted a follow-up question [Does each Unix file description have its own read/write buffers?](http://stackoverflow.com/q/5201543/95735) – Piotr Dobrogost Dec 05 '14 at 12:47
  • You might be interested in the Linux kernel [thread](http://thread.gmane.org/gmane.linux.kernel/1649458) titled *Update of file offset on write() etc. is non-atomic with I/O* which led to this commit – http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=9c225f2655e36a470c4f58dbbc99244c5fc7f2d4 – Piotr Dobrogost Dec 05 '14 at 13:30

1 Answer


Separate write() calls are processed separately, not as a single atomic write transaction, and interleaving is entirely possible when multiple processes/threads are writing to the same file. The order of the actual writes is determined by the schedulers (both kernel process scheduler, and for "green" threads the thread library's scheduler).
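
For illustration only (hypothetical file name, plain POSIX calls): if each writer issues many small write() calls, the bytes from the two writers can end up interleaved in the file, and which ordering you get depends entirely on how the writers are scheduled.

```c
/* Sketch: two processes append the same message one byte per write().
 * Each individual write() lands intact, but the sequence of writes from
 * the two processes can interleave arbitrarily. */
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

static void write_in_chunks(const char *path, const char *msg)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        return;
    for (size_t i = 0; i < strlen(msg); i++)
        write(fd, &msg[i], 1);   /* one write() per byte: maximal chance of interleaving */
    close(fd);
}

int main(void)
{
    if (fork() == 0)
        write_in_chunks("shared.txt", "bla bla\n");
    else
        write_in_chunks("shared.txt", "herp derp\n");
    return 0;
}
```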

Unless you specify otherwise (O_DIRECT open flag or similar, if supported), read() and write() operate on kernel buffers and read() will use a loaded buffer in preference to reading the disk again.
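
A rough sketch of what that means in practice (hypothetical file name, error handling omitted): data written through one descriptor is visible to a read() through another descriptor on the same file right away, because both operate on the kernel's cache rather than the disk.

```c
/* Sketch: write() through one descriptor, read() through another.
 * The read() sees the new data immediately, with no fsync() and no
 * disk round-trip, because both operate on the kernel's buffers. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int wfd = open("cache_demo.txt", O_RDWR | O_CREAT | O_TRUNC, 0644);
    int rfd = open("cache_demo.txt", O_RDONLY);
    char buf[32] = {0};

    write(wfd, "herp derp", 9);      /* lands in the kernel's cache */
    read(rfd, buf, sizeof buf - 1);  /* reads "herp derp" back from that cache */
    printf("read back: %s\n", buf);

    close(wfd);
    close(rfd);
    return 0;
}
```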

Note that this may be complicated by local file buffering; for example, stdio and iostreams will read file data by blocks into a buffer in the process which is independent of kernel buffers, so a write() from elsewhere to data that are already buffered in stdio won't be seen. Likewise, with output buffering there won't be any actual kernel-level output until the output buffer is flushed, either automatically because it has filled up or manually due to fflush() or C++'s endl (which implicitly flushes the output buffer).
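
As a rough illustration of the stdio-buffering point (hypothetical file name, error handling omitted):

```c
/* Sketch: output written through stdio sits in a user-space buffer
 * inside the process; a kernel-level read() sees nothing until that
 * buffer is flushed with fflush() (or fills up, or the stream closes). */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    FILE *fp = fopen("stdio_demo.txt", "w");
    int fd = open("stdio_demo.txt", O_RDONLY);
    char buf[32] = {0};

    fprintf(fp, "herp derp");   /* buffered in the process; no write() issued yet */
    printf("before fflush: %ld bytes visible\n",
           (long)read(fd, buf, sizeof buf - 1));

    fflush(fp);                 /* forces the underlying write() system call */
    lseek(fd, 0, SEEK_SET);
    printf("after fflush:  %ld bytes visible\n",
           (long)read(fd, buf, sizeof buf - 1));

    fclose(fp);
    close(fd);
    return 0;
}
```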

geekosaur
  • Thx, you sound like you know what you're talking about :) I was under the impression that stdio's read/write are direct wrappers of sys_write/sys_read? Is this not the case? Also, what is meant by this in the linux docs: "POSIX requires that a read(2) which can be proved to occur after a write() has returned returns the new data." – Jegschemesch Mar 05 '11 at 01:30
  • Do separate file descriptions use separate read/write buffers in the kernel? If I read in one process, will it possibly reflect data written into the write buffer from another process? Or is there always one kernel buffer per file no matter how many descriptions open on it in however many processes/threads? (btw, I use "description" to refer to the underlying representation of an open file, not the "descriptor" number used as a handle in a process, e.g. dup() returns a new descriptor for an existing description) – Jegschemesch Mar 05 '11 at 01:38
  • @Jegschemesch That means if you `write` data, then `read` it immediately after, it will _always_ read the new data. – alternative Mar 05 '11 at 01:38
  • @mathepic Right, but under what circumstances can a read be "proved to occur after a write()"? – Jegschemesch Mar 05 '11 at 01:40
  • @Jegschemesch when it is guaranteed. By proved it means the code proves that property. – alternative Mar 05 '11 at 01:43
  • @mathepic So it sounds just like @geekosaur said about read() reading the buffer by default. I'm still curious then how many buffers get involved for a single file: always one for all processes, one per process, or one per description? Maybe I'll open a separate question. In any case, it seems like the practical lesson is to always lock files or otherwise synchronize when concurrent writes are a possibility (a locking sketch follows after these comments). – Jegschemesch Mar 05 '11 at 01:57
  • The answer to that depends on the details of the kernel. In most modern Unix-like systems, file blocks are buffered on a global basis; that is, no matter how many processes have how many file descriptors open on a given file, buffering is done by block addresses and shared among all of them. The exception is `O_DIRECT`, as I mentioned earlier, or access to a raw device instead of a block device (`ls -l` shows `c` or `b`, respectively, instead of `-` for a file or `d` for a directory). – geekosaur Mar 05 '11 at 03:53
  • The commit message in this commit – http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=9c225f2655e36a470c4f58dbbc99244c5fc7f2d4 – suggests `write()` calls are atomic, doesn't it? – Piotr Dobrogost Dec 05 '14 at 13:28
  • @PiotrDobrogost It does. This bugfix is also documented in [write(2)](http://man7.org/linux/man-pages/man2/write.2.html) under the `BUGS` section (I've verified that the version they mention, 3.14, matches, so it's definitely about that commit). – tne Jul 07 '15 at 19:05
  • @PiotrDobrogost Yes, nice find – étale-cohomology Aug 03 '21 at 13:28
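
Regarding the locking point raised in the comments, here is a minimal sketch (hypothetical file name and message) using POSIX flock(); fcntl() record locks or a higher-level mechanism would serve the same purpose:

```c
/* Sketch: take an exclusive lock around the write() so that writers
 * which cooperate through the same lock cannot interleave their output. */
#include <fcntl.h>
#include <string.h>
#include <sys/file.h>
#include <unistd.h>

static int locked_append(const char *path, const char *msg)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        return -1;

    flock(fd, LOCK_EX);              /* block until we hold the exclusive lock */
    write(fd, msg, strlen(msg));     /* no other cooperating writer can interleave here */
    flock(fd, LOCK_UN);

    close(fd);
    return 0;
}

int main(void)
{
    return locked_append("shared.txt", "herp derp\n");
}
```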