
We have a situation where files are uploaded to a folder by FTP and then served by nginx. We've found that if a GET request immediately follows the modification of a file, nginx returns a file with 0 bytes.

In trying to debug this problem I wrote two Python scripts to see if I could reproduce the error in a simple way.

The first one writes to a file:

  while True:
      with open('testfile', 'w') as f:
          f.write("test")

And the second one reads:

  while True:
      with open('testfile', 'r') as cf:
          print(cf.read())

When running these scripts in two separate processes, the output of the reader is either "test" or "", indicating that sometimes the file appears empty to the reader. This does not seem to be related to the Python implementation, as I can reproduce the effect with bash like this:

(writer.sh)

  while true; do
      echo test > testfile
  done

(reader.sh)

  while true; do
      cat testfile
      printf "\n"
  done

The file system is ext4 and the OS is Ubuntu 16.04.

So:

Why does the reader sometimes see an empty file (around 50% of the time)?

Why do we never see a partial write ("te", "tes" etc)?

Thanks in advance for your help.

Russell

4 Answers

3

Kudos, you've just discovered file buffering. When writing to disk, you can either use buffered writes or direct I/O writes. For performance reasons, most software (including the Python interpreter) defaults to buffered writes. If you need to perform direct I/O, there is a nice Python module aptly named directio that does just that.

However, most of the time you don't need direct I/O unless you're writing to a log file or a database.
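For illustration only, here is a rough sketch of a direct write without that module, assuming Linux and using `os.O_DIRECT` with a page-aligned buffer (O_DIRECT requires the buffer address, file offset and length to be block-aligned, so the data is padded to a full block):

  import mmap
  import os

  BLOCK = 4096                                # assumed alignment; may need the filesystem block size

  # Anonymous mmap regions are page-aligned, which O_DIRECT requires.
  buf = mmap.mmap(-1, BLOCK)
  buf.write(b"test".ljust(BLOCK, b"\0"))      # pad to a full block (the file will contain the padding)

  fd = os.open('testfile', os.O_WRONLY | os.O_CREAT | os.O_DIRECT, 0o644)
  try:
      os.write(fd, buf)                       # bypasses the page cache entirely
  finally:
      os.close(fd)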

wazoox
2

Others have described how this is buffered I/O, where you see the truncated file before its contents have been flushed.

Some more details on a couple ways to address this:

Upload files to a temporary directory on the same file system as the target, then mv into place. The rename is an atomic operation, so readers will only see the old file or the new file, not something in between. However, the kernel still gets around to finishing writes to disk on its schedule, unless the application calls fsync(). Closing the file or waiting some arbitrary time does not reliably cause the file to be on disk.
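As a minimal sketch of this first approach in Python (the function name and the `.tmp` suffix are just illustrative), assuming the temporary file lives on the same file system as the target:

  import os

  def publish(path, data):
      tmp = path + '.tmp'             # must be on the same file system as `path`
      with open(tmp, 'wb') as f:
          f.write(data)
          f.flush()
          os.fsync(f.fileno())        # force the data to disk before the rename
      os.rename(tmp, path)            # atomic: readers see either the old or the new file

  publish('testfile', b"test")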

Or, change the application to be backed by a database. Let the database provide a consistent view of the document in memory and on storage; that's what they do. Possibly not worth the implementation effort if the only reason is to get rid of a very small window of inconsistency.

John Mahowald
1

You are likely experiencing a race condition where:

  • the write truncates the file due to the redirection (">").
  • the file is read by the reader (empty file).
  • the file is written by the writer.

If you put a short sleep in the writer loop, you should see this much less frequently.
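For example, a sketch of the Python writer with a short sleep added (the 0.1 s value is arbitrary):

  import time

  while True:
      with open('testfile', 'w') as f:
          f.write("test")
      time.sleep(0.1)   # the file spends far less time in the truncated-but-unwritten state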

You can avoid this by using an atomic action to create the file such as:

  while true; do
      echo test > file.tmp
      mv file.tmp testfile
  done

Your original code will continually truncate and write the same file. The loop above will continually create new files. The mv command is atomic, and the reader will always see a file with data. This will be either the file deleted by the mv or the new file.

BillThor
  • With mv the reader won't see an empty file, but may see no file at all instead of a file with data -- rename is atomic, but delete then rename isn't. – dave_thompson_085 Apr 22 '18 at 09:58
  • @dave_thompson_085 `> testfile` will truncate the existing file. This is done before the `echo test` is executed. The write of `test` should be atomic. – BillThor Apr 26 '18 at 03:18
  • Your 'mv' method doesn't do `>testfile`. It does `>file.tmp` which creates _and_ writes a new file, then does `mv file.tmp testfile` where both files are existing, which has the effect of deleting (actually unlinking) the older `testfile` then renaming (actually re-linking) the recently-created `file.tmp` to become a new instance of `testfile`. Between the delete of the older `testfile` and the linking of the newer `testfile` there is a brief period where no file named `testfile` exists. – dave_thompson_085 Apr 26 '18 at 16:28
  • @dave_thompson_085 `mv` should be atomic at the directory level. The directory should always contain one of the files. (A directory entry is just a pointer to an inode.) The old file should get deleted after the `mv` when its reference count goes to 0. – BillThor Apr 29 '18 at 20:20
0

Files tend to be written in blocks (subsets of the full data). The size of these blocks is determined by a combination of the function being used and the available system resources; from this the OS will usually try to pick an optimal block size. What happens is that a block gets written into RAM before being written to disk, and then the full block is written to disk at once, with no time for a read during that block write. This leads to much faster writes than would otherwise be possible.

In your case, the word "test" is going to be smaller than any block size the OS picks, so it will all be written at once. For your test, you should write much more data and probably set the block size (though it's best to let the OS decide most of the time).

I suspect what is happening in your test is that half the time you are catching the empty file before it is written to, and the other half you are catching it after the block is written. If you try writing an amount of data larger than your block size, I think you will see partially written files.
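For example, a sketch of a writer that pushes out a payload much larger than a single block (the 8 MiB size is arbitrary), which gives the reader a chance to catch a partial write:

  payload = b"x" * (8 * 1024 * 1024)   # 8 MiB, well above any block size

  while True:
      with open('testfile', 'wb') as f:
          f.write(payload)             # large enough that a concurrent read can land mid-write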

Jeff