Does writing a file to disk with Python open().write() ensure the data is available to other processes?

Question

One Python process writes status updates to a file for other processes to read. In some circumstances, the status updates happen repeatedly and quickly in a loop. The easiest and fastest approach is to use open().write() in one line:

open(statusfile,'w').write(status)

An alternative approach uses four lines and forces the data to disk, at a significant performance cost:

f = open(self.statusfile,'w')
f.write(status)
os.fsync(f)
f.close()

I'm not trying to protect against an OS crash. So, does the first approach push the data to the OS buffer so that other processes read the newest status data when they open the file? Or do I need to use os.fsync()?

The reason why your second variant is slow is that you use `os.fsync()` to force the data to disc instead of just flushing the buffers. You could even leave the file open all the time, calling `f.flush()` whenever necessary. This should even speed things up significantly. — Sven Marnach, Oct 31 '11 at 16:19
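The suggestion in the comment above could look roughly like the sketch below. `StatusWriter` and the temporary path are hypothetical names, and the `seek(0)`/`truncate()` pair is an assumption: it overwrites the previous status instead of appending to it. The update is not atomic, so a reader might occasionally catch the file mid-update.

```python
import os
import tempfile


class StatusWriter:
    """Hypothetical helper that keeps one handle open and rewrites the status in place."""

    def __init__(self, path):
        self._f = open(path, "w")

    def update(self, status):
        self._f.seek(0)        # go back to the start of the file
        self._f.write(status)  # data sits in Python's userspace buffer
        self._f.truncate()     # flushes, then cuts off leftovers from a longer old status
        self._f.flush()        # make sure everything has been handed to the OS

    def close(self):
        self._f.close()


path = os.path.join(tempfile.mkdtemp(), "status.txt")  # hypothetical location
writer = StatusWriter(path)
writer.update("RUNNING step 10\n")
writer.update("OK\n")

# A reader that opens the file now sees the latest status, although the writer is still open.
with open(path) as reader:
    latest = reader.read()

writer.close()
```

No `os.fsync()` is involved here, so the cost per update is a few system calls rather than a wait for the disk.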

Fred Foo · Answer 1 · 2011-10-31T23:22:22.793

No, the first approach does not guarantee that the data is written out, because nothing guarantees that the file is flushed and closed as soon as the last reference to it goes away. CPython will most likely close it right away, but other Python interpreters may not; it's an implementation detail of the Python garbage collector.

You should really use the second approach, except that os.fsync is not needed; just close the file and the data should be available to other processes.

Or, even better (Python >=2.5):

with open(self.statusfile, 'w') as f:
    f.write(status)

The with version is exception-safe: the file is closed even if write fails.

Note that closing the file does not imply an `fsync()` -- it only implies an `fflush()`, but the latter is all you need to be able to see the changes to the file in other processes. — Sven Marnach, Oct 31 '11 at 15:30
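As a rough check of the point made in this comment, the sketch below flushes without calling `os.fsync()` or `close()` and then reads the file from a separate Python process. The temporary path and the one-line child script are assumptions made for the demonstration only.

```python
import os
import subprocess
import sys
import tempfile

path = os.path.join(tempfile.mkdtemp(), "status.txt")  # hypothetical status file

f = open(path, "w")
f.write("OK\n")  # typically still in Python's userspace buffer at this point
f.flush()        # handed to the OS; no os.fsync() and no close() yet

# Read the file from a separate Python process while the writer still has it open.
child_code = "import sys; sys.stdout.write(open(sys.argv[1]).read())"
seen_by_child = subprocess.check_output(
    [sys.executable, "-c", child_code, path],
    universal_newlines=True,
)

f.close()
```

The child reads through the OS cache, which is why the data does not have to reach the physical disk first.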

score 0 · Answer 2 · answered Oct 31 '11 at 18:34

Since Python 2.2 it's been possible to subclass the language's built-in types. This means you could derive your own file type whose write() method returned self instead of nothing like the built-in version does. Doing so would make it possible to also chain a close() method call onto the end of your one-liner.

# Python 2 only: the built-in file type was removed in Python 3.
class ChainableFile(file):
    def __init__(self, *args, **kwargs):
        file.__init__(self, *args, **kwargs)

    def write(self, data):
        file.write(self, data)
        return self  # returning self allows .close() to be chained

def OpenFile(filename, *args, **kwargs):
    return ChainableFile(filename, *args, **kwargs)

statusfile = 'statusfile.txt'
status = 'OK\n'

OpenFile(statusfile,'w').write(status).close()

martineau

Nice trick, but not exception-safe. If `write` fails, the file may stay opened. – Fred Foo Oct 31 '11 at 23:21
@larsmans: Could easily be made exception-safe by adding exception handling to the `ChainableFile::write()` method. If one occurs there the file can be closed before returning `self` since doing so again with the trailing `.close()` would be harmless. – martineau Nov 01 '11 at 19:22
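One possible reading of this exchange, sketched for Python 3, where the built-in `file` type no longer exists and a wrapper object stands in for the subclass. This version closes the handle and re-raises when the write fails, rather than returning `self` as the comment proposes; that choice, the `closed` property and the temporary path are all assumptions.

```python
import os
import tempfile


class ChainableFile:
    """Hypothetical Python 3 wrapper around a regular file object."""

    def __init__(self, *args, **kwargs):
        self._f = open(*args, **kwargs)

    def write(self, data):
        try:
            self._f.write(data)
        except Exception:
            self._f.close()  # do not leave the handle open if the write fails
            raise
        return self          # returning self is what allows .close() to be chained

    def close(self):
        self._f.close()      # closing an already closed file is a no-op

    @property
    def closed(self):
        return self._f.closed


statusfile = os.path.join(tempfile.mkdtemp(), "statusfile.txt")  # hypothetical path
ChainableFile(statusfile, "w").write("OK\n").close()

with open(statusfile) as reader:
    written = reader.read()

# Force a failure: writing a non-string to a text file raises TypeError.
failed = ChainableFile(statusfile, "w")
try:
    failed.write(123)
except TypeError:
    pass
```

After the failed write the handle is already closed, so the caller only has to deal with the exception itself.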
