15

I'm writing a Python script that copies a file using shutil.copyfile() on Linux. While the copy is in progress, other processes may try to read the file. Is the following sufficient to ensure that an external process doesn't get a corrupted view of the file?

os.unlink(dest)
shutil.copyfile(src, dest)

That is, is shutil.copyfile() atomic such that other processes cannot read the destination file until after the copy operation is complete?

dinosaur
    It's better to copy to a temp file in the same directory, then os.rename() it. On *ix this is atomic, even on NFS. If a process or processes already had the file open, they'll continue to see the old version, while subsequent open() calls on the file will see the new content. – dstromberg Jan 01 '14 at 22:17

2 Answers

9

No, shutil.copyfile is not atomic. This is part of the definition of shutil.copyfile:

def copyfile(src, dst, *, follow_symlinks=True):    
    ...
    with open(src, 'rb') as fsrc:
        with open(dst, 'wb') as fdst:
            copyfileobj(fsrc, fdst)

where copyfileobj is defined like this:

def copyfileobj(fsrc, fdst, length=16*1024):
    while 1:
        buf = fsrc.read(length)
        if not buf:
            break
        fdst.write(buf)

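The destination file is created (or truncated) as soon as open(dst, 'wb') returns, and its contents then arrive one chunk at a time. The thread calling copyfile could be stopped inside this while-loop, and any other process that opens the file for reading at that point would see only part of the data. It would get a corrupted view of the file. Calling os.unlink(dest) first does not close this window.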
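As a rough illustration of that window, the sketch below reimplements the chunked loop and checks the destination's size after every chunk, which is what a reader arriving mid-copy would observe. The names copy_in_chunks, after_chunk and demo_partial_read are hypothetical, and the explicit flush() is an assumption added only to make the observation deterministic; shutil itself does not flush after each chunk.

```python
import os
import tempfile


def copy_in_chunks(src, dst, length=16 * 1024, after_chunk=None):
    """Copy src to dst chunk by chunk, mirroring the loop shown above.

    after_chunk is a hypothetical hook called after each chunk is flushed;
    it stands in for another process looking at dst mid-copy.
    """
    with open(src, 'rb') as fsrc, open(dst, 'wb') as fdst:
        while True:
            buf = fsrc.read(length)
            if not buf:
                break
            fdst.write(buf)
            # Assumption for the demo: push the bytes to the OS right away
            # so the observer sees them deterministically.
            fdst.flush()
            if after_chunk is not None:
                after_chunk()


def demo_partial_read():
    """Return the sizes of dst seen during the copy, and its final size."""
    sizes = []
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, 'src.bin')
        dst = os.path.join(tmp, 'dst.bin')
        with open(src, 'wb') as f:
            f.write(b'x' * (64 * 1024))
        copy_in_chunks(
            src, dst,
            after_chunk=lambda: sizes.append(os.path.getsize(dst)),
        )
        final = os.path.getsize(dst)
    return sizes, final
```

Every size recorded before the last one is smaller than the final size, so a reader that opened dst at those moments would have seen a truncated file.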

unutbu
7

No, it seems to just loop, reading and writing 16 KB at a time.

For an atomic copy operation, you should copy the file to a different location on the same filesystem, and then os.rename() it to the desired location (which is guaranteed to be atomic on Linux).
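A minimal sketch of that approach, assuming Linux and that the temporary file is created in the destination's own directory so both paths are on the same filesystem. The helper name atomic_copy and the '.tmp-' prefix are hypothetical choices made for this example.

```python
import os
import shutil
import tempfile


def atomic_copy(src, dest):
    """Copy src to dest so readers see either the old file or the new one."""
    dest_dir = os.path.dirname(os.path.abspath(dest))
    # Create the temp file next to dest so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix='.tmp-')
    os.close(fd)
    try:
        # The slow, non-atomic part happens on a name nobody else reads.
        shutil.copyfile(src, tmp_path)
        # On Linux, rename replaces an existing dest in a single step.
        os.rename(tmp_path, dest)
    except BaseException:
        # Clean up the temp file if the copy or the rename failed.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Usage would be atomic_copy(src, dest) in place of the os.unlink() and shutil.copyfile() pair. Two caveats: mkstemp() creates the file with mode 0600 and copyfile() does not copy permissions, so the mode may need to be set before the rename; and os.replace() behaves the same way on Linux while also overwriting an existing destination on Windows.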