- Do you have to worry about the file that's named being a symlink to a file somewhere else in the file system?
- Do you have to worry about the file that's named being one of multiple links to an inode (
st_nlink > 1
).
- Do you need to worry about extended attributes?
- Do you need to worry about ACLs?
- Does the user ID and group IDs of the current process permit the process to write in the directory where the file is stored?
- Is there enough disk space available for both the old and the new files on the same file system?
Each of these issues complicates the operation.
Symlinks are relatively easy to deal with; you simply need to establish the realpath()
to the actual file and do file creation operations in the directory containing the real path to the file. From here on, they're a non-issue.
In the simplest case, where the user (process) running the operation owns the file and the directory where the file is stored, can set the group on the file, the file has no hard links, ACLs or extended attributes, and there's enough space available, then you can get atomic operation with more or less the sequence outlined in the question — you'd do group and permission setting before executing the atomic rename()
operation.
There is an outside risk of TOCTOU — time of check, time of use — problems with file attributes. If a link is added between the time when it is determined that there are no links and the rename operation, then the link is broken. If the owner or group or permissions on the file change between the time when they're checked and set on the new file, then the changes are lost. You could reduce the risk of that by breaking atomicity but renaming the old file to a temporary name, renaming the new file to the original name, and rechecking the attributes on the renamed old file before deleting it. That is probably an unnecessary complication for most people, most of the time.
If the target file has multiple hard links to it and those links must be preserved, or if the file has ACLs or extended attributes and you don't wish to work out how to copy those to the new file, then you might consider something along the lines of:
- write the output to a named temporary file in the same directory as the target file;
- copy the old (target) file to another named temporary file in the same directory as the target;
- if anything goes wrong during steps 1 or 2, abandon the operation with no damage done;
- ignoring signals as much as possible, copy the new file over the old file;
- if anything goes wrong during step 4, you can recover from the extra backup made in step 2;
- if anything goes wrong in step 5, report the file names (new file, backup of original file, broken file) for the user to clean up;
- clean up the temporary output file and the backup file.
Clearly, this loses all pretense at atomicity, but it does preserve links, owner, group, permissions, ACLS, extended attributes. It also requires more space — if the file doesn't change size significantly, it requires 3 times the space of the original file (formally, it needs size(old) + size(new) + max(size(old), size(new))
blocks). In its favour is that it is recoverable even if something goes wrong during the final copy — even a stray SIGKILL — as long as the temporary files have known names (the names can be determined).
Automatic recovery from SIGKILL probably isn't feasible. A SIGSTOP signal could be problematic too; a lot could happen while the process is stopped.
I hope it goes without saying that errors must be detected and handled carefully with all the system calls used.
If there isn't enough space on the target file system for all the copies of the files, or if the process cannot create files in the target directory (even though it can modify the original file), you have to consider what the alternatives are. Can you identify another file system with enough space? If there isn't enough space anywhere for both the old and the new file, you clearly have major issues — irresolvable ones for anything approaching atomicity.
The answer by Nominal Animal mentions Linux capabilities. Since the question is tagged POSIX and not Linux, it isn't clear whether those are applicable to you. However, if they can be used, then CAP_LEASE
sounds useful.
- How crucial is atomicity vs accuracy?
- How crucial is POSIX compliance vs working on Linux (or any other specific POSIX implementation)?