
So, the normal POSIX way to safely, atomically replace the contents of a file is:

  • fopen(3) a temporary file on the same volume
  • fwrite(3) the new contents to the temporary file
  • fflush(3)/fsync(2) to ensure the contents are written to disk
  • fclose(3) the temporary file
  • rename(2) the temporary file to replace the target file
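
For concreteness, here is a minimal C sketch of that sequence; the function name replace_file() and the abbreviated error handling are my own, not from any particular library. Note that mkstemp(3) creates the temporary file with mode 0600 and the caller's uid/gid, which is exactly the problem described next.

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* Sketch only: write `len` bytes of `data` to `target` via a temporary
       file in the same directory, then rename() it into place. */
    int replace_file(const char *target, const char *data, size_t len)
    {
        char tmpl[4096];
        snprintf(tmpl, sizeof tmpl, "%s.XXXXXX", target);

        int fd = mkstemp(tmpl);                 /* temp file on the same volume */
        if (fd == -1)
            return -1;

        FILE *fp = fdopen(fd, "w");
        if (fp == NULL) {
            close(fd);
            unlink(tmpl);
            return -1;
        }

        if (fwrite(data, 1, len, fp) != len ||  /* write the new contents  */
            fflush(fp) == EOF ||                /* stdio buffers -> kernel */
            fsync(fd) == -1) {                  /* kernel buffers -> disk  */
            fclose(fp);                         /* also closes fd          */
            unlink(tmpl);
            return -1;
        }
        if (fclose(fp) == EOF) {
            unlink(tmpl);
            return -1;
        }

        if (rename(tmpl, target) == -1) {       /* atomic replacement      */
            unlink(tmpl);
            return -1;
        }
        return 0;
    }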

However, on my Linux system (Ubuntu 16.04 LTS), one consequence of this process is that the ownership and permissions of the target file change to those of the temporary file, which default to the effective uid/gid of the writing process and the current umask.

I thought I would add code to stat(2) the target file before overwriting it, and fchown(2)/fchmod(2) the temporary file before calling rename(2), but that can fail with EPERM.
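
A sketch of that attempt, assuming the temporary file is already open as tmpfd; the function name copy_owner_and_mode() is invented for illustration.

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Copy owner, group and mode from `target` onto the open temporary file.
       fchown() fails with EPERM for an unprivileged caller unless the
       requested uid already matches and the caller belongs to the requested
       group; a full ownership change requires CAP_CHOWN. */
    int copy_owner_and_mode(const char *target, int tmpfd)
    {
        struct stat st;
        if (stat(target, &st) == -1)
            return -1;
        if (fchown(tmpfd, st.st_uid, st.st_gid) == -1)   /* EPERM for other users' files */
            return -1;
        return fchmod(tmpfd, st.st_mode & 07777);
    }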

Is the only solution to ensure that the uid/gid of the file matches the current user and group of the process overwriting the file? Is there a safe way to fall back in this case, or do we necessarily lose the atomic guarantee?

Daniel Pryden
  • this answer https://unix.stackexchange.com/a/368641/92787 talks about this problem. It doesn't look like there is a solution for this. You have to choose either atomicity or keeping the owner & permissions. – bolov Feb 28 '18 at 17:02
  • Does ZFS on Linux support the `rstchown` option? That would be a solution if it's supported. – Andrew Henle Mar 02 '18 at 15:07
  • Nobody is going to thank you for putting effort into this. Using file systems as ad hoc synchronization mechanisms was never a good idea, and in modern systems is a very bad idea. As much as possible, you want your code to work in every possible environment; relying upon single system image semantics works against this. – mevets Mar 06 '18 at 18:10
  • @mevets: You're right, and in fact I'm working on migrating away from a filesystem for this mechanism anyway. However, until the replacement system is in place, I want to do my utmost to avoid file corruption from the existing processes that are already unsafely manipulating shared state in this file, which is what prompted this question in the first place. – Daniel Pryden Mar 08 '18 at 16:46

3 Answers


Is the only solution to ensure that the uid/gid of the file matches the current user and group of the process overwriting the file?

No.

In Linux, a process with the CAP_LEASE capability can obtain an exclusive lease on the file, which blocks other processes from opening the file for up to /proc/sys/fs/lease-break-time seconds. This means that technically, you can take the exclusive lease, replace the file contents, and release the lease, to modify the file atomically (from the perspective of other processes).
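
A hedged, Linux-specific sketch of that lease dance; the function name rewrite_under_lease() is invented, and the actual in-place rewrite is elided:

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <unistd.h>

    /* Take an exclusive (write) lease so other processes' open() calls block
       (for up to /proc/sys/fs/lease-break-time seconds) while we rewrite the
       file in place.  CAP_LEASE is needed to take a lease on a file whose
       owner differs from the caller's filesystem uid. */
    int rewrite_under_lease(const char *path)
    {
        int fd = open(path, O_RDWR);
        if (fd == -1)
            return -1;

        if (fcntl(fd, F_SETLEASE, F_WRLCK) == -1) {  /* fails if anyone else has it open */
            close(fd);
            return -1;
        }

        /* ... ftruncate(fd, 0), write the new contents, fsync(fd) ... */

        fcntl(fd, F_SETLEASE, F_UNLCK);              /* release the lease */
        return close(fd);
    }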

Also, a process with the CAP_CHOWN capability can change the file ownership (user and group) arbitrarily.

Is there a safe way to [handle the case where the uid or gid does not match the current process], or do we necessarily lose the atomic guarantee?

Considering that, in general, files may have ACLs and xattrs, it might be useful to create a helper program that clones the ownership, ACLs, and extended attributes from an existing file to a new file in the same directory (perhaps with a fixed name pattern, say .new-################, where each # is a random alphanumeric character), provided the real user (getuid(), getgid(), getgroups()) is allowed to modify the original file. This helper program would need at least the CAP_CHOWN capability, and would have to consider the various security aspects, especially the ways it could be exploited. (However, if the caller can already overwrite the contents and create new files in the target directory -- the caller must have write access to the target directory in order to do the rename/hardlink replacement -- then creating an empty clone file on their behalf ought to be safe. I would personally exclude target files owned by the root user or group, though.)

Essentially, the helper program would behave much like the mktemp command, except it would take the path to the existing target file as a parameter. It would then be relatively straightforward to wrap it into a library function, using e.g. fork()/exec() and pipes or sockets.
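
A hypothetical sketch of the helper's core, with create_clone() as an invented name; ACL and xattr cloning (e.g. via acl_get_fd()/acl_set_fd() and flistxattr()/fgetxattr()/fsetxattr()) and all privilege handling are omitted:

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>

    /* Create an empty sibling file and copy owner, group and mode from the
       original.  Copying the owner requires CAP_CHOWN (or root). */
    int create_clone(const char *original, const char *clone_path)
    {
        struct stat st;
        if (stat(original, &st) == -1)
            return -1;

        int fd = open(clone_path, O_WRONLY | O_CREAT | O_EXCL, 0600);
        if (fd == -1)
            return -1;

        if (fchown(fd, st.st_uid, st.st_gid) == -1 ||    /* needs CAP_CHOWN */
            fchmod(fd, st.st_mode & 07777) == -1) {
            close(fd);
            unlink(clone_path);
            return -1;
        }
        return fd;   /* caller writes the new contents, fsync()s and rename()s */
    }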

I personally avoid this problem by using group-based access controls: a dedicated (local) group for each such set of files. The file owner field is then basically just an informational field, indicating the user who last recreated (or was in charge of) the file, with access control based entirely on the group. This means that it suffices to change the mode and the group id to match the original file. (Copying ACLs would be even better, though.) If the user is a member of the target group, they can use fchown() to change the group of any file they own, and fchmod() to set the mode.
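
A sketch of that group-based variant: passing (uid_t)-1 to fchown() leaves the owner untouched, so no capability is needed as long as the caller owns the temporary file and belongs to the target file's group (copy_group_and_mode() is an invented name):

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Copy only the group and mode from `target` onto the open temporary
       file; the owner stays whoever created the temporary file. */
    int copy_group_and_mode(const char *target, int tmpfd)
    {
        struct stat st;
        if (stat(target, &st) == -1)
            return -1;
        if (fchown(tmpfd, (uid_t)-1, st.st_gid) == -1)   /* -1: keep current owner */
            return -1;
        return fchmod(tmpfd, st.st_mode & 07777);
    }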

Nominal Animal
  • re: CAP_LEASE: What happens if another process has the file open or mmap()'ed when the process obtains the exclusive lease? – user2962393 Mar 03 '18 at 14:50
  • @user2962393: If another process has it open or mapped, you don't get the exclusive lease. – Nominal Animal Mar 04 '18 at 14:34
  • All the answers on this question are good, but I think this one deserves the bounty because of the last paragraph: if we ensure that we have a dedicated local group with correct permissions (which in my case we do), then the file owner field becomes less important, and preserving it is not critical. This is ultimately the solution I ended up going with, at least for now. (Longer term I am migrating this particular bit of data into a dedicated database so that we can outsource the whole problem of locking and atomic updates.) – Daniel Pryden Mar 08 '18 at 16:45

I am by no means an expert in this area, but I don't think it's possible. This answer seems to back this up. There has to be a compromise.

Here are some possible solutions. Each has advantages and disadvantages and must be weighed and chosen depending on the use case and scenario.

  • Use atomic rename.

    Advantage: atomic operation

    Disadvantage: may not preserve the owner/permissions

  • Create a backup. Write file in place

    This is what some text editors do. (A sketch of this approach follows the list.)

    Advantage: will keep owner/permissions

    Disadvantage: no atomicity. Can corrupt the file. Other applications might get a "draft" version of the file.

  • Set up the permissions on the directory such that a new file can be created with the original owner & attributes.

    Advantages: atomicity & owner/permissions are kept

    Disadvantages: Can be used only in certain specific scenarios (it must be known, when the files are created, that they will later be edited this way, and the security model must permit it). Can decrease security.

  • Create a daemon/service responsible for editing the files. This process would have the necessary permissions to create files with the respective owner & permissions. It would accept requests to edit files.

    Advantages: atomicity & owner/permissions are kept. Finer-grained control over what can be edited and how.

    Disadvantages: possible only in specific scenarios. More complex to implement. Might require deployment and installation. Adds an attack surface and another source of possible (security) bugs. Possible performance impact due to the added intermediate layer.
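
As a rough illustration of the "write file in place" option above (not bolov's code; write_in_place() is an invented name, and short writes and signal handling are ignored):

    #include <sys/types.h>
    #include <fcntl.h>
    #include <unistd.h>

    /* Truncate and rewrite the existing file.  The inode is kept, so owner,
       group, mode, links, ACLs and xattrs survive, but the update is not
       atomic: readers may see a partially written file. */
    int write_in_place(const char *target, const char *data, size_t len)
    {
        int fd = open(target, O_WRONLY | O_TRUNC);   /* same inode, contents discarded */
        if (fd == -1)
            return -1;

        ssize_t n = write(fd, data, len);            /* short writes treated as errors */
        if (n != (ssize_t)len || fsync(fd) == -1) {
            close(fd);
            return -1;
        }
        return close(fd);
    }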

bolov
There are several questions to consider before choosing an approach:
  • Do you have to worry about the file that's named being a symlink to a file somewhere else in the file system?
  • Do you have to worry about the file that's named being one of multiple links to an inode (st_nlink > 1)?
  • Do you need to worry about extended attributes?
  • Do you need to worry about ACLs?
  • Do the user ID and group IDs of the current process permit the process to write in the directory where the file is stored?
  • Is there enough disk space available for both the old and the new files on the same file system?

Each of these issues complicates the operation.

Symlinks are relatively easy to deal with; you simply need to establish the realpath() to the actual file and do file creation operations in the directory containing the real path to the file. From here on, they're a non-issue.
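
For example, using the allocating form of realpath(3) (the wrapper name resolve_target() is invented):

    #include <stdlib.h>

    /* Resolve symlinks so temporary files are created next to the real file:
       same directory, and therefore the same filesystem as the final rename(). */
    char *resolve_target(const char *target)
    {
        char *real = realpath(target, NULL);   /* NULL: let the library allocate */
        /* NULL return means the target is missing or unresolvable; caller
           handles the error and free()s the result when done. */
        return real;
    }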

In the simplest case, where the user (process) running the operation owns the file and the directory where the file is stored, can set the group on the file, and the file has no hard links, ACLs or extended attributes, and there's enough space available, you can get an atomic operation with more or less the sequence outlined in the question; you'd set the group and permissions before executing the atomic rename() operation.

There is an outside risk of TOCTOU (time of check, time of use) problems with the file attributes. If a link is added between the time when it is determined that there are no links and the rename operation, then the link is broken. If the owner, group or permissions on the file change between the time when they're checked and the time they're set on the new file, then the changes are lost. You could reduce that risk, at the cost of atomicity, by renaming the old file to a temporary name, renaming the new file to the original name, and rechecking the attributes on the renamed old file before deleting it. That is probably an unnecessary complication for most people, most of the time.
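
A hedged sketch of that reduced-risk (no longer atomic) swap; the parameter names and the choice to re-apply only owner, group and mode are assumptions of mine:

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* `seen` holds the attributes observed when the target was first checked.
       Step the original aside, install the new file, then recheck the stepped-
       aside copy and re-apply anything that changed in the meantime. */
    int swap_with_recheck(const char *target, const char *new_path,
                          const char *backup_path, const struct stat *seen)
    {
        struct stat now;

        if (rename(target, backup_path) == -1)       /* original -> backup name */
            return -1;
        if (rename(new_path, target) == -1)          /* new file -> original name */
            return -1;

        if (stat(backup_path, &now) == 0 &&
            (now.st_uid != seen->st_uid || now.st_gid != seen->st_gid ||
             (now.st_mode & 07777) != (seen->st_mode & 07777))) {
            /* Attributes changed after the initial check; copy them onto the
               new target (may need privilege for the ownership part). */
            chown(target, now.st_uid, now.st_gid);
            chmod(target, now.st_mode & 07777);
        }
        return unlink(backup_path);                  /* drop the old copy */
    }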

If the target file has multiple hard links to it and those links must be preserved, or if the file has ACLs or extended attributes and you don't wish to work out how to copy those to the new file, then you might consider something along the lines of:

  1. write the output to a named temporary file in the same directory as the target file;
  2. copy the old (target) file to another named temporary file in the same directory as the target;
  3. if anything goes wrong during steps 1 or 2, abandon the operation with no damage done;
  4. ignoring signals as much as possible, copy the new file over the old file;
  5. if anything goes wrong during step 4, you can recover from the extra backup made in step 2;
  6. if anything goes wrong in step 5, report the file names (new file, backup of original file, broken file) for the user to clean up;
  7. clean up the temporary output file and the backup file.

Clearly, this loses all pretense at atomicity, but it does preserve links, owner, group, permissions, ACLs, and extended attributes. It also requires more space: if the file doesn't change size significantly, it needs roughly three times the space of the original file (formally, size(old) + size(new) + max(size(old), size(new)) blocks). In its favour, it is recoverable even if something goes wrong during the final copy (even a stray SIGKILL), as long as the temporary files have known names (the names can be determined).
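
A sketch of step 4 above, copying the finished temporary file over the original in place so the original inode (and hence its links, owner, group, mode, ACLs and extended attributes) is preserved; copy_over() is an invented name and error cleanup is abbreviated:

    #include <sys/types.h>
    #include <fcntl.h>
    #include <unistd.h>

    /* Copy the contents of `newtmp` over `target` in place. */
    int copy_over(const char *newtmp, const char *target)
    {
        int in = open(newtmp, O_RDONLY);
        int out = open(target, O_WRONLY | O_TRUNC);  /* same inode, contents discarded */
        if (in == -1 || out == -1) {
            if (in != -1) close(in);
            if (out != -1) close(out);
            return -1;
        }

        char buf[65536];
        ssize_t n;
        while ((n = read(in, buf, sizeof buf)) > 0) {
            if (write(out, buf, (size_t)n) != n) {   /* partial writes treated as errors */
                n = -1;
                break;
            }
        }

        int rc = (n == 0 && fsync(out) == 0) ? 0 : -1;
        close(in);
        if (close(out) == -1)
            rc = -1;
        return rc;
    }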

Automatic recovery from SIGKILL probably isn't feasible. A SIGSTOP signal could be problematic too; a lot could happen while the process is stopped.

I hope it goes without saying that errors must be detected and handled carefully with all the system calls used.

If there isn't enough space on the target file system for all the copies of the files, or if the process cannot create files in the target directory (even though it can modify the original file), you have to consider what the alternatives are. Can you identify another file system with enough space? If there isn't enough space anywhere for both the old and the new file, you clearly have major issues — irresolvable ones for anything approaching atomicity.

The answer by Nominal Animal mentions Linux capabilities. Since the question is tagged POSIX and not Linux, it isn't clear whether those are applicable to you. However, if they can be used, then CAP_LEASE sounds useful.

  • How crucial is atomicity vs accuracy?
  • How crucial is POSIX compliance vs working on Linux (or any other specific POSIX implementation)?
Jonathan Leffler
  • This is an excellent answer, and I considered awarding it the bounty, but ultimately I gave Nominal Animal the nod. But thank you for your answer: the dance of making a backup file and copying data around was my second-favorite solution to this problem. And also thank you for pointing out that restricting myself to POSIX might be limiting my options here; in my specific case, Linux is of utmost importance (although compatibility with Mac OS is nice because developers use Macs for some of this software). – Daniel Pryden Mar 08 '18 at 16:49