My program does the following, in chronological order:

  1. The program is started with root permissions.
  2. Among other tasks, a file only readable with root permissions is open()ed.
  3. Root privileges are dropped.
  4. Child processes are spawned with clone() and the CLONE_FILES | CLONE_FS | CLONE_IO flags set, which means that while they use separate regions of virtual memory, they share the same file descriptor table (as well as filesystem information and the I/O context).
  5. All child processes execve() their own programs (the FD_CLOEXEC flag is not used).
  6. The original program terminates.

Now I want every spawned program to read the contents of the aforementioned file, but after they all have read the file, I want it to be closed (for security reasons).

One possible solution I'm considering now is having a step 3a where the fd of the file is dup()licated once for every child process, and each child gets its own fd (passed via argv). Then every child program would simply close() its fd after reading, so that once all fds pointing to the file are close()d the "actual file" is closed.

But does it work that way? And is it safe to do this (i.e. is the file really closed)? If not, is there another/better method?

Jonathan Leffler
Will
  • If both the parent and the children all close their fd, then the file is closed. Is there any reason to use `clone`? A simple `fork` would exhibit much the same behavior, I think. – Fred Foo Jun 02 '13 at 09:41
  • `If both the parent and the children all close their fd, then the file is closed.` That would be good news, thanks. `Is there any reason to use clone? A simple fork would exhibit much the same behavior, I think.` True, but the reason I use `clone()` is related to performance: all child programs need to handle the same really large set of sockets (through `epoll`). Sharing the fd table avoids unnecessary overhead. The reason I mention that in the question is that I'm not 100% sure this detail is irrelevant. – Will Jun 02 '13 at 09:48
  • Is it an option to read(/parse) the file in the parent process and store the result in memory shared with `mmap` as step 3a? That would skip the issue of communicating the fd to the child processes and adhere to the principle of dropping privileges as soon as possible. – Fred Foo Jun 02 '13 at 09:57
  • Yes, I considered that, but because the content of the file is sensitive (an SSL private key), storing it in shared memory seems like a bad idea to me considering that that memory would be accessible by other processes (at least for a very short time), since the shared memory would need to be created *after* dropping root (or else the `execve()`d children won't be able to access it). – Will Jun 02 '13 at 10:06
  • I don't quite follow. The fd is now also visible by other processes (the children). I understand your predicament, but it seems that in your current setup, no matter how it's implemented, you'll have to share a secret with the child processes and trust them to wipe it before executing a different program. – Fred Foo Jun 02 '13 at 10:15
  • Sorry, I mean, if I used shared memory, won't that memory be also accessible by processes other than the children (I can't use `MAP_PRIVATE` because of the `execve()` calls)? Or are you saying that's true for file descriptors too? – Will Jun 02 '13 at 10:20
  • Ehm, you're right, thinko. You'd have to unmap before the exec, just like you have to `close` now. – Fred Foo Jun 02 '13 at 10:24
  • @larsmans I've added a preliminary answer (and a +1 to your first comment because it got me thinking in the right direction (probably)). – Will Jun 03 '13 at 03:56

1 Answer

While using dup() as I suggested above is probably just fine, I've now realized, a day after asking this SO question, that there is a nicer way to do this, at least from the point of view of thread safety.

All dup()licated file descriptors refer to the same open file description and therefore share the same file position indicator, which of course means you run into trouble when multiple threads/processes might simultaneously try to change the file position during read operations (even if your own code does so in a thread-safe way, the same doesn't necessarily go for libraries you depend on).
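
To make the shared position concrete, here is a minimal sketch, assuming a POSIX system with a writable /tmp. The helper names `make_scratch_fd` and `offset_seen_through_dup` are made up for illustration. It reads a few bytes through one descriptor and then asks for the offset through its dup()licate:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: create a 16-byte scratch file and return a
 * descriptor positioned at offset 0, or -1 on error. */
static int make_scratch_fd(void)
{
    char path[] = "/tmp/dup_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path); /* the file lives on until its last descriptor is closed */
    if (write(fd, "0123456789abcdef", 16) != 16 ||
        lseek(fd, 0, SEEK_SET) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: read n bytes through one descriptor, then return
 * the file offset as seen through a dup() of it (-1 on error). */
static off_t offset_seen_through_dup(size_t n)
{
    char buf[16];
    assert(n <= sizeof buf);

    int fd = make_scratch_fd();
    if (fd == -1)
        return -1;
    int copy = dup(fd);
    if (copy == -1) {
        close(fd);
        return -1;
    }

    ssize_t got = read(fd, buf, n);
    /* The copy was never read from, yet its offset has moved too. */
    off_t seen = (got == (ssize_t)n) ? lseek(copy, 0, SEEK_CUR) : -1;

    close(copy);
    close(fd);
    return seen;
}
```

Calling `offset_seen_through_dup(5)` should report an offset of 5 even though nothing was read through the copy, which is exactly the interference described above.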

So wait, why not just call open() multiple times (once for every child) on the needed file before dropping root? From the manual of open():

A call to open() creates a new open file description, an entry in the system-wide table of open files. This entry records the file offset and the file status flags (modifiable via the fcntl(2) F_SETFL operation). A file descriptor is a reference to one of these entries; this reference is unaffected if pathname is subsequently removed or modified to refer to a different file. The new open file description is initially not shared with any other process, but sharing may arise via fork(2).

Could be used like this:

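// needs <fcntl.h>, <stdio.h> and <stdlib.h>
int fds[CHILD_C];
for (int i = 0; i < CHILD_C; i++) {
    // every open() call creates its own open file description (own offset)
    fds[i] = open("/foo/bar", O_RDONLY);
    if (fds[i] == -1) {
        perror("open");
        exit(EXIT_FAILURE);
    }
}
drop_privileges(); // only after all descriptors have been opened
// etc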
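
As a rough sketch of the same loop with cleanup on failure, plus a way to check that the resulting descriptors really have independent offsets (assuming a POSIX system; `open_per_child` and `offset_of_sibling` are hypothetical names, not part of any library):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open `path` read-only `count` times, one descriptor
 * per child. On failure, closes whatever was opened and returns -1. */
static int open_per_child(const char *path, int *fds, size_t count)
{
    assert(path != NULL && fds != NULL);
    for (size_t i = 0; i < count; i++) {
        fds[i] = open(path, O_RDONLY);
        if (fds[i] == -1) {
            while (i > 0)
                close(fds[--i]);
            return -1;
        }
    }
    return 0;
}

/* Hypothetical helper: read n bytes through fds[0] and return the offset
 * seen through fds[1] (-1 on error). */
static off_t offset_of_sibling(const int *fds, size_t n)
{
    char buf[64];
    assert(n <= sizeof buf);
    if (read(fds[0], buf, n) != (ssize_t)n)
        return -1;
    return lseek(fds[1], 0, SEEK_CUR);
}
```

With descriptors obtained from separate open() calls, `offset_of_sibling` should return 0 no matter how much was read through the first descriptor, in contrast to descriptors created with dup().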

Then every child gets a reference to one of those fds through argv and does something like:

  1. FILE *stream = fdopen(atoi(argv[FD_STRING_I]), "r")
  2. read whatever needed from the stream
  3. fclose(stream) (this also closes the underlying file descriptor)
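
The child-side steps above could be sketched roughly as follows. This is an illustration under assumptions, not tested production code: `parse_fd` and `read_from_fd_arg` are made-up names, and strtol() is swapped in for atoi() only because it allows rejecting malformed input.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: parse a decimal descriptor number.
 * Returns -1 unless the whole string is a non-negative int. */
static int parse_fd(const char *s)
{
    char *end;
    errno = 0;
    long v = strtol(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0' || v < 0 || v > INT_MAX)
        return -1;
    return (int)v;
}

/* Hypothetical helper: read up to `cap` bytes from the descriptor named by
 * `fd_str` and close it afterwards. Returns the byte count, or -1 on error. */
static long read_from_fd_arg(const char *fd_str, char *out, size_t cap)
{
    assert(fd_str != NULL && out != NULL);

    int fd = parse_fd(fd_str);
    if (fd == -1)
        return -1;

    FILE *stream = fdopen(fd, "r");
    if (stream == NULL) {
        close(fd); /* do not leave the descriptor open on failure */
        return -1;
    }

    size_t n = fread(out, 1, cap, stream);
    int failed = ferror(stream);

    /* fclose() also closes the underlying file descriptor. */
    if (fclose(stream) != 0 || failed)
        return -1;
    return (long)n;
}
```

A child would call something like `read_from_fd_arg(argv[FD_STRING_I], buf, sizeof buf)`. Since the buffer then holds the sensitive data, the child would presumably also want to wipe it once it is no longer needed.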

Disclaimer: According to a bunch of tests I've run, this is indeed safe and sound. However, I have only tested open()ing with O_RDONLY. Using O_RDWR or O_WRONLY may or may not be safe.

Will