Trying my best to answer all the sub-questions; I apologise that some of this is vaguer than it ideally should be:
If there is a
possibility that library code has
registered pthread_atfork handlers
which are not async-signal-safe, does
this negate the safety of fork?
Yes. The fork documentation explicitly mentions this:
When the application calls fork() from a signal handler and any of the
fork handlers registered by pthread_atfork() calls a function that is
not asynch-signal-safe, the behavior is undefined.
Of course, this means you can't actually use pthread_atfork() for its intended purpose of making multi-threaded libraries transparent to processes that believe they are single-threaded, because none of the pthread synchronisation functions are async-signal-safe. This is noted as a defect in the spec; see http://www.opengroup.org/austin/aardvark/latest/xshbug3.txt (search for "L16723").
Does
the answer depend on whether the
thread in which the signal handler is
running could be in the middle of
using a resource that the atfork
handlers need? Or said a different
way, if the atfork handlers make use
of synchronization resources (mutexes,
etc.) but fork is being called from a
signal handler which executed in a
thread that never accesses these
resources, is the program conforming?
Strictly speaking the answer is no, because according to the spec a function is either async-signal-safe or it isn't; there's no concept of "safe under certain circumstances". In practice you might well get away with it, but you would be vulnerable to a clunky-but-correct implementation that didn't partition its resources in the way you were expecting.
Building on this question, if
"thread-safe" forking is implemented
internally in the system library using
the idioms suggested by pthread_atfork
(obtain all locks in the prefork
handler and release all locks in both
the parent and child postfork
handlers), then is fork ever safe to
use from signal handlers in a threaded
program? Isn't it possible that the
thread handling the signal could be in
the middle of a call to malloc or
fopen/fclose and holding a global
lock, resulting in deadlock during
fork?
If it were implemented in that way, then you're right: fork() from a signal handler would never be safe, because attempting to obtain a lock might deadlock if the calling thread already held it. But since the spec treats fork() as async-signal-safe, this implies that an implementation using such a method would not be conforming.
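To make the idiom in question concrete, here is a minimal sketch of the "lock in prepare, unlock in parent and child" pattern. The lock `lib_lock` and the function names are hypothetical, invented purely for illustration; this is not how any particular libc implements fork().

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical library-wide lock; the name is illustrative only. */
static pthread_mutex_t lib_lock = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the calling thread just before fork(). With a default
 * (non-recursive) mutex, this is where a thread that already holds
 * lib_lock would get stuck. */
static void prepare(void)      { pthread_mutex_lock(&lib_lock); }

/* Run after fork() in the parent and in the child respectively. */
static void parent_after(void) { pthread_mutex_unlock(&lib_lock); }
static void child_after(void)  { pthread_mutex_unlock(&lib_lock); }

/* Registers the three handlers; returns 0 on success. */
int install_atfork_handlers(void)
{
    return pthread_atfork(prepare, parent_after, child_after);
}

/* Forks from ordinary (non-signal) context, reaps the child and
 * returns its exit status, or -1 on any failure. */
int fork_and_reap(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(7);               /* child: leave immediately */

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called from normal context this works as expected. The hazard arises if a signal arrives while the thread is inside a critical section protected by `lib_lock` and the handler then calls fork(): `prepare()` tries to take a lock its own thread already holds.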
glibc, as one example, doesn't implement fork() that way. Instead it takes two approaches. Firstly, the locks that it does obtain are recursive (so if the current thread already holds them, their lock count is simply increased). Secondly, in the child process it unilaterally overwrites all the locks; see this extract from nptl/sysdeps/unix/sysv/linux/fork.c:
/* Reset the file list. These are recursive mutexes. */
fresetlockfiles ();
/* Reset locks in the I/O code. */
_IO_list_resetlock ();
/* Reset the lock the dynamic loader uses to protect its data. */
__rtld_lock_initialize (GL(dl_load_lock));
where each of the resetlock and lock_initialize functions ultimately calls glibc's internal equivalent of pthread_mutex_init(), effectively resetting the mutex regardless of any waiters.
I think the theory is that, by obtaining the (recursive) lock it's guaranteed that no other threads will be touching the data structures (at least in a way that might cause a crash), and then resetting the individual locks ensures the resources aren't permanently blocked. (Resetting the current thread's lock is safe since there are now no other threads to contend for the data structure, and indeed won't be until whatever function is using the lock has returned).
I'm not 100% convinced that this covers all eventualities (not least because if/when the signal handler returns, the function that's just had its lock stolen will try to unlock it, and the internal recursive unlock function doesn't protect against unlocking too many times!) - but it seems that a workable scheme could be built on top of async-signal-safe recursive locks.
Finally, even if fork is safe in
signal handlers, is it safe to fork in
a signal handler and then return from
the signal handler, or does a call to
fork in a signal handler always
necessitate a subsequent call to _exit
or one of the exec family of functions
before the signal handler returns?
I assume you're talking about the child process? (If fork() being async-signal-safe means anything then it should be possible to return in the parent!)
Not having found anything in the spec that states otherwise (though I may have missed it), I believe it should be safe, at least in the sense that returning from the signal handler in the child doesn't imply undefined behaviour in and of itself. That said, because a multi-threaded process has just forked, calling exec*() or _exit() in the child is probably the safest course of action.