In the Linux kernel, why does the SIGCHLD signal not interrupt the wait() system call?

Question

Consider this scenario: the parent process registers a signal handler for SIGCHLD and then calls wait() to wait for the child process to exit. While the parent is blocked in wait(), the child terminates, at which point the parent receives a SIGCHLD signal (regardless of any special flags being set).

In my tests, wait() was not interrupted by the SIGCHLD signal: it did not fail and return -1, but returned successfully after the signal handler had run. Why is that?

How, specifically, is the signal handler registered? What `sigaction()` or `signal()` call is used? — John Bollinger, Feb 13 '23 at 15:54

Erdal Küçük · Answer 1 · 2023-02-13T09:32:39.020


man wait

ERRORS

EINTR: WNOHANG was not set and an unblocked signal or a SIGCHLD was caught

Since you've established a signal handler for SIGCHLD, wait does not get interrupted.

For more info, see signal(7), especially:

Waiting for a signal to be caught
Synchronously accepting a signal
Signal mask and pending signals

A signal may be blocked, which means that it will not be delivered until it is later unblocked. ...
Execution of signal handlers
Interruption of system calls and library functions by signal handlers
If a signal handler is invoked while a system call or library function call is blocked, then either:
- the call is automatically restarted after the signal handler returns; or
- the call fails with the error EINTR.
Which of these two behaviors occurs depends on the interface and whether or not the signal handler was established using the SA_RESTART flag (see sigaction(2)). The details vary across UNIX systems; below, the details for Linux.

If a blocked call to one of the following interfaces is interrupted by a signal handler, then the call is automatically restarted after the signal handler returns if the SA_RESTART flag was used; otherwise the call fails with the error EINTR:
- wait
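
The two outcomes described in this part of signal(7) can be observed on a call other than wait(). The following is a minimal sketch, assuming Linux, that interrupts a blocking read() on an empty pipe with SIGALRM. The helper name `read_interrupted_by_alarm` and the 50 ms timer are illustrative choices, not anything taken from the manual page.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/* Write end of the pipe; set before the handler is installed. */
static volatile sig_atomic_t wake_fd = -1;

/* Handler: put one byte into the pipe so that a restarted read() can finish. */
static void on_alarm(int signo)
{
    (void)signo;
    int saved = errno;
    ssize_t w = write(wake_fd, "x", 1);   /* write() is async-signal-safe */
    (void)w;
    errno = saved;
}

/* Hypothetical helper: returns what read() returned (1 or -1) and stores
 * errno in *err on failure. sa_flags is either 0 or SA_RESTART. */
static ssize_t read_interrupted_by_alarm(int sa_flags, int *err)
{
    int fds[2];
    int rc = pipe(fds);
    assert(rc == 0);                      /* demo only: abort on setup failure */
    (void)rc;
    wake_fd = fds[1];

    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = sa_flags;
    sigaction(SIGALRM, &sa, &old);

    /* One-shot timer: SIGALRM arrives 50 ms from now, while read() blocks. */
    struct itimerval tv = { {0, 0}, {0, 50000} };
    setitimer(ITIMER_REAL, &tv, NULL);

    char c;
    errno = 0;
    ssize_t n = read(fds[0], &c, 1);      /* the pipe is empty, so this blocks */
    *err = (n < 0) ? errno : 0;

    sigaction(SIGALRM, &old, NULL);
    close(fds[0]);
    close(fds[1]);
    return n;
}
```

Under these assumptions, a call with `sa_flags` set to 0 should return -1 with EINTR, while a call with SA_RESTART should return 1, because the restarted read() picks up the byte that the handler wrote.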

edited Feb 13 '23 at 09:32

answered Feb 13 '23 at 07:15

Erdal Küçük


"then the call is automatically restarted after the signal handler returns if the SA_RESTART flag was used", but I didn't set the SA_RESTART flag. Perhaps the Linux implementation implicitly restarts wait() in this case? – meowrainy Feb 13 '23 at 08:36
@meowrainy Do you use `sigaction`? What do you specify in the `sa_flags` field? Furthermore, regarding _"depends on the interface and whether or not "_: I am not sure. Based on your description, it seems that `wait` gets restarted (assuming you didn't set the `SA_RESTART` flag). – Erdal Küçük Feb 13 '23 at 09:27
@meowrainy `signal(7)` mentions that the behaviour can vary between UNIX systems, but for Linux it states that the SA_RESTART flag is taken into account (as can be read above; I've added the relevant section, in bold). It seems further research is needed. – Erdal Küçük Feb 13 '23 at 09:32
Your quote from the `wait()` manual page does not support your claim "Since you've established a signal handler for `SIGCHLD`, `wait` does not get interrupted." On the contrary, it suggests that `wait()` would be interrupted by a `SIGCHLD` *even if that signal were blocked*. The rest of this manual is relevant, but also does not explain the specific behavior the OP asks about. – John Bollinger Feb 13 '23 at 13:12
Also, I can replicate the OP's reported behavior both with `SA_RESTART` in effect and without, and I can confirm that on Linux, `wait()` *does* respond to `SIGCHLD` even when that signal is blocked (but the handler, if one is registered, does not run if the signal is blocked). – John Bollinger Feb 13 '23 at 18:41

John Bollinger · Answer 2 · 2023-02-13T18:32:53.143

In my tests, wait() was not interrupted by the SIGCHLD signal: it did not fail and return -1, but returned successfully after the signal handler had run. Why is that?

Well, if the signal handler ran while the thread was blocked in wait() then that call was interrupted. I guess the question is why wait() then went ahead with collecting the child and returned successfully instead of failing with EINTR.

I can reproduce that behavior. The specifics of how you register the handler are unclear, but in my tests I see the handler running and wait() thereafter returning successfully even when the SA_RESTART flag is not set for SIGCHLD, which is generally a major factor in whether restartable system calls such as wait() fail with EINTR when interrupted by a signal.
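
A test along these lines might look like the following sketch. It assumes Linux; the handler only sets a flag and never calls wait(), and the names `wait_with_handler` and `struct wait_result` are hypothetical, introduced here for illustration rather than taken from the original tests.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t handler_ran = 0;

/* Deliberately does not collect the child: it only records that it ran. */
static void on_sigchld(int signo)
{
    (void)signo;
    handler_ran = 1;
}

struct wait_result {
    pid_t child;    /* pid returned by fork() */
    pid_t waited;   /* value returned by wait() */
    int   err;      /* errno if wait() failed, otherwise 0 */
    int   ran;      /* 1 if the handler had run when wait() returned */
};

/* sa_flags is passed to sigaction(): 0 (no SA_RESTART) or SA_RESTART. */
static struct wait_result wait_with_handler(int sa_flags)
{
    struct wait_result r = {0, 0, 0, 0};
    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = sa_flags;
    sigaction(SIGCHLD, &sa, &old);
    handler_ran = 0;

    r.child = fork();
    assert(r.child >= 0);                 /* demo only: abort if fork() fails */
    if (r.child == 0) {
        /* Child: give the parent time to block in wait(), then exit. */
        struct timespec ts = {0, 100 * 1000 * 1000};   /* 100 ms */
        nanosleep(&ts, NULL);
        _exit(7);
    }

    int status = 0;
    errno = 0;
    r.waited = wait(&status);
    r.err = (r.waited < 0) ? errno : 0;
    r.ran = handler_ran;

    sigaction(SIGCHLD, &old, NULL);       /* restore the previous disposition */
    return r;
}
```

With either value of `sa_flags`, the behavior reported in this thread corresponds to `waited == child`, `err == 0`, and `ran == 1`.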

I'm having trouble locating any documentation that specifically prescribes the observed combination of results for wait() + handler function + SIGCHLD, but the bottom line is that SIGCHLD is special. In particular, it has a special relationship with wait(), because the events that a system-generated SIGCHLD reports on are exactly the ones that a blocking wait() call is waiting for. Some of the manifestations of that specialness are:

- The sigaction() function defines two flags (SA_NOCLDSTOP and SA_NOCLDWAIT) modulating behavior related specifically to SIGCHLD, and none specific to any other signal.
- Even though the default action for SIGCHLD is documented as ignoring the signal, its actual default behavior is unique to that signal and distinct from the behavior obtained by explicitly setting the disposition to SIG_IGN.
- POSIX has special provisos for the behavior of the wait-family functions, as described in the notes in the wait() manual page, about how these functions are affected by the disposition and flags associated with SIGCHLD.
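
The difference between the default disposition and an explicit SIG_IGN can be checked with a short sketch like the one below. It assumes Linux, where (per the notes in wait(2)) children of a process that ignores SIGCHLD explicitly do not become zombies, and wait() fails with ECHILD once they are gone. The helper name `wait_errno_under` is hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: sets the SIGCHLD disposition (SIG_DFL or SIG_IGN),
 * forks a child that exits immediately, and calls wait().
 * Returns 0 if wait() collected the child, otherwise the errno it set. */
static int wait_errno_under(void (*disposition)(int))
{
    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = disposition;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGCHLD, &sa, &old);

    pid_t pid = fork();
    assert(pid >= 0);                     /* demo only: abort if fork() fails */
    if (pid == 0)
        _exit(0);

    errno = 0;
    pid_t w = wait(NULL);                 /* blocks until the child is gone */
    int err = (w == pid) ? 0 : errno;

    sigaction(SIGCHLD, &old, NULL);       /* restore the previous disposition */
    return err;
}
```

On such a system, `wait_errno_under(SIG_DFL)` should return 0 and `wait_errno_under(SIG_IGN)` should return ECHILD, even though both dispositions "ignore" the signal in the everyday sense.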

I don't think either POSIX or Linux explicitly says so, but it all comes down to a pending SIGCHLD being how the wait-family functions recognize that there is a child to collect. POSIX is sufficiently unspecific that I think other POSIX systems could do it differently, but to the best of my knowledge, using SIGCHLD for this purpose is both traditional and what Linux does. Enough so that signal-handling behavior is specifically designed to accommodate the common practice of using wait() inside a handler for SIGCHLD to provide for central processing of terminated children.

It is also notable that wait() will collect the child and clear the pending SIGCHLD even if that signal is blocked, analogously to how sigwait() will receive blocked signals. In that case, any registered handler is bypassed.
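
The blocked-signal case can be exercised with a sketch such as the following, assuming Linux. The names `wait_with_sigchld_blocked` and `struct blocked_result` are hypothetical. The `still_pending` field merely reports what sigpending() says after wait() returns; the sketch makes no assumption about its value.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t handler_ran = 0;

static void on_sigchld(int signo)
{
    (void)signo;
    handler_ran = 1;
}

struct blocked_result {
    int collected;      /* 1 if wait() returned the child's pid */
    int handler_ran;    /* 1 if the handler ran while SIGCHLD was blocked */
    int still_pending;  /* sigismember() result for SIGCHLD after wait() */
};

static struct blocked_result wait_with_sigchld_blocked(void)
{
    struct blocked_result r = {0, 0, 0};

    /* Register a handler, then block SIGCHLD so it cannot be delivered. */
    struct sigaction sa, old_sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGCHLD, &sa, &old_sa);

    sigset_t block, old_mask, pending;
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old_mask);
    handler_ran = 0;

    pid_t pid = fork();
    assert(pid >= 0);                     /* demo only: abort if fork() fails */
    if (pid == 0)
        _exit(0);

    pid_t w = wait(NULL);
    r.collected = (w == pid);
    r.handler_ran = handler_ran;          /* sampled before unblocking */
    sigpending(&pending);
    r.still_pending = sigismember(&pending, SIGCHLD);

    /* Restore the old disposition first, then the old signal mask. */
    sigaction(SIGCHLD, &old_sa, NULL);
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    return r;
}
```

The behavior described above corresponds to `collected == 1` and `handler_ran == 0`.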

Your case of establishing a handler for SIGCHLD that does not collect the status information for the child is unusual, but consider what needs to happen here:

- A SIGCHLD has been received, it is not blocked, and a signal handler has been registered for it, so the signal handler must run and the SIGCHLD must be removed from the pending list.
- After your particular handler runs, the status information for the child has not yet been consumed, so it must be consumed when control returns to wait(). Otherwise, it can never be consumed and reported, for receipt of a SIGCHLD is how the system is triggered to do that, and the context in which the status information is delivered.

I anticipate that your wait() would fail with either ECHILD or EINTR if the signal handler collected the waited-for child via its own wait() call. Which one depends in part on whether the SA_RESTART flag is set for SIGCHLD. I anticipate that it would fail with EINTR if there was a running child, and the wait() was interrupted by a synthetic SIGCHLD, and the SA_RESTART flag was not set.
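
These predictions can be checked with a sketch like the one below, in which the handler itself reaps children with waitpid(). It assumes Linux and uses the hypothetical names `wait_with_reaping_handler` and `struct reap_result`. It does not assume which outcome occurs: depending on whether the outer wait() or the handler gets to the child first, the outer call may succeed or fail, so the sketch only records what happened.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t handler_reaped = 0;

/* Handler that collects every terminated child without blocking. */
static void reap_in_handler(int signo)
{
    (void)signo;
    int saved = errno;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        handler_reaped++;
    errno = saved;
}

struct reap_result {
    pid_t child;           /* pid returned by fork() */
    pid_t waited;          /* value returned by the outer wait() */
    int   err;             /* errno if the outer wait() failed, otherwise 0 */
    int   handler_reaped;  /* number of children collected by the handler */
};

/* sa_flags is passed to sigaction(): 0 (no SA_RESTART) or SA_RESTART. */
static struct reap_result wait_with_reaping_handler(int sa_flags)
{
    struct reap_result r = {0, 0, 0, 0};
    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = reap_in_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = sa_flags;
    sigaction(SIGCHLD, &sa, &old);
    handler_reaped = 0;

    r.child = fork();
    assert(r.child >= 0);                 /* demo only: abort if fork() fails */
    if (r.child == 0) {
        struct timespec ts = {0, 100 * 1000 * 1000};   /* 100 ms */
        nanosleep(&ts, NULL);
        _exit(0);
    }

    errno = 0;
    r.waited = wait(NULL);
    r.err = (r.waited < 0) ? errno : 0;
    r.handler_reaped = handler_reaped;

    sigaction(SIGCHLD, &old, NULL);       /* restore the previous disposition */
    return r;
}
```

Whatever the outcome, exactly one of the two calls should have collected the child: either `waited == child` with `handler_reaped == 0`, or `waited == -1` (with `err` telling ECHILD from EINTR) and `handler_reaped == 1`.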
