I'm writing a SIGCHLD handler and I'm wondering under what conditions would a call to waitpid() return -1?

More specifically, if I create a loop in which I call waitpid(...) and want it to run until all terminated child processes have been reaped, would I be iterating until waitpid(...) returns -1? Otherwise, how can I know if there are any more children that require reaping?

fvgs
  • A signal might interrupt the waiting... – Kerrek SB Jul 05 '14 at 12:47
  • Have you tried reading the man page? It explains the reasons for returning -1. – Barmar Jul 05 '14 at 12:50
  • I have looked at the man page. Either I'm not understanding the description therein, or waitpid() returning -1 may not be what's necessary for the loop I described to have the effect of reaping all terminated child processes. In either case, I'm looking for an answer. – fvgs Jul 05 '14 at 12:54
  • Handling sigchld is not a good practice. It's dangerously incompatible with any library code that makes child processes, including popen. You should instead only waitpid for children you're aware of, from the logical module that created them. – R.. GitHub STOP HELPING ICE Jul 05 '14 at 14:48

1 Answer

waitpid() can return -1 under these circumstances:

  1. The process has no children that it has not yet waited for. errno is set to ECHILD in this case. If you're looping to reap all children, or all children in your process group (i.e. you set pid to -1 or 0), you should break out of the loop when this happens. This can also happen if the signal action for SIGCHLD is set to SIG_IGN or the SA_NOCLDWAIT flag is set for the signal: terminated children are then discarded instead of becoming zombies, so a blocking waitpid() waits until every child has exited and then fails with ECHILD.
  2. A problem was detected in the arguments. If the problem is with the options argument, errno will be set to EINVAL. If the pid is greater than 0 (so you're waiting for a specific child) and it doesn't exist or is not a child of this process, errno will be set to ECHILD; this is probably not applicable if you're in a wait loop. These errors generally indicate a bug in your code, so you should probably report or log the error and exit.
  3. The call was interrupted by a signal. errno will be set to EINTR. You should probably stay in the loop when this happens.
Barmar
  • Ok, thanks for the answer. The first part is what had me confused, since the man page only really mentions that case explicitly for wait() but not for waitpid() or waitid(). Btw, setting the pid argument to 0 will only reap processes with the same process group ID, so it won't necessarily have the same behavior as setting it to -1. – fvgs Jul 05 '14 at 13:07
  • Your man page must be different from mine. In mine, the paragraph about `waitpid` says `If there are no children not previously awaited, -1 is returned with errno set to [ECHILD].` – Barmar Jul 05 '14 at 13:12
  • I didn't say that `-1` and `0` have the same behavior, just that they don't wait for a specific child. When you wait for a specific child, `ECHILD` means the child is not valid; when you wait for any child or any child in the group, `ECHILD` means there aren't any left. – Barmar Jul 05 '14 at 13:14
  • Also `waitpid()` can return -1 if it *does* wait for a child *and* either the `sigaction` for `SIGCHLD` is set to `SIG_IGN` or the `SA_NOCLDWAIT` flag is set. See the manual page for `wait()`. – abligh Dec 08 '15 at 12:43
  • @abligh Thanks, added to the answer. – Barmar Dec 08 '15 at 16:10