I tried to find an answer to my question in this post: "Signal handler and waitpid coexisting", but it still isn't clear to me.

Let me explain my problem:

I'm trying to write a C program that involves IPC between a parent process and its children. The parent process creates N child processes, then waits for them to terminate in a loop like this:

while((pid_term = waitpid(-1, &status, 0)) != -1)

After X seconds, the parent receives SIGALRM, which it catches with a handler installed through the sigaction system call:

struct sigaction act;
act.sa_handler = alarmHandler;
sigemptyset(&act.sa_mask);
act.sa_flags = 0;
sigaction(SIGALRM, &act, NULL);

But when the handler function returns, waitpid also returns -1, and the parent process leaves the while loop above. At the moment, the handler function has an empty body.

What is happening here? Why does waitpid() return -1 after the handler runs even though most of the children are still alive? And why doesn't this happen with the signal() function?

Jonathan Leffler
gaet
  • I don’t know why this doesn’t happen with the `signal` function unless you just use that to ignore the signal, but when a system call like `waitpid` exits with `-1` you should check `errno`, which for `waitpid` can be [one of these values](http://man7.org/linux/man-pages/man2/waitpid.2.html#ERRORS). In this case, I strongly expect the result will be `EINTR`, the system call was interrupted by a signal (in this case your `SIGALRM`). You need to re-start the `waitpid` system call; the program can’t automatically jump back into it when the signal handler returns. – Daniel H Dec 27 '17 at 17:25
  • The `signal()` function on different systems does things differently, depending on the legacy of that system. The standards allow great flexibility, but almost no control, over what happens. Using `sigaction()` allows for basically all of the legacy operations, but allows you complete control over what happens. If you want dependable behaviour, use `sigaction()` and be specific. – Jonathan Leffler Dec 27 '17 at 17:55
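
The manual restart that Daniel H describes could look roughly like the sketch below. It keeps sa_flags = 0 as in the question and simply retries waitpid whenever it fails with EINTR. The helper name reap_all_retrying, the handler name on_alarm, the 2-second child sleep, and the 1-second alarm are all assumptions made for illustration; none of them come from the original post.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Empty handler, as in the question. */
static void on_alarm(int signo)
{
    (void)signo;
}

/*
 * Hypothetical helper (not from the original post): forks n_children
 * children that each sleep for 2 seconds, arms a 1-second alarm, and
 * reaps every child, retrying waitpid() whenever it fails with EINTR.
 * Returns the number of children reaped, or -1 if setup fails.
 * *eintr_count receives how many times waitpid() was interrupted.
 */
static int reap_all_retrying(int n_children, int *eintr_count)
{
    assert(n_children > 0 && eintr_count != NULL);
    *eintr_count = 0;

    struct sigaction act;
    act.sa_handler = on_alarm;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;               /* no SA_RESTART, as in the question */
    if (sigaction(SIGALRM, &act, NULL) == -1)
        return -1;

    for (int i = 0; i < n_children; i++) {
        pid_t pid = fork();
        if (pid == -1)
            return -1;
        if (pid == 0) {             /* child: stay alive past the alarm */
            sleep(2);
            _exit(0);
        }
    }

    alarm(1);                       /* SIGALRM arrives while children run */

    int reaped = 0;
    for (;;) {
        int status;
        pid_t pid_term = waitpid(-1, &status, 0);
        if (pid_term > 0) {
            reaped++;               /* one child terminated */
        } else if (errno == EINTR) {
            (*eintr_count)++;       /* interrupted by the handler: retry */
        } else {
            break;                  /* ECHILD (no children left) or a real error */
        }
    }
    return reaped;
}
```

Calling reap_all_retrying(3, &n) from main should reap all three children even though the alarm interrupts the wait. The key point is that -1 from waitpid only ends the loop when errno is something other than EINTR.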

1 Answer


The default behavior of signal handlers established by sigaction is to interrupt blocking system calls; if you check errno after waitpid returns -1, you should find it set to EINTR. This behavior is almost never what you want; it's only the default for backward compatibility's sake. You can turn it off by setting the SA_RESTART bit in sa_flags:

struct sigaction act;
act.sa_flags = SA_RESTART;
act.sa_handler = alarmHandler;
sigemptyset(&act.sa_mask);
sigaction(SIGALRM, &act, 0);
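
As a rough end-to-end illustration, the sketch below plugs that setup into the loop from the question, unchanged. The helper name reap_all_restarting, the handler record_alarm, the alarm_fired flag, and the 1-second and 2-second timings are hypothetical, chosen only so the alarm goes off while the children are still running.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Set by the handler so the caller can confirm the alarm was delivered. */
static volatile sig_atomic_t alarm_fired = 0;

static void record_alarm(int signo)
{
    (void)signo;
    alarm_fired = 1;
}

/*
 * Hypothetical helper (not from the original post): installs the handler
 * with SA_RESTART, forks n_children children that sleep for 2 seconds,
 * arms a 1-second alarm, then runs the question's waitpid loop as-is.
 * Returns the number of children reaped, or -1 if setup fails.
 * *final_errno receives errno from the waitpid call that ended the loop.
 */
static int reap_all_restarting(int n_children, int *final_errno)
{
    assert(n_children > 0 && final_errno != NULL);

    struct sigaction act;
    act.sa_flags = SA_RESTART;      /* restart waitpid after the handler */
    act.sa_handler = record_alarm;
    sigemptyset(&act.sa_mask);
    if (sigaction(SIGALRM, &act, 0) == -1)
        return -1;

    for (int i = 0; i < n_children; i++) {
        pid_t pid = fork();
        if (pid == -1)
            return -1;
        if (pid == 0) {             /* child: stay alive past the alarm */
            sleep(2);
            _exit(0);
        }
    }

    alarm(1);                       /* SIGALRM arrives while children run */

    int reaped = 0;
    int status;
    pid_t pid_term;
    /* Same loop as in the question: it now ends only when waitpid fails
       for a reason other than the signal, normally ECHILD. */
    while ((pid_term = waitpid(-1, &status, 0)) != -1)
        reaped++;

    *final_errno = errno;
    return reaped;
}
```

With SA_RESTART set, the handler still runs when the alarm fires, but the loop should keep going until every child has been reaped and waitpid fails with ECHILD.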

One of the most important reasons to use sigaction instead of signal is that with signal it is unpredictable whether the signal handler will interrupt blocking system calls. (The System V lineage picked one semantic and the BSD lineage picked the other: System V-derived systems interrupt the call, while BSD-derived systems restart it.)

zwol
  • to zwol and Daniel H. You've been precious!! And you have avoided my madness!! Thanks very much, it works!! – gaet Dec 27 '17 at 17:35