I have this code:

#include <signal.h>
#include <unistd.h>
#include <stdio.h>
#include <sys/wait.h>
#include <stdlib.h>
int cpt = 0;
void handler (int sig) {
     cpt++ ;
}
int main() {
  int i;
  signal(SIGCHLD, handler);
  for (i = 0; i < 5; i++) {
       if (fork() == 0) {
           exit(0);
       }
   }
   while (wait(NULL) != -1) ;
   printf("cpt = %d\n", cpt);
   return 0;
}

To my understanding, this program should always print cpt = 5, but when I run it on my machine it prints different values (3, 4, 5). Why is that?

Rafik Bouloudene
  • https://en.cppreference.com/w/c/program/signal - the "signal handler" section explains the constraints you face in a signal handler. – Mat Jun 03 '22 at 12:25
  • I believe that `SIGCHLD` is a little different from other signals, as it's level triggered and not edge triggered: that means that it delivers the signal if there are any unwaited for children, and if the `while (wait(...))` loop happens to consume two children in a row, that's one less `SIGCHLD` delivered. – Steve Friedl Jun 03 '22 at 12:30
  • `cpt` is of bad type, see [here](https://stackoverflow.com/questions/42182435/volatile-for-signal-handler-and-multi-threading). Apart from that, incrementing is not an atomic operation: you actually need to load, increment and store the value – if signals can preempt one another, then one signal handler might load the value, get preempted by another one that loads the same value, increments and stores it, and finally the first handler continues based on the outdated value, eliding the increment of the preempting handler call... – Aconcagua Jun 03 '22 at 12:33
  • About [signals preempting one another](https://stackoverflow.com/questions/15651964/linux-can-a-signal-handler-excution-be-preempted) – you should rather use `sigaction`, by the way. – Aconcagua Jun 03 '22 at 12:40
  • `SIGCHLD` is not one of the signals that pre-empts itself, and though it's wise to use the proper `volatile sig_atomic_t` type, neither of these explains the surprising behavior. It's that the while loop is consuming the child processes before the signal handler can be dispatched every time. – Steve Friedl Jun 03 '22 at 12:43

1 Answer

The SIGCHLD signal is a little funny and doesn't work like you'd expect: you'd think you get one signal per child death, but that's not how it works.

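Instead, it's more of a level-triggered thing: ordinary signals are not queued, so if several children exit before the handler gets to run, the pending SIGCHLDs collapse into one. All the signal really tells you is that there is at least one un-waited-for child.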

In the loop you provided, which burns through wait() calls, several children can be reaped before the signal handler gets around to running, hence fewer trips through the handler.
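One way to see that a single SIGCHLD can stand for several exited children is the sketch below. The helper name count_coalesced_deliveries and the MAX_KIDS cap are made up for this illustration. It blocks SIGCHLD, forks n children that exit immediately, uses waitid() with WNOWAIT to confirm each one has exited without reaping it, and only then unblocks the signal. It assumes SIGCHLD is an ordinary, non-queued signal, as it is on Linux.

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigaction(), waitid(), WNOWAIT */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_KIDS 16  /* arbitrary cap for this sketch */

static volatile sig_atomic_t deliveries = 0;

/* Counts handler invocations only; it does not reap anything. */
static void count_delivery(int sig) {
    (void)sig;
    deliveries++;
}

/* Hypothetical helper: returns how many times the handler ran for n child
   exits that all happened while SIGCHLD was blocked, or -1 on error. */
int count_coalesced_deliveries(int n) {
    struct sigaction sa, old_sa;
    sigset_t block, old_mask;
    pid_t kids[MAX_KIDS];
    siginfo_t info;
    int started = 0, seen;

    if (n > MAX_KIDS) n = MAX_KIDS;
    deliveries = 0;

    sa.sa_handler = count_delivery;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGCHLD, &sa, &old_sa) == -1) return -1;

    /* Hold SIGCHLD back so every exit happens before any delivery. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old_mask);

    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0) _exit(0);            /* child: exit immediately */
        if (pid > 0) kids[started++] = pid;
    }

    /* WNOWAIT: wait until each child has exited, but leave it un-reaped. */
    for (int i = 0; i < started; i++)
        waitid(P_PID, (id_t)kids[i], &info, WEXITED | WNOWAIT);

    /* Unblock: any pending SIGCHLD is delivered before this call returns. */
    sigprocmask(SIG_UNBLOCK, &block, NULL);
    seen = (int)deliveries;

    /* Clean up: reap the children, restore the old handler and mask. */
    for (int i = 0; i < started; i++)
        waitpid(kids[i], NULL, 0);
    sigaction(SIGCHLD, &old_sa, NULL);
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    return seen;
}
```

On Linux, count_coalesced_deliveries(5) should return 1: five children have exited, but the handler runs only once.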

Others have pointed out that you should be using a volatile sig_atomic_t variable, and though this is a good idea, it's not why you're seeing this behavior.

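I believe the most reliable way to account for every child is to wait for children in the signal handler itself, which makes it behave more like an edge-triggered signal. Note that a single wait() per handler invocation can still undercount if several children exit before the handler runs; reaping in a loop with waitpid(-1, NULL, WNOHANG) and counting the reaped children avoids that.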

Of course, you're pretty limited in what you can do inside a signal handler, so if your application already has a good regimen for waiting for child processes, you likely don't need a SIGCHLD handler at all.

#include <signal.h>
#include <unistd.h>
#include <stdio.h>
#include <sys/wait.h>
#include <stdlib.h>

static volatile sig_atomic_t cpt = 0;

static void handler(int sig) {
    (void)sig;     // unused
    cpt++;
    wait(NULL);    // ADD ME: reap one child per handler invocation
}

int main(void) {
    int i;
    signal(SIGCHLD, handler);
    for (i = 0; i < 5; i++) {
        if (fork() == 0) {
            exit(0);    // child exits immediately
        }
    }
    while (wait(NULL) != -1) ;    // reap whatever the handler has not
    printf("cpt=%d\n", (int)cpt);
    return 0;
}

As an alternative, if the while() loop were not so tight and had other processing (or even an explicit delay), the race would be much less likely and you'd probably see all five SIGCHLD deliveries, though nothing guarantees it.
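For completeness, here is a sketch of the usual way to wait for children in the handler: reap in a loop with waitpid(-1, NULL, WNOHANG) and count reaped children instead of handler invocations, so one delivery that stands for several exits is still fully accounted for. It uses sigaction() rather than signal(), as suggested in the comments on the question. The names reap_children and spawn_and_count are invented for this example, and the sigsuspend() wait is just one possible way for the parent to idle until every child has been collected.

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigaction() and sigsuspend() */
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Number of children reaped so far; written only by the handler. */
static volatile sig_atomic_t reaped = 0;

/* Reap every child that has already exited, not just one. */
static void reap_children(int sig) {
    int saved_errno = errno;          /* waitpid() may overwrite errno */
    (void)sig;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    errno = saved_errno;
}

/* Hypothetical helper: forks n children that exit immediately and returns
   how many the handler reaped, or -1 on error. */
int spawn_and_count(int n) {
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, wait_mask;
    int started = 0;

    reaped = 0;
    sa.sa_handler = reap_children;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &sa, &old_sa) == -1) return -1;

    /* Block SIGCHLD so the "reaped < started" check below cannot race
       with the handler; the signal is let through only in sigsuspend(). */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old_mask);
    wait_mask = old_mask;
    sigdelset(&wait_mask, SIGCHLD);

    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0) _exit(0);       /* child: exit immediately */
        if (pid > 0) started++;
    }

    /* Sleep until the handler has collected every child we started. */
    while (reaped < started)
        sigsuspend(&wait_mask);

    /* Restore the previous handler and signal mask. */
    sigaction(SIGCHLD, &old_sa, NULL);
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    return (int)reaped;
}
```

With this handler, spawn_and_count(5) returns 5 no matter how the five exits are grouped into signal deliveries, because the count tracks reaped children rather than handler calls.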

Steve Friedl
  • When you say un-waited-for children, do you mean the parent process hasn't acknowledged the child's signal yet? – Rafik Bouloudene Jun 03 '22 at 12:48
  • Yes: `SIGCHLD` does not mean "exactly one child has died", it means "there is at least one child you've not waited for yet", so your `wait()` loop is outrunning the signal handler. – Steve Friedl Jun 03 '22 at 12:50
  • Just to make sure I got it right: while the parent process is executing the while loop, it receives the first signal but can't run the handler yet (since it's busy in the loop), and during that time the second child also ends. Is that right? – Rafik Bouloudene Jun 03 '22 at 13:03
  • I have not been able to find the proper technical description for what's going on, but I believe that `SIGCHLD` delivery happens only every so often, so if a child process dies and is *immediately* picked up by the `wait()` loop, then no signal will be delivered for it (hence: a race condition). – Steve Friedl Jun 03 '22 at 13:05