
I am playing around with system calls in C and I am stuck trying to understand this program I wrote:

#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char* argv[])
{
    int a;
    char *args[] = {"sleep", " 10", NULL};

    a = fork();
    int stat;

    if (a == 0) {
        setpgid(getpid(), getpid());
        printf("%d\n", getpgid(getpid()));
        execvp(args[0], args);
    } else {
        int t2;
        waitpid(-a, &t2, 0);
    }

    printf("Parent pid = %d\n", getpid());
    printf("Child pid = %d\n", a);
}

As I understand it, I have set the child's pgid to its own pid. When I call waitpid with -a as the argument, I am asking it to block until any process in the process group with pgid a has finished. However, the output of the program is not what I expected: the child process isn't being reaped at all. It is as if waitpid were in non-blocking mode. Output:

Parent pid = 11372
Child pid = 11373
11373

(The output is instantaneous, it doesn't wait for 10 seconds!)

EDIT: I added printf("Here") and exit(1) below execvp and printed waitpid's return value, as suggested in the comments. "Here" doesn't get printed, and waitpid returns -1.

Black Jack 21
  • Are you sure your exec() call succeeded? Add a call to exit(1); just after it, in case it fails... – Jean-Baptiste Yunès Feb 06 '19 at 09:43
  • First of all check what `waitpid` *returns*. And if it returns zero then check the value of `t2` to learn the exit code of the child process. And don't forget to check if `execvp` returns as well. – Some programmer dude Feb 06 '19 at 09:43
  • I tried doing as you both suggested. Control flow doesn't go below execvp and waitpid is returning -1. – Black Jack 21 Feb 06 '19 at 09:49
  • If `waitpid()` is returning -1, you should print out `errno` to see why it's failing (`perror()` or `strerror()` can be used to get a human readable error message instead of a cryptic number) – Shawn Feb 06 '19 at 10:03

1 Answer


The problem is a race condition. After the fork here:

a = fork();

if the child runs first, the process group with pgid a is created here:

if(a==0){
    setpgid(getpid(),getpid());

and the parent then waits here:

waitpid(-a,&t2,0);

but if the parent gets to run first, the process group a does not exist yet: no child of the parent has that pgid, so waitpid() fails immediately with ECHILD. The second case is evidently what happens on your system.

You will have to find some way to make sure that the setpgid() call in the child runs before the waitpid() call in the parent. The complex way is to synchronize with semaphores; the easy (hacky) way is a short delay: a usleep(1) before the waitpid() will probably suffice.
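
A different technique from the two mentioned above, and a common idiom for this situation, is to call setpgid() in the parent as well as in the child, so that whichever process runs first creates the group. The following is only a minimal sketch under that assumption; the helper name run_in_own_group is hypothetical and not part of the question's code.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv in its own process group and wait for it.
   Returns the child's exit code, or -1 on error or abnormal termination. */
int run_in_own_group(char *const argv[])
{
    pid_t a = fork();
    if (a < 0) {
        perror("fork");
        return -1;
    }

    if (a == 0) {
        /* Child: become leader of a new group whose id is our own pid. */
        setpgid(0, 0);
        execvp(argv[0], argv);
        perror("execvp");   /* reached only if exec failed */
        _exit(127);
    }

    /* Parent: make the same call on behalf of the child, so the group
       exists before waitpid() below no matter who is scheduled first.
       EACCES means the child has already exec'd, in which case it has
       already called setpgid() itself, so that error is ignored. */
    if (setpgid(a, a) < 0 && errno != EACCES)
        perror("setpgid");

    int status;
    if (waitpid(-a, &status, 0) < 0) {
        perror("waitpid");  /* prints the reason, e.g. ECHILD */
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with the question's arguments, for example `char *args[] = {"sleep", "10", NULL}; run_in_own_group(args);`, the parent should now block until sleep exits. The perror() calls also report why waitpid() failed, as suggested in the comments.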

Ctx