How to cancel waitpid if child has no status change?

Question

Disclaimer: Absolute newbie in C, I was mostly using Java before.

In many C beginner tutorials, waitpid is used in process management examples to make a parent wait for its child processes to finish (or to have a status change, using options like WUNTRACED). However, I couldn't find any information about how to continue if no such status change occurs, either through direct user input or programmatically (e.g. a timeout). So what is a good way to undo waitpid? Something like SIGCONT for stopped processes, but for processes blocked in waitpid instead.

Alternatively if the idea makes no sense, it would be interesting to know why.

[waitpid manual](https://linux.die.net/man/2/waitpid): "WNOHANG return immediately if no child has exited.". Is that what you need? — kaylum, May 24 '20 at 22:03
Well i was thinking more of user input or timeout so i thought not at first glance, but the solution by dbush uses WNOHANG in a timeout loop which could also be a wait for user input loop. Was hoping i wouldn't need to run a (potentially) infinite loop but just wait for an interrupt but it should work. Thanks! — ptstone, May 24 '20 at 22:09
You didn't make that very clear. You could handle `SIGCHLD` instead. But still not sure if that is what you really mean. — kaylum, May 24 '20 at 22:16
Sorry for being unclear. I was just looking at the tutorials and thought: Once i set a parent process to wait with waitpid, how can i undo this after the fact without ANY change to the child. So not if child does x do z, but child can do whatever, i just want to revert the waitpid on parent. — ptstone, May 24 '20 at 22:42

score 2 · Accepted Answer · edited Jun 20 '20 at 09:12

How about using alarm()? alarm() delivers SIGALRM once the countdown expires (see the alarm() man page for details). According to the signal man page, the default disposition of SIGALRM is to terminate the process, so you need to register a handler for it. If the handler is installed without SA_RESTART, a blocked waitpid() is interrupted when the signal arrives and fails with EINTR. The code looks like this...

#include <unistd.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

void sigalrm(int signo)
{
    return; // Do nothing !
}

int main()
{
    struct sigaction act, oldact;

    act.sa_handler = sigalrm;   // Set the signal handler
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;

#ifdef SA_INTERRUPT // If interrupt defined set it to prevent the auto restart of sys-call
    act.sa_flags |= SA_INTERRUPT;
#endif

    sigaction(SIGALRM, &act, &oldact);

    pid_t fk_return = fork();
    if (fk_return == 0) {   // Child never returns
        for( ; ; );
    }

    unsigned int wait_sec = 5;
    alarm(wait_sec);    // Request for SIGALRM

    time_t start = time(NULL);
    pid_t w = waitpid(-1, NULL, 0);
    int tmp_errno = (w == -1) ? errno : 0;  // errno is only meaningful if waitpid failed; save it before later calls change it.
    time_t end = time(NULL);

    alarm(0);  // Clear a pending alarm
    sigaction(SIGALRM, &oldact, NULL);

    if (tmp_errno == EINTR) {
        printf("Child Timeout, waited for %ld sec\n", (long)(end - start));  // time_t is not an int: cast for printf
        kill(fk_return, SIGINT);
        exit(1);
    }
    else if (tmp_errno != 0)    // Some other fatal error
        exit(1);

    /* Proceed further */

    return 0;
}

OUTPUT

Child Timeout, waited for 5 sec

Note: You don't need to worry about SIGCHLD because its default disposition is to ignore.

EDIT

For completeness: it is guaranteed that SIGALRM is not delivered to the child. This is from the man page of alarm():

Alarms created by alarm() are preserved across execve(2) and are not inherited by children created via fork(2).

EDIT 2

I don't know why it didn't strike me at first. A simpler approach is to block SIGCHLD and call sigtimedwait(), which takes a timeout. The code goes like this...

#include <unistd.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main()
{
    sigset_t sigmask;
    sigemptyset(&sigmask);
    sigaddset(&sigmask, SIGCHLD);
    sigprocmask(SIG_BLOCK, &sigmask, NULL);

    pid_t fk_return = fork();
    if (fk_return == 0) {   // Child never returns
        for( ; ; );
    }

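    if (sigtimedwait(&sigmask, NULL, &((struct timespec){5, 0})) < 0) {
        if (errno == EAGAIN) {  // Timeout expired, no SIGCHLD arrived
            printf("Timeout\n");
            kill(fk_return, SIGINT);
            exit(1);
        }
        perror("sigtimedwait"); // Any other error (e.g. EINTR): don't fall through to a blocking waitpid
        exit(1);
    }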

    waitpid(fk_return, NULL, 0);    // Child should have terminated by now.

    /* Proceed further */

    return 0;
}

OUTPUT

Timeout

Thank you for helping me understand process handling & signal use better, and also for the examples that provided orientation. All 3 answers were really helpful & upvoted, but since this one helped me most I selected it as the answer. – ptstone, May 25 '20 at 04:19

score 1 · Answer 2 · answered May 24 '20 at 22:04

The third argument to waitpid takes a set of flags. You want to include the WNOHANG flag, which tells waitpid to return immediately if no child process has exited.

After adding this option, you would sit in a loop, sleep for some period of time, and try again if nothing has exited. Repeat until either a child has exited or your timeout has passed.

dbush

I was hoping there was a way without a countdown loop, e.g. the user pressing ESC or sending an interrupt, but the solution should work for me. I'll try it out tomorrow and if it works I'll mark it as the solution. Thank you. – ptstone May 24 '20 at 22:14
You should handle user input and other events as you normally would: you have some event loop that receives the input and dispatches it. All that you're adding is a poll of the PIDs you're interested in. The "countdown" loop is a normal thing, known as the event loop, except it should do more than just wait. If it's "just" a non-GUI process, you can still have an event loop - it makes life easier. – Kuba hasn't forgotten Monica May 25 '20 at 00:08
Thanks for helping me understand waitpid better. All 3 answers were really helpful & upvoted, but since i can only select one as the solution i marked the one by Mohith Reddy. – ptstone May 25 '20 at 04:16

score 1 · Answer 3 · 2020-05-25T00:00:16.343

Waiting for a process to die on a typical Unix system is an absolute PITA. The portable way is to use signals to interrupt the wait function: SIGALRM for a timeout, SIGTERM/SIGINT and others for a "user input" event. This relies on process-global state (signal dispositions and the alarm timer) and thus might be impossible to do in some programs.

The non-portable way would be to use pidfd_open with poll/epoll on Linux, or kqueue with an EVFILT_PROC filter on BSDs.

Note that on Linux this only lets you wait for the process to terminate; you still have to retrieve its status afterwards, e.g. via waitid with P_PIDFD.
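
A rough Linux-only sketch of the pidfd idea follows. The helper name wait_pidfd_timeout and its return convention are assumptions. It uses the raw system call because the glibc pidfd_open() wrapper is only available in newer releases, and it reaps with plain waitpid (the PID is known here) rather than waitid with P_PIDFD, to avoid depending on newer headers.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <poll.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (needs Linux 5.3 or later). Returns 1 if `pid`
 * terminated and was reaped, 0 on timeout, -1 on error with errno set
 * (e.g. ENOSYS on kernels without pidfd_open). */
int wait_pidfd_timeout(pid_t pid, int *status, int timeout_ms)
{
    int pidfd = (int)syscall(SYS_pidfd_open, pid, 0);
    if (pidfd == -1)
        return -1;

    struct pollfd pfd = { .fd = pidfd, .events = POLLIN };
    int n;
    do {
        n = poll(&pfd, 1, timeout_ms);  /* a retry restarts the full timeout */
    } while (n == -1 && errno == EINTR);

    int saved = errno;
    close(pidfd);
    errno = saved;

    if (n <= 0)
        return n;                       /* 0 = timeout, -1 = poll error */

    /* The pidfd became readable, so the child has terminated: reap it. */
    return waitpid(pid, status, 0) == pid ? 1 : -1;
}
```

Because this is an ordinary poll call, other descriptors (standard input, a signalfd) could be added to the pollfd array so the same call also wakes up on user events.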

If you still want to mix in "user events", add signalfd to the list of descriptors on Linux or EVFILT_SIGNAL filter of kqueue on BSDs.

Another possible solution is to spawn a "process reaper" thread which is responsible for reaping of all processes and setting some event in a process object of your choice: futex word, eventfd etc. Waiting on such objects can be done with a timeout. This requires everyone to agree to use the same interface for process spawning which might or might not be reasonable. Afaik Java implementations use this strategy.

Thank you for helping me understand process handling & signal use better. All 3 answers were really helpful & upvoted, but since i can only select one solution i marked the one by Mohith Reddy. — ptstone, May 25 '20 at 04:18
