In Linux, one can wait on any FD using select, poll or epoll. It is also possible to wait for child processes to change state using wait, waitpid or waitid. However, I can't figure out a way to combine these operations, i.e., to block the calling process until either some FD becomes ready or a child process changes state.

I can use polling, by repeatedly calling non-blocking epoll then waitid, but that is wasteful.

It is possible to create a pidfd for a child process (which is accepted by epoll), but pidfd only supports waiting for child termination, while I wish to wait for any state change (specifically, for ptrace stops).

Is this not possible in Linux?
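
For reference, the pidfd approach mentioned above could look roughly like the sketch below. It assumes Linux 5.3 or later and kernel headers that define `SYS_pidfd_open`; `open_pidfd` and `wait_fd_or_exit` are hypothetical helper names, not existing APIs. It wakes up when the child terminates but not when it stops, which is exactly the limitation described.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical wrapper: uses the raw syscall because older glibc versions
 * ship no pidfd_open() function. Returns a pidfd, or -1 with errno set. */
static int open_pidfd(pid_t pid)
{
    return (int)syscall(SYS_pidfd_open, pid, 0);
}

/* Hypothetical helper: block until fd has an event or the process behind
 * pidfd terminates. Returns 1 for fd, 2 for termination, -1 on error. */
static int wait_fd_or_exit(int fd, int pidfd)
{
    assert(fd >= 0 && pidfd >= 0);

    struct pollfd fds[2] = {
        { .fd = fd,    .events = POLLIN },
        { .fd = pidfd, .events = POLLIN },
    };
    int n;

    do {
        n = poll(fds, 2, -1);
    } while (n == -1 && errno == EINTR);
    if (n == -1)
        return -1;

    /* A pidfd polls as readable once the process has terminated;
     * stops (including ptrace stops) and continues do not wake it. */
    if (fds[1].revents & POLLIN)
        return 2;
    if (fds[0].revents)
        return 1;
    return -1;
}
```

After a return value of 2 the child is still a zombie and has to be reaped with waitpid() or waitid().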

shapaz
  • ptrace sends a signal to a process. You can start from that information and elaborate. Linux also has the `signalfd` mechanism for catching signals. And of course you know that the `sigaction` interface has a way of returning child process status. – user14063792468 Jun 04 '22 at 04:27
  • You could call the selector with a timeout, in a loop. When the selector returns after the timeout, you can then check the process state without blocking (e.g. `waitpid(pid, &status, WNOHANG)`). – Keith Jul 02 '22 at 22:49

1 Answer

You can wait for any child status change by blocking SIGCHLD and polling a signalfd(). When it becomes readable, do a dummy read to clear the pending signal, then get the actual status with waitpid():

#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/signalfd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    // Block SIGCHLD so it stays pending and is reported only via the signalfd
    sigset_t mask, old_set;
    sigemptyset(&mask);
    sigaddset(&mask, SIGCHLD);
    sigprocmask(SIG_BLOCK, &mask, &old_set);

    int sigfd = signalfd(-1, &mask, SFD_CLOEXEC);
    if (sigfd == -1) {
        perror("signalfd");
        return 1;
    }

    for (int i = 0; i < 10; ++i) {
        pid_t pid = fork();
        if (pid == -1) {
            perror("fork");
            continue;
        }
        if (pid == 0) {
            // Child process: restore blocked signals before exec() etc
            sigprocmask(SIG_SETMASK, &old_set, NULL);
            sleep(i % 3);
            switch (i % 3) {
                case 0:
                    raise(SIGSTOP);
                    break;
                case 1:
                    raise(SIGABRT);
                    break;
            }
            // _exit() avoids flushing stdio buffers inherited from the parent
            _exit(i);
        }
        printf("Spawned child %i with pid %i\n", i, (int)pid);
    }

    for (;;) {
        struct pollfd fds[] = {
            { .fd = STDIN_FILENO, .events = POLLIN },
            { .fd = sigfd,        .events = POLLIN }
        };
        if (poll(fds, sizeof(fds)/sizeof(*fds), -1) == -1) {
            if (errno == EINTR) {
                continue;
            }
            perror("poll");
            break;
        }

        if (fds[0].revents & (POLLIN | POLLHUP)) {
            char buf[4096];
            ssize_t ret = read(STDIN_FILENO, buf, sizeof(buf));
            if (ret <= 0) {
                // EOF or error on stdin: stop instead of spinning
                break;
            }
            printf("Data from stdin: ");
            fflush(stdout);
            if (write(STDOUT_FILENO, buf, (size_t)ret) == -1) {
                perror("write");
            }
        }

        if (fds[1].revents & POLLIN) {
            // Dummy read: clears the pending SIGCHLD so poll() can block again
            struct signalfd_siginfo fdsi;
            if (read(sigfd, &fdsi, sizeof(fdsi)) != (ssize_t)sizeof(fdsi)) {
                perror("read");
                break;
            }

            // One SIGCHLD may stand for several state changes: collect them all
            for (;;) {
                int status;
                pid_t pid = waitpid(-1, &status, WNOHANG | WUNTRACED | WCONTINUED);
                if (pid == -1) {
                    if (errno != ECHILD) {
                        perror("waitpid");
                    }
                    break;
                }
                if (pid == 0) {
                    break;
                }

                printf("Child %i ", (int)pid);
                if (WIFEXITED(status)) {
                    printf("exited with status %i\n", WEXITSTATUS(status));
                } else if (WIFSIGNALED(status)) {
                    printf("terminated by signal %i\n", WTERMSIG(status));
                } else if (WIFSTOPPED(status)) {
                    printf("stopped by signal %i\n", WSTOPSIG(status));
                } else if (WIFCONTINUED(status)) {
                    printf("continued\n");
                } else {
                    printf("status unknown\n");
                }
            }
        }
    }

    close(sigfd);
    return 0;
}
dimich
  • Why do you need to `waitpid` for signals? Haven't they been already consumed by the `read` from your `sigfd`? What's the need for your inner for-loop? – shapaz Jul 05 '22 at 07:26
  • Multiple standard signals with the same number aren't queued one by one but collapsed into a single bit in the mask of pending signals. Reading from a `signalfd` doesn't remove the child process record, and neither does a SIGCHLD handler call. The loop around `waitpid` is required to collect multiple state changes that happened between calls. – dimich Jul 05 '22 at 09:06
  • I re-read the signalfd man page but couldn't find a reference to this behaviour. Where is it documented? – shapaz Jul 07 '22 at 07:47
  • `man 7 signal`: "Standard signals do not queue. If multiple instances of a standard signal are generated while that signal is blocked, then only one instance of the signal is marked as pending". `signalfd()` works on top of the kernel's signal queuing mechanism. You can see the implementation in `linux/kernel/signal.c`: in `__send_signal_locked()`, `signalfd_notify()` is not called if sig < SIGRTMIN and the signal is already pending. Also see this topic: https://stackoverflow.com/questions/58772075/linux-does-not-implement-posix-signal-queuing – dimich Jul 07 '22 at 08:19
  • Thank you for this amazing answer. This is 5 separate questions answered in one. – hraban Sep 08 '22 at 18:50