247

Suppose I have a process that spawns exactly one child process. When the parent process exits for whatever reason (normally or abnormally, by kill, ^C, assert failure or anything else), I want the child process to die. How can I do that correctly?


Some similar question on stackoverflow:


Some similar question on stackoverflow for Windows:

Paweł Hajdan

24 Answers

206

The child can ask the kernel to deliver SIGHUP (or another signal) when its parent dies by specifying the PR_SET_PDEATHSIG option in the prctl() syscall, like this:

prctl(PR_SET_PDEATHSIG, SIGHUP);

See man 2 prctl for details.

Edit: This is Linux-only
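As a rough, Linux-only sketch of how the setting can be armed and inspected: the helper names below are hypothetical, and `PR_GET_PDEATHSIG` is used only to read the value back. The last helper forks once to show that a new child starts with the setting cleared, as the prctl man page states.

```c
#include <signal.h>
#include <sys/prctl.h>   /* Linux-only */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Ask the kernel to send `sig` to the calling process when its parent dies.
   Returns 0 on success, -1 on error (errno is set by prctl). */
int arm_parent_death_signal(int sig)
{
    return prctl(PR_SET_PDEATHSIG, (unsigned long)sig);
}

/* Read the current setting back: the signal number, 0 if unset, -1 on error. */
int current_parent_death_signal(void)
{
    int sig = 0;
    if (prctl(PR_GET_PDEATHSIG, &sig) == -1)
        return -1;
    return sig;
}

/* Fork a child and report the setting that the child sees.
   Returns the child's value (0 means "cleared"), or -1 on error. */
int parent_death_signal_seen_by_new_child(void)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        int seen = current_parent_death_signal();
        _exit(seen < 0 ? 255 : seen);   /* report through the exit status */
    }
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    int code = WEXITSTATUS(status);
    return code == 255 ? -1 : code;
}
```

Because the setting is not inherited across fork(), each child has to call the arming helper itself.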

qrdl
  • Yeah, Linux-specific. I don't think there is POSIX way to do it. – qrdl Nov 12 '08 at 16:19
  • @qrdl What is the FreeBSD equivalent ? – Good Person May 20 '14 at 06:13
  • @GoodPerson I have no idea, never tried any BSD systems – qrdl May 20 '14 at 09:54
  • 10
    This is a poor solution because the parent might have died already. Race condition. Correct solution: http://stackoverflow.com/a/17589555/412080 – Maxim Egorushkin Dec 22 '15 at 17:49
  • 28
    Calling an answer poor isn't very nice - even if it doesn't address a race condition. See [my answer](http://stackoverflow.com/a/36945270/427158) on how to use `prctl()` in a race-condition free way. Btw, the answer linked by Maxim is incorrect. – maxschlepzig Apr 29 '16 at 18:36
  • 6
    This is just a wrong answer. It will send the signal to the child process at the time when the thread which called fork dies, not when the parent process dies. – Lothar Dec 10 '16 at 01:01
  • 2
    @Lothar It would be nice to see some kind of proof. `man prctl` says: Set the parent process death signal of the calling process to arg2 (either a signal value in the range 1..maxsig, or 0 to clear). This is the signal that the calling process will get when its parent dies. This value is cleared for the child of a fork(2) and (since Linux 2.4.36 / 2.6.23) when executing a set-user-ID or set-group-ID binary. – qrdl Dec 10 '16 at 19:46
  • @maxschlepzig I can't find the answer from the link – rox Apr 26 '17 at 09:55
  • @rox - I don't follow. Doesn't work [the link to my answer](http://stackoverflow.com/a/36945270) for you? – maxschlepzig Apr 26 '17 at 19:40
  • 1
    @maxschlepzig Thanks for the new link. Seems like the previous link is invalid. By the way, after years, there's still no api for setting options on the parent side. What a pity. – rox Apr 27 '17 at 03:47
  • 1
    This signals the child if the parent *thread* dies. If you spawned the child from a temporary thread then it might die prematurely. I found this out the hard way. I think the safest way, assuming you wrote the child code is to poll the parent pid and if it's not there then quit. – locka Aug 17 '20 at 20:04
  • Is there a way to set PDEATHSIG when starting children via posix_spawn() ? – patraulea Jan 14 '21 at 18:20
  • @patraulea It is responsibility of the child to request the signal to be delivered, regardless of how this child was created - using `fork` or `posix_spawn` – qrdl Jan 14 '21 at 18:50
74

I'm trying to solve the same problem, and since my program must run on OS X, the Linux-only solution didn't work for me.

I came to the same conclusion as the other people on this page -- there isn't a POSIX-compatible way of notifying a child when a parent dies. So I kludged up the next-best thing -- having the child poll.

When a parent process dies (for any reason) the child's parent process becomes process 1. If the child simply polls periodically, it can check if its parent is 1. If it is, the child should exit.

This isn't great, but it works, and it's easier than the TCP socket/lockfile polling solutions suggested elsewhere on this page.
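A minimal sketch of the polling idea, under one assumption that differs from the text above: instead of comparing against 1, it compares against the parent's PID saved before fork(), because the adopting process is not PID 1 on every system. The function names are hypothetical.

```c
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* True once the calling process has been re-parented, i.e. its parent is
   no longer the process whose PID was saved before fork(). */
int parent_is_gone(pid_t original_parent)
{
    return getppid() != original_parent;
}

/* Block until the original parent has exited, checking every interval_ms
   milliseconds. */
void wait_until_orphaned(pid_t original_parent, long interval_ms)
{
    struct timespec pause_for = {
        interval_ms / 1000,               /* whole seconds */
        (interval_ms % 1000) * 1000000L   /* remainder in nanoseconds */
    };
    while (!parent_is_gone(original_parent))
        nanosleep(&pause_for, NULL);
}
```

The parent would call getpid() before fork() and the child would pass that value in, for example from a dedicated thread or between units of work, and exit once the wait returns.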

Schof
  • 8
    Excellent solution. Continuously invoking getppid() until it returns 1 and then exiting. This is good and I now use it too. A non-polling solution would be nice though. Thank you Schof. – neoneye Apr 11 '10 at 11:58
  • 12
    Just for info, on Solaris if you're in a zone, `getppid()` does not become 1 but returns the `pid` of the zone scheduler (process `zsched`). – Patrick Schlüter Oct 14 '10 at 13:32
  • 4
    If anyone is wondering, in Android systems pid seems to be 0 (process System pid) instead of 1, when parent dies. – Rui Marques Oct 02 '12 at 17:38
  • 1
    Beautiful....i made question before finding this answer, and if you want to post it please do so I will give you check. http://stackoverflow.com/q/15275983/960086 – Juan Mar 07 '13 at 17:08
  • On Linux the parent PID remains the same when the parent is killed. – Shalom Crown Oct 18 '16 at 04:42
  • @neoneye, what's the point of having a process continuously invoke `getppid()` just to `exit()` once it gets a different result? Shouldn't it be possible to just call it when necessary? Wouldn't it be better to call `exit()` in the child process immediately, rather than crowd the system with processes that only wait for something so that they can call `exit()`? – Luis Colorado Mar 08 '17 at 08:25
  • 4
    To have a more robust and platform independent way of doing it, before fork()-ing, simply getpid() and if getppid() from child is different, exit. – Sebastien Aug 10 '17 at 13:42
  • 3
    This doesn't work if you don't control the child process. For instance, I'm working on a command that wraps find(1), and I want to make sure the find is killed if the wrapper dies for some reason. – Lucretiel Jun 19 '18 at 02:03
  • But is polling really required? The parent can keep track of the child process IDs that fork returns at process creation. The parent will exit only on graceful completion of the task or because of a signal. In both cases a signal can be sent to the child process, and the child can terminate gracefully on catching it. This ensures that the response is immediate. With polling, CPU time is wasted and the child might stay alive longer than the parent. Further, if you use sleep to poll, sleep is not exactly accurate. – Darshan b Mar 05 '19 at 09:46
  • @Darshanb, if you read the question carefully, there is such sentence: _normally or abnormally, by kill, ^C, assert failure or anything else_. Quite obviously, you can't guarantee graceful termination in such circumstances. – Ternvein Feb 09 '21 at 11:42
  • With all the container and virtualisation methods that exist or might come in the future, it's not safe to use this implementation hack from old Unix systems. – Lothar Feb 09 '21 at 13:20
39

Under Linux, you can install a parent death signal in the child, e.g.:

#include <sys/prctl.h> // prctl(), PR_SET_PDEATHSIG
#include <signal.h> // signals
#include <stdlib.h> // exit()
#include <unistd.h> // fork(), getpid(), getppid()
#include <stdio.h>  // perror()

// ...

pid_t ppid_before_fork = getpid();
pid_t pid = fork();
if (pid == -1) { perror(0); exit(1); }
if (pid) {
    ; // continue parent execution
} else {
    int r = prctl(PR_SET_PDEATHSIG, SIGTERM);
    if (r == -1) { perror(0); exit(1); }
    // test in case the original parent exited just
    // before the prctl() call
    if (getppid() != ppid_before_fork)
        exit(1);
    // continue child execution ...
}

Note that storing the parent process id before the fork and testing it in the child after prctl() eliminates a race condition between prctl() and the exit of the process that called the child.

Also note that the parent death signal of the child is cleared in newly created children of its own. It is not affected by an execve().

That test can be simplified if we are certain that the system process who is in charge of adopting all orphans has PID 1:

pid_t pid = fork();
if (pid == -1) { perror(0); exit(1); }
if (pid) {
    ; // continue parent execution
} else {
    int r = prctl(PR_SET_PDEATHSIG, SIGTERM);
    if (r == -1) { perror(0); exit(1); }
    // test in case the original parent exited just
    // before the prctl() call
    if (getppid() == 1)
        exit(1);
    // continue child execution ...
}

Relying on that system process being init and having PID 1 isn't portable, though. POSIX.1-2008 specifies:

The parent process ID of all of the existing child processes and zombie processes of the calling process shall be set to the process ID of an implementation-defined system process. That is, these processes shall be inherited by a special system process.

Traditionally, the system process adopting all orphans is PID 1, i.e. init - which is the ancestor of all processes.

On modern systems like Linux or FreeBSD another process might have that role. For example, on Linux, a process can call prctl(PR_SET_CHILD_SUBREAPER, 1) to establish itself as system process that inherits all orphans of any of its descendants (cf. an example on Fedora 25).
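To illustrate why a `getppid() == 1` test can be misleading, here is a Linux-only sketch in which the calling process marks itself as a subreaper and then observes who adopts an orphaned grandchild. The function name is hypothetical, and the sketch assumes the caller has no other children, since it reaps everything at the end.

```c
#include <sys/prctl.h>   /* Linux-only */
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Returns the PID that adopts an orphaned grandchild after the calling
   process has marked itself as a child subreaper, or -1 on error. */
pid_t adopter_of_orphan(void)
{
    int fds[2];
    if (prctl(PR_SET_CHILD_SUBREAPER, 1UL) == -1 || pipe(fds) == -1)
        return -1;

    pid_t middle = fork();
    if (middle == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (middle == 0) {
        pid_t middle_pid = getpid();
        close(fds[0]);
        if (fork() == 0) {
            /* Grandchild: wait until the middle process is gone. */
            struct timespec tick = { 0, 1000000L };   /* 1 ms */
            while (getppid() == middle_pid)
                nanosleep(&tick, NULL);
            pid_t adopter = getppid();
            if (write(fds[1], &adopter, sizeof adopter) != (ssize_t)sizeof adopter)
                _exit(1);
            _exit(0);
        }
        _exit(0);   /* the middle process exits and orphans the grandchild */
    }

    close(fds[1]);
    pid_t adopter = -1;
    if (read(fds[0], &adopter, sizeof adopter) != (ssize_t)sizeof adopter)
        adopter = -1;
    close(fds[0]);

    /* Reap the middle process and the adopted grandchild. */
    while (wait(NULL) > 0)
        ;
    return adopter;
}
```

In this arrangement the grandchild's new parent should be the subreaper itself rather than PID 1, which is why comparing against the PID saved before fork() is the more general check.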

maxschlepzig
  • I don't understand "That test can be simplified if we are certain that the grandparent is always the init process ". When a parent process dies, a process becomes a child of the init process (pid 1), not a child of the grandparent, right? So the test always seems to be correct. – Johannes Schaub - litb Nov 26 '16 at 23:53
  • 1
    @JohannesSchaub-litb, it doesn't have to be PID 1 - POSIX specifies: [The parent process ID of all of the existing child processes and zombie processes of the calling process shall be set to the process ID of an implementation-defined system process. That is, these processes shall be inherited by a special system process.](http://pubs.opengroup.org/onlinepubs/9699919799/functions/_Exit.html#tag_16_01_03_01) For example, when running on a Fedora 25 system in a Gnome terminal, the special system process has PID != 1: https://gist.github.com/gsauthof/8c8406748e536887c45ec14b2e476cbc – maxschlepzig Nov 27 '16 at 09:44
  • interesting, thanks. although, I fail to see what it has to do with the grandparent. – Johannes Schaub - litb Nov 28 '16 at 07:37
  • @JohannesSchaub-litb, the grandparent sentence was sloppy - probably wanted to start with an example at that time. I updated my answer to be more generic and more explicit about the general re-parenting logic. – maxschlepzig Nov 28 '16 at 18:54
  • 1
    @JohannesSchaub-litb, you cannot always assume that the grandparent of a process will be the `init(8)` process. The only thing you can assume is that when a parent process dies, the child's parent ID will change. This happens only once in the life of a process: when the process's parent dies. There's only one main exception to this, for children of `init(8)`, but you are protected from it, as `init(8)` never calls `exit(2)` (the kernel panics in that case). – Luis Colorado Mar 08 '17 at 08:31
  • 2
    Unfortunately, if a child is forked from a thread, and then the thread exits, the child process will get the SIGTERM. – rox May 04 '17 at 13:40
  • In the commented line, `// continue child execution`, even if do `execve` to execute a bash script, then also it will work(the same piece of code in the example)? – y_159 Jan 02 '21 at 16:40
  • @y_159 yes, I've covered `execve` in my answer. And bash isn't special here. – maxschlepzig Jan 02 '21 at 20:50
  • @maxschlepzig when that signal is delivered to the child when it's parent dies, the child will automatically die too? – y_159 Jan 03 '21 at 10:56
  • @y_159 depends on the signal you specify with the `prctl()` call. With `SIGTERM` (which I used in the example) the default action is to terminate. But a process could modify that action, e.g. it could even ignore it. You may also use `SIGKILL` there which can't be caught/ignored. – maxschlepzig Jan 03 '21 at 11:08
  • ok, thanks, I've used `SIGTERM` only as mentioned in the example, in the same way, I've not written explicitly in the child's code(a bash script in my case) to ignore any signal, so by default, it should terminate only? – y_159 Jan 03 '21 at 11:17
  • 1
    @y_159 yes, it should. – maxschlepzig Jan 03 '21 at 15:28
33

I have achieved this in the past by running the "original" code in the "child" and the "spawned" code in the "parent" (that is: you reverse the usual sense of the test after fork()). Then trap SIGCHLD in the "spawned" code...

May not be possible in your case, but cute when it works.

dmckee --- ex-moderator kitten
  • 2
    The huge problem with doing the work in the parent is that you are changing the parent process. In case of a server that has to run "forever", that's not an option. – Alexis Wilke Mar 24 '15 at 04:34
32

If you're unable to modify the child process, you can try something like the following:

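int pipes[2];
pipe(pipes);
if (fork() == 0) {
    close(pipes[1]); /* Close the writer end in the child */
    dup2(pipes[0], STDIN_FILENO); /* Use reader end as stdin (fixed per maxschlepzig) */
    close(pipes[0]); /* stdin now refers to the pipe; the original descriptor is no longer needed */
    /* Run the shell that supervises child_process */
    execl("/bin/sh", "sh", "-c",
          "set -o monitor; child_process & read dummy; kill %1", (char *)NULL);
    _exit(127); /* Only reached if execl() fails */
}

close(pipes[0]); /* Close the reader end in the parent */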

This runs the child from within a shell process with job control enabled. The child process is spawned in the background. The shell waits for a newline (or an EOF) then kills the child.

When the parent dies--no matter what the reason--it will close its end of the pipe. The child shell will get an EOF from the read and proceed to kill the backgrounded child process.

Phil Rutschman
  • 3
    Nice, but five system calls and an sh spawned in ten lines of code leave me a bit sceptical about the performance of this piece of code. – Oleiade Jan 17 '14 at 09:53
  • 1
    +1. You can avoid the `dup2` and taking over stdin by using the `read -u` flag to read from a specific file descriptor. I also added a `setpgid(0, 0)` in the child to prevent it from exiting when pressing ^C in the terminal. – Greg Hewgill May 01 '14 at 00:09
  • 1
    The argument order of the `dup2()` call is wrong. If you want to use `pipes[0]` as stdin you have to write `dup2(pipes[0], 0)` instead of `dup2(0, pipes[0])`. It is`dup2(oldfd, newfd)` where the call closes a previously open newfd. – maxschlepzig Apr 30 '16 at 08:23
  • @Oleiade, I agree, especially since the spawned sh does just another fork to execute the real child process ... – maxschlepzig Apr 30 '16 at 08:26
  • After the call to `dup2()`, you should close `pipes[0]` too. – Jonathan Leffler Sep 28 '22 at 16:34
14

Inspired by another answer here, I came up with the following all-POSIX solution. The general idea is to create an intermediate process between the parent and the child, that has one purpose: Notice when the parent dies, and explicitly kill the child.

This type of solution is useful when the code in the child can't be modified.

int p[2];
pipe(p);
pid_t child = fork();
if (child == 0) {
    close(p[1]); // close write end of pipe
    setpgid(0, 0); // prevent ^C in parent from stopping this process
    child = fork();
    if (child == 0) {
        close(p[0]); // close read end of pipe (don't need it here)
        exec(...child process here...);
        exit(1);
    }
    char c;
    read(p[0], &c, 1); // returns when parent exits for any reason
    kill(child, SIGKILL);
    exit(1);
}
close(p[0]); // parent keeps only the write end, as in the Python version below

There are two small caveats with this method:

  • If you deliberately kill the intermediate process, then the child won't be killed when the parent dies.
  • If the child exits before the parent, then the intermediate process will try to kill the original child pid, which could now refer to a different process. (This could be fixed with more code in the intermediate process.)

As an aside, the actual code I'm using is in Python. Here it is for completeness:

import os

def run(*args):
    (r, w) = os.pipe()
    child = os.fork()
    if child == 0:
        os.close(w)
        os.setpgid(0, 0)
        child = os.fork()
        if child == 0:
            os.close(r)
            os.execl(args[0], *args)
            os._exit(1)
        os.read(r, 1)
        os.kill(child, 9)
        os._exit(1)
    os.close(r)
Greg Hewgill
  • Note that a while back, under IRIX, I used a parent/child scheme where I had a pipe between both and reading from the pipe generated a SIGHUP if either one died. That was the way I used to kill my fork()'ed children, without the need for an intermediate process. – Alexis Wilke Mar 24 '15 at 04:11
  • 3
    I think your second caveat is wrong. The pid of a child is a resource belonging to its parent and it can't be freed/reused until the parent (the intermediate process) waits on it (or terminates and lets init wait on it). – R.. GitHub STOP HELPING ICE Feb 28 '17 at 00:28
14

For completeness sake. On macOS you can use kqueue:

void noteProcDeath(
    CFFileDescriptorRef fdref, 
    CFOptionFlags callBackTypes, 
    void* info) 
{
    // LOG_DEBUG(@"noteProcDeath... ");

    struct kevent kev;
    int fd = CFFileDescriptorGetNativeDescriptor(fdref);
    kevent(fd, NULL, 0, &kev, 1, NULL);
    // take action on death of process here
    unsigned int dead_pid = (unsigned int)kev.ident;

    CFFileDescriptorInvalidate(fdref);
    CFRelease(fdref); // the CFFileDescriptorRef is no longer of any use in this example

    int our_pid = getpid();
    // when our parent dies we die as well.. 
    LOG_INFO(@"exit! parent process (pid %u) died. no need for us (pid %i) to stick around", dead_pid, our_pid);
    exit(EXIT_SUCCESS);
}


void suicide_if_we_become_a_zombie(int parent_pid) {
    // int parent_pid = getppid();
    // int our_pid = getpid();
    // LOG_ERROR(@"suicide_if_we_become_a_zombie(). parent process (pid %u) that we monitor. our pid %i", parent_pid, our_pid);

    int fd = kqueue();
    struct kevent kev;
    EV_SET(&kev, parent_pid, EVFILT_PROC, EV_ADD|EV_ENABLE, NOTE_EXIT, 0, NULL);
    kevent(fd, &kev, 1, NULL, 0, NULL);
    CFFileDescriptorRef fdref = CFFileDescriptorCreate(kCFAllocatorDefault, fd, true, noteProcDeath, NULL);
    CFFileDescriptorEnableCallBacks(fdref, kCFFileDescriptorReadCallBack);
    CFRunLoopSourceRef source = CFFileDescriptorCreateRunLoopSource(kCFAllocatorDefault, fdref, 0);
    CFRunLoopAddSource(CFRunLoopGetMain(), source, kCFRunLoopDefaultMode);
    CFRelease(source);
}
neoneye
  • You can do this with a slightly nicer API, using dispatch sources with DISPATCH_SOURCE_PROC and PROC_EXIT. – russbishop Jul 07 '18 at 04:44
  • For whatever reason, this is causing my Mac to panic. Running a process with this code has a 50% chance or so of it freezing, causing the fans to spin at a rate I've never heard them go before (super fast), and then the mac just shuts off. **BE VERY CAREFUL WITH THIS CODE**. – Qix - MONICA WAS MISTREATED May 12 '19 at 17:08
  • It seems like on my macOS, the child process exits automatically after parent exits. I don't know why. – Yi Lin Liu Jun 28 '19 at 02:10
  • @YiLinLiu iirc I used `NSTask` or posix spawn. See the `startTask` function in my code here: https://github.com/neoneye/newton-commander-browse/blob/master/Classes/NCWorkerThread.m – neoneye Jun 28 '19 at 08:24
  • @russbishop - I tried your suggestion of using a dispatch source, but it did not work for me. Here's a gist with the code I tried: https://gist.github.com/jdv85/5a67ae81247f21433044b0ffea404693 The event handler block does not run. Using `kqueue` as in the answer from @neoneye works fine. – Jonas Due Vesterheden May 27 '21 at 11:07
  • @JonasDueVesterheden If you check the libdispatch sources it is using a kqueue under the covers so it should be identical. You don't appear to be retaining the source anywhere so it would get released and cancelled when the function exits. If you capture the source in the handler block it would resolve that problem. – russbishop Jun 07 '21 at 19:07
  • 1
    Isn't `CFRunLoopRun();` missing after `CFRelease(source);`? Compare https://developer.apple.com/documentation/corefoundation/cffiledescriptor-ru3#2556086, which uses `CFRunLoopRunInMode(kCFRunLoopDefaultMode, 20.0, false);`. Also, I'm not observing the panicking behavior from https://stackoverflow.com/questions/284325/how-to-make-child-process-die-after-parent-exits#comment98837774_6484903 – timotheecour Jul 29 '21 at 19:25
12

Does the child process have a pipe to/from the parent process? If so, you'd receive a SIGPIPE if writing, or get EOF when reading - these conditions could be detected.
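For the writing case, a small sketch: rather than handling SIGPIPE, this variant ignores it so that write() fails with EPIPE, which is easier to test for. The function name is hypothetical, and the probe byte is an assumption: it is only acceptable if the reader tolerates stray data on the pipe.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Probe the pipe by writing one byte.
   Returns 1 if the reading side has gone away, 0 if the write succeeded,
   and -1 on any other error. May block if the pipe is full. */
int reader_is_gone(int write_fd)
{
    /* With SIGPIPE ignored, write() reports EPIPE instead of killing us. */
    signal(SIGPIPE, SIG_IGN);

    char probe = 0;
    ssize_t n;
    do {
        n = write(write_fd, &probe, 1);
    } while (n == -1 && errno == EINTR);

    if (n >= 0)
        return 0;
    return errno == EPIPE ? 1 : -1;
}
```

This only works if no other process still holds the read end open, so descriptors inherited across fork() need to be closed wherever they are not used.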

MarkR
  • 1
    I found this didn't happen reliably, at least on OS X. – Schof Jan 10 '10 at 01:31
  • point of caution: systemd disables SIGPIPE's by default in services it manages, but you can still check for the pipe closure. See https://www.freedesktop.org/software/systemd/man/systemd.exec.html under IgnoreSIGPIPE – jdizzle Jun 02 '20 at 13:44
10

I don't believe it's possible to guarantee that using only standard POSIX calls. Like real life, once a child is spawned, it has a life of its own.

It is possible for the parent process to catch most possible termination events, and attempt to kill the child process at that point, but there's always some that can't be caught.

For example, no process can catch a SIGKILL. When the kernel handles this signal it will kill the specified process with no notification to that process whatsoever.

To extend the analogy - the only other standard way of doing it is for the child to commit suicide when it finds that it no longer has a parent.

There is a Linux-only way of doing it with prctl(2) - see other answers.

Alnitak
9

This solution worked for me:

  • Pass stdin pipe to child - you don't have to write any data into the stream.
  • Child reads indefinitely from stdin until EOF. An EOF signals that the parent has gone.
  • This is a foolproof and portable way to detect when the parent has gone. Even if the parent crashes, the OS will close the pipe.

This was for a worker-type process whose existence only made sense when the parent was alive.
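The child's side of this could be sketched as follows; the function name is hypothetical, and any data that does arrive is simply discarded.

```c
#include <errno.h>
#include <unistd.h>

/* Block until `fd` reports end-of-file, discarding anything that is read.
   Returns 0 on EOF and -1 on a read error. */
int wait_for_eof(int fd)
{
    char buf[256];
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n == 0)
            return 0;                      /* all writers closed: parent is gone */
        if (n == -1 && errno != EINTR)
            return -1;                     /* real error, give up */
    }
}
```

A worker might run wait_for_eof(STDIN_FILENO) in a dedicated thread and exit when it returns. EOF is only delivered once every copy of the write end is closed, so the child must not keep an inherited copy of it open.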

joonas.fi
  • @SebastianJylanki I don't remember if I tried, but it probably works because the primitives (POSIX streams) are fairly standard across OSs. – joonas.fi Apr 18 '18 at 12:50
6

Some posters have already mentioned pipes and kqueue. In fact you can also create a pair of connected Unix domain sockets by the socketpair() call. The socket type should be SOCK_STREAM.

Let us suppose you have the two socket file descriptors fd1, fd2. Now fork() to create the child process, which will inherit the fds. In the parent you close fd2 and in the child you close fd1. Now each process can poll() the remaining open fd on its own end for the POLLIN event. As long as each side doesn't explicitly close() its fd during normal lifetime, you can be fairly sure that a POLLHUP flag indicates the other's termination (whether clean or not). Upon being notified of this event, the child can decide what to do (e.g. to die).

#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <poll.h>
#include <stdio.h>

int main(int argc, char ** argv)
{
    int sv[2];        /* sv[0] for parent, sv[1] for child */
    socketpair(AF_UNIX, SOCK_STREAM, 0, sv);

    pid_t pid = fork();

    if ( pid > 0 ) {  /* parent */
        close(sv[1]);
        fprintf(stderr, "parent: pid = %d\n", getpid());
        sleep(100);
        exit(0);

    } else {          /* child */
        close(sv[0]);
        fprintf(stderr, "child: pid = %d\n", getpid());

        struct pollfd mon;
        mon.fd = sv[1];
        mon.events = POLLIN;

        poll(&mon, 1, -1);
        if ( mon.revents & POLLHUP )
            fprintf(stderr, "child: parent hung up\n");
        exit(0);
    }
}

You can try compiling the above proof-of-concept code, and run it in a terminal like ./a.out &. You have roughly 100 seconds to experiment with killing the parent PID by various signals, or it will simply exit. In either case, you should see the message "child: parent hung up".

Compared with the method using SIGPIPE handler, this method doesn't require trying the write() call.

This method is also symmetric, i.e. the processes can use the same channel to monitor each other's existence.

This solution calls only the POSIX functions. I tried this in Linux and FreeBSD. I think it should work on other Unixes but I haven't really tested.

See also:

  • unix(7) of Linux man pages, unix(4) for FreeBSD, poll(2), socketpair(2), socket(7) on Linux.
Cong Ma
  • Very cool, I'm really wondering if this has any reliability issues though. Have you tested this in production? With different apps? – Aktau Apr 08 '13 at 20:30
  • @Aktau, I've been using the Python equivalent of this trick in a Linux program. I needed it because the child's working logic is "to do best-effort cleanup after parent exits and then exit too". However, I'm really not sure about other platforms. The C snippet works on Linux and FreeBSD but that's all I know... Also, there are cases when you should be careful, such as the parent forking again, or parent giving up the fd before truly exiting (thus creating a time window for race condition). – Cong Ma Apr 09 '13 at 11:34
  • @Aktau - This will be completely reliable. – Omnifarious Dec 06 '17 at 18:09
5

As other people have pointed out, relying on the parent pid to become 1 when the parent exits is non-portable. Instead of waiting for a specific parent process ID, just wait for the ID to change:

pid_t pid = getpid();
switch (fork())
{
    case -1:
    {
        abort(); /* or whatever... */
    }
    default:
    {
        /* parent */
        exit(0);
    }
    case 0:
    {
        /* child */
        /* ... */
    }
}

/* Wait for parent to exit: getppid() keeps returning the saved PID
   for as long as the original parent is alive */
while (getppid() == pid)
    ;

Add a micro-sleep as desired if you don't want to poll at full speed.

This option seems simpler to me than using a pipe or relying on signals.

user2168915
  • 1
    Unfortunately, that solution is not robust. What if the parent process dies before you get the initial value? The child will never exit. – dgatwood Mar 18 '15 at 20:36
  • @dgatwood, what do you mean?!? The first `getpid()` is done in the parent before calling `fork()`. If the parent dies before that the child does not exist. What may happen is the child out living the parent for a while. – Alexis Wilke Mar 24 '15 at 04:29
  • In this somewhat contrived example, it works, but in real-world code, fork is almost invariably followed by exec, and the new process must start over by asking for its PPID. In the time between those two checks, if the parent goes away, the child would have no idea. Also, you're unlikely to have control over both the parent and child code (or else you could just pass the PPID as an argument). So as a general solution, that approach doesn't work very well. And realistically, if a UNIX-like OS came out without init being 1, so much stuff would break that I can't imagine anybody doing it anyway. – dgatwood Mar 30 '15 at 04:40
  • 1
    Pass the parent pid as a command-line argument when doing exec for the child. – Nish Feb 17 '16 at 11:25
  • 5
    Polling at full speed is insane. – maxschlepzig Apr 30 '16 at 08:34
  • Polling at full speed, or even with "micro-sleep"ing, is insane if there isn't an extremely good reason for it, and shouldn't be given in an answer as the default solution. – Remember Monica Sep 04 '22 at 04:05
5

Install a trap handler to catch SIGINT, which kills off your child process if it's still alive, though other posters are correct that it won't catch SIGKILL.

Open a .lockfile with exclusive access and have the child poll on it trying to open it - if the open succeeds, the child process should exit
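One possible reading of the lockfile idea, sketched with flock() advisory locks rather than an exclusive open. flock() is available on Linux, the BSDs and macOS but is not POSIX, and the function names and lock path are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>   /* flock(): not part of POSIX */
#include <unistd.h>

/* Parent: take an exclusive lock on `path`. Returns a descriptor that must
   stay open for the parent's whole life, or -1 on error. */
int hold_parent_lock(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd == -1)
        return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Child: returns 1 if the lock is free (the parent is gone), 0 if it is
   still held, and -1 on error. Opens its own descriptor on purpose. */
int parent_lock_released(const char *path)
{
    int fd = open(path, O_RDWR);
    if (fd == -1)
        return -1;
    int rc = flock(fd, LOCK_EX | LOCK_NB);
    int saved_errno = errno;
    close(fd);                  /* also drops the lock if we obtained it */
    if (rc == 0)
        return 1;
    return saved_errno == EWOULDBLOCK ? 0 : -1;
}
```

A descriptor inherited across fork() shares the parent's lock, so the child has to close its inherited copy; otherwise the lock outlives the parent.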

Ana Betts
  • Or, the child could open the lockfile in a separate thread, in blocking mode, in which case this could be a pretty nice and clean solution. Probably it has some portability limitations though. – Jean May 05 '17 at 14:48
3

I think a quick and dirty way is to create a pipe between child and parent. When parent exits, children will receive a SIGPIPE.

Yefei
3

Another Linux-specific way to do this is to have the parent be created in a new PID namespace. It will then be PID 1 in that namespace, and when it exits, all of its children will be immediately killed with SIGKILL.

Unfortunately, in order to create a new PID namespace you have to have CAP_SYS_ADMIN. But, this method is very effective and requires no real change to the parent or the children beyond the initial launch of the parent.

See clone(2), pid_namespaces(7), and unshare(2).

Remember Monica
Omnifarious
  • 1
    I need to edit in another way. It's possible to use prctl to make a process act as the init process for all its children, grandchildren, great-grandchildren, etc. – Omnifarious Jan 28 '20 at 07:44
  • hope you meant PR_SET_CHILD_SUBREAPER, and added this to the answer. – Remember Monica Sep 04 '22 at 04:16
  • Removed it again, I don't think PR_SET_CHILD_SUBREAPER is another way, at least, it's not documented to do so. – Remember Monica Sep 04 '22 at 04:34
  • @RememberMonica - It doesn't. PR_SET_CHILD_SUBREAPER is to make sure that a particular process can catch all child/grandchild/great-grandchild/etc... exit statuses. The pid namespace will also accomplish this of course, but that's a side-effect. – Omnifarious Sep 04 '22 at 20:38
1

In case it is relevant to anyone else, when I spawn JVM instances in forked child processes from C++, the only way I could get the JVM instances to terminate properly after the parent process completed was to do the following. Hopefully someone can provide feedback in the comments if this wasn't the best way to do this.

1) Call prctl(PR_SET_PDEATHSIG, SIGHUP) on the forked child process as suggested before launching the Java app via execv, and

2) Add a shutdown hook to the Java application that polls until its parent PID equals 1, then do a hard Runtime.getRuntime().halt(0). The polling is done by launching a separate shell that runs the ps command (See: How do I find my PID in Java or JRuby on Linux?).

EDIT 130118:

It seems that was not a robust solution. I'm still struggling a bit to understand the nuances of what's going on, but I was still sometimes getting orphan JVM processes when running these applications in screen/SSH sessions.

Instead of polling for the PPID in the Java app, I simply had the shutdown hook perform cleanup followed by a hard halt as above. Then I made sure to invoke waitpid in the C++ parent app on the spawned child process when it was time to terminate everything. This seems to be a more robust solution, as the child process ensures that it terminates, while the parent uses existing references to make sure that its children terminate. Compare this to the previous solution which had the parent process terminate whenever it pleased, and had the children try to figure out if they had been orphaned before terminating.

jasterm007
  • 1
    The `PID equals 1` wait is not valid. The new parent could be some other PID. You should check whether it changes from the original parent (getpid() before the fork()) to the new parent (getppid() in the child not equal the getpid() when called before the fork()). – Alexis Wilke Mar 24 '15 at 04:32
1

Under POSIX, the exit(), _exit() and _Exit() functions are defined to:

  • If the process is a controlling process, the SIGHUP signal shall be sent to each process in the foreground process group of the controlling terminal belonging to the calling process.

So, if you arrange for the parent process to be a controlling process for its process group, the child should get a SIGHUP signal when the parent exits. I'm not absolutely sure that happens when the parent crashes, but I think it does. Certainly, for the non-crash cases, it should work fine.

Note that you may have to read quite a lot of fine print - including the Base Definitions (Definitions) section, as well as the System Services information for exit() and setsid() and setpgrp() - to get the complete picture. (So would I!)

Jonathan Leffler
  • 3
    Hmm. The documentation is vague and contradictory on this, but it appears that the parent process must be the lead process for the session, not just the process group. The lead process for the session was always login, and getting my process to take over as lead process for a new session was beyond my abilities at the moment. – Schof Jan 10 '10 at 01:33
  • 2
    SIGHUP effectively only gets sent to child processes if the exiting process is a login shell. http://www.opengroup.org/onlinepubs/009695399/functions/exit.html "Termination of a process does not directly terminate its children. The sending of a SIGHUP signal as described below indirectly terminates children /in some circumstances/." – Rob K Jul 06 '10 at 21:01
  • 1
    @Rob: correct - that's what the quote I gave says, too: that only in some circumstances does the child process get a SIGHUP. And it is strictly an over-simplification to say that it is only a login shell that sends SIGHUP, though that is the most common case. If a process with multiple children sets itself up as the controlling process for itself and its children, then the SIGHUP will (conveniently) be sent to its children when the master dies. OTOH, processes seldom go to that much trouble - so I'm more nit-picking than raising a really significant quibble. – Jonathan Leffler Jul 06 '10 at 22:36
  • 2
    I fooled around with it for a couple of hours and couldn't get it to work. It would have nicely handled a case where I have a daemon with some children that all need to die when the parent exits. – Rob K Aug 17 '10 at 19:36
1

Historically, since UNIX v7, the process system has detected orphaned processes by checking a process's parent ID. As I say, historically, the init(8) system process is special for only one reason: it cannot die. It cannot die because the kernel algorithm that assigns a new parent process ID depends on this fact. When a process executes its exit(2) call (whether through a system call made by the process itself or through an external event such as a signal being sent to it), the kernel gives all children of that process the ID of the init process as their parent process ID. This leads to the easiest and most portable test for whether a process has become an orphan: check the result of the getppid(2) system call, and if it is the process ID of the init(8) process, then the process was orphaned before that call.

Two issues arise with this approach:

  • first, the init process can be replaced by any user program, so how can we be sure that the init process will always be the parent of all orphaned processes? Well, the exit system call code contains an explicit check to see whether the process executing the call is the init process (the process with PID 1), and if so the kernel panics (it would no longer be able to maintain the process hierarchy), so the init process is not permitted to make an exit(2) call.
  • second, there's a race condition in the basic test described above. The init process's ID is historically assumed to be 1, but POSIX does not guarantee that; it states (as mentioned in another answer) only that a system process ID is reserved for that purpose. In practice almost no POSIX implementation uses anything other than 1, so on systems derived from the original UNIX you can assume that a getppid(2) result of 1 means the process is an orphan. Another way to check is to call getppid(2) just after the fork and compare that value with the result of a later call. This does not work in all cases, because the two calls are not atomic together and the parent process can die after the fork(2) and before the first getppid(2) call. The parent process ID only changes once, when the parent makes its exit(2) call, so checking whether the getppid(2) result changed between calls should be enough to tell that the parent has exited. This test is not valid for actual children of the init process, because they are always children of init(8), but you can safely treat those processes as having no parent either (except when you replace the init process on a system).
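
The comparison test can be sketched in C as follows. This is only an illustration under one assumption: the parent's PID is captured with getpid() before fork() and passed to the child, which sidesteps the window between fork(2) and the first getppid(2). `parent_is_gone` and `exit_when_orphaned` are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Returns 1 if the calling process has been reparented, i.e. its parent
 * is no longer the process whose PID was recorded before fork(). */
static int parent_is_gone(pid_t original_parent)
{
    return getppid() != original_parent;
}

/* Hypothetical child loop: check every `interval_ms` milliseconds and
 * exit as soon as the original parent has disappeared. A real child
 * would do its work between checks instead of only sleeping. */
static void exit_when_orphaned(pid_t original_parent, long interval_ms)
{
    struct timespec pause_for;

    pause_for.tv_sec = interval_ms / 1000;
    pause_for.tv_nsec = (interval_ms % 1000) * 1000000L;

    while (!parent_is_gone(original_parent))
        nanosleep(&pause_for, NULL);
    _exit(0);
}
```

In the parent, store `getpid()` in a variable before calling fork(); in the child, pass that value to `exit_when_orphaned`. This is still polling, so the child notices the parent's death only at the next check.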
Luis Colorado
1

If you send a signal to the pid 0, using for instance

kill(0, 2); /* SIGINT */

that signal is sent to the entire process group, thus effectively killing the child.

You can test it easily with something like:

(cat && kill 0) | python

If you then press ^D, you'll see the text "Terminated" as an indication that the Python interpreter has indeed been killed, rather than having simply exited because stdin was closed.
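
The same idea can be sketched in C. The example below is only a demonstration under stated assumptions: `kill_group_demo` is a hypothetical name, and the signalling process first moves itself into a new process group with setpgid() so that the caller of the demo is not hit by the signal as well.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a "leader" that creates its own process group, forks a worker
 * into that group and then signals the group with kill(0, SIGINT).
 * Returns 1 if both leader and worker were terminated, 0 if not,
 * -1 on setup errors. */
static int kill_group_demo(void)
{
    int alive[2];               /* write end is held by leader and worker */
    int status = 0;
    char byte;
    ssize_t n;
    pid_t leader;

    if (pipe(alive) == -1)
        return -1;

    leader = fork();
    if (leader == -1)
        return -1;

    if (leader == 0) {
        close(alive[0]);
        signal(SIGINT, SIG_DFL);    /* default action: terminate */
        setpgid(0, 0);              /* own group, so the caller is spared */
        if (fork() == 0) {
            for (;;)
                pause();            /* worker: inherits the leader's group */
        }
        kill(0, SIGINT);            /* pid 0 = every process in my group */
        _exit(1);                   /* only reached if the signal did not kill us */
    }

    close(alive[1]);
    if (waitpid(leader, &status, 0) != leader)
        return -1;
    n = read(alive[0], &byte, 1);   /* returns 0 (EOF) once the worker is gone */
    close(alive[0]);

    return WIFSIGNALED(status) && WTERMSIG(status) == SIGINT && n == 0;
}
```

Note that the signal has to be sent by code that actually runs, so this covers orderly shutdown paths but not a parent that is killed with SIGKILL.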

  • 1
    `(echo -e "print(2+2)\n" & kill 0) | sh -c "python -"` happily prints 4 instead of Terminated – Kamil Szot Apr 20 '14 at 13:29
  • @KamilSzot Your example simply contains a race condition and has nothing to do with this question. – Remember Monica Sep 04 '22 at 04:39
  • @RememberMonica Why doesn't word Terminated show up in this case? To clarify I see `4` only for `(echo -e "print(2+2)\n" && kill 0) | sh -c "python -"` in Windows bash. Not with single `&` and not on Ubuntu in WSL (neither `&` nor `&&`). – Kamil Szot Sep 15 '22 at 12:27
0

I found two solutions, neither of them perfect.

1. Kill all children with kill(-pid) on receiving a SIGTERM signal.
Obviously, this solution cannot handle "kill -9", but it works in most cases and is very simple because it doesn't need to keep track of all child processes.


    var childProc = require('child_process').spawn('tail', ['-f', '/dev/null'], {stdio:'ignore'});

    var counter=0;
    setInterval(function(){
      console.log('c  '+(++counter));
    },1000);

    if (process.platform.slice(0,3) != 'win') {
      function killMeAndChildren() {
        /*
        * On Linux/Unix (including Mac OS X), kill(-pid) signals the whole
        * process group, usually the process itself and its children.
        * On Windows, a Job object has been applied to the current process and
        * its children, so all children are terminated if the current process
        * dies for any reason.
        */
        console.log('kill process group');
        process.kill(-process.pid, 'SIGKILL');
      }

      /*
      * When you use "kill pid_of_this_process", this callback will be called
      */
      process.on('SIGTERM', function(err){
        console.log('SIGTERM');
        killMeAndChildren();
      });
    }

In the same way, you can install an 'exit' handler like the one above if you call process.exit somewhere. Note: Ctrl+C and, as far as I can tell, sudden crashes are already handled by the OS, which kills the process group, so they need nothing more here.

2. Use chjj/pty.js to spawn your process with a controlling terminal attached.
When the current process is killed by any means, even kill -9, all child processes are automatically killed too (by the OS?). I guess this is because the current process holds the other side of the terminal, so when it dies the terminal is hung up and the child process gets a SIGHUP and dies.


    var pty = require('pty.js');

    //var term =
    pty.spawn('any_child_process', [/*any arguments*/], {
      name: 'xterm-color',
      cols: 80,
      rows: 30,
      cwd: process.cwd(),
      env: process.env
    });
    /*optionally you can install data handler
    term.on('data', function(data) {
      process.stdout.write(data);
    });
    term.write(.....);
    */

Alexis Wilke
osexp2000
0

I managed to do a portable, non-polling solution with 3 processes by abusing terminal control and sessions.

The trick is:

  • process A is started
  • process A creates a pipe P (and never reads from it)
  • process A forks into process B
  • process B creates a new session
  • process B allocates a virtual terminal for that new session
  • process B installs SIGCHLD handler to die when the child exits
  • process B sets a SIGPIPE handler
  • process B forks into process C
  • process C does whatever it needs (e.g. exec()s the unmodified binary or runs whatever logic)
  • process B writes to pipe P (and blocks that way)
  • process A wait()s on process B and exits when it dies

That way:

  • if process A dies: process B gets a SIGPIPE and dies
  • if process B dies: process A's wait() returns and A dies; process C gets a SIGHUP (because when the session leader of a session with a terminal attached dies, all processes in the foreground process group get a SIGHUP)
  • if process C dies: process B gets a SIGCHLD and dies, so process A dies

Shortcomings:

  • process C can't handle SIGHUP
  • process C will be run in a different session
  • process C can't use session/process group API because it'll break the brittle setup
  • creating a terminal for every such operation is not the best idea ever
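
The link between processes A and B can be sketched in C on its own, leaving out the session and terminal setup. Two things here are assumptions rather than the answer's exact method: B ignores SIGPIPE and watches for EPIPE instead of dying in a handler, and the `report` pipe exists only so the demo can observe the outcome. `wait_for_reader_death` and `pipe_link_demo` are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Process B: keep writing to a pipe whose only reader is process A.
 * The writes block once the pipe is full and fail with EPIPE as soon
 * as A (the last reader) is gone; the function returns at that point. */
static void wait_for_reader_death(int write_fd)
{
    char junk[4096] = {0};

    signal(SIGPIPE, SIG_IGN);           /* get EPIPE instead of being killed */
    for (;;) {
        if (write(write_fd, junk, sizeof junk) == -1 && errno != EINTR)
            return;
    }
}

/* Demo: A creates pipe P, forks B and then exits without ever reading.
 * Returns 1 if B noticed A's death, 0 if not, -1 on setup errors. */
static int pipe_link_demo(void)
{
    int report[2];
    char byte = 0;
    ssize_t n;
    pid_t a;

    if (pipe(report) == -1)
        return -1;
    a = fork();
    if (a == -1)
        return -1;

    if (a == 0) {                       /* process A */
        int p[2];

        close(report[0]);
        if (pipe(p) == -1)
            _exit(1);
        if (fork() == 0) {              /* process B */
            close(p[0]);                /* A must be the only reader of P */
            wait_for_reader_death(p[1]);
            if (write(report[1], "x", 1) != 1)
                _exit(1);
            _exit(0);
        }
        _exit(0);                       /* A dies: P loses its last reader */
    }

    close(report[1]);
    n = read(report[0], &byte, 1);      /* 'x' arrives once B has reacted */
    close(report[0]);
    waitpid(a, NULL, 0);
    return n == 1 && byte == 'x';
}
```

The important detail is that B closes its own copy of the read end; if any other process keeps that descriptor open, the write never fails and B never learns that A is gone.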
Joundill
0

Even though 7 years have passed, I've just run into this issue: I'm running a Spring Boot application that needs to start webpack-dev-server during development and kill it when the backend process stops.

I tried to use Runtime.getRuntime().addShutdownHook, but it worked on Windows 10 and not on Windows 7.

I've changed it to use a dedicated thread that waits for the process to quit or for an InterruptedException, which seems to work correctly on both Windows versions.

private void startWebpackDevServer() {
    String cmd = isWindows() ? "cmd /c gradlew webPackStart" : "gradlew webPackStart";
    logger.info("webpack dev-server " + cmd);

    Thread thread = new Thread(() -> {

        ProcessBuilder pb = new ProcessBuilder(cmd.split(" "));
        pb.redirectOutput(ProcessBuilder.Redirect.INHERIT);
        pb.redirectError(ProcessBuilder.Redirect.INHERIT);
        pb.directory(new File("."));

        Process process = null;
        try {
            // Start the node process
            process = pb.start();

            // Wait for the node process to quit (blocking)
            process.waitFor();

            // Ensure the node process is killed
            process.destroyForcibly();
            System.setProperty(WEBPACK_SERVER_PROPERTY, "true");
        } catch (InterruptedException | IOException e) {
            // Ensure the node process is killed.
            // InterruptedException is thrown when the main process exits.
            logger.info("killing webpack dev-server", e);
            if (process != null) {
                process.destroyForcibly();
            }
        }

    });

    thread.start();
}
Ido Ran
0

I passed the parent's PID to the child through the environment, then periodically checked from the child whether /proc/$ppid still exists.
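
A rough C sketch of the child side, assuming a Linux-style /proc and an environment variable that the parent sets before spawning the child. The variable name `PARENT_PID` used in the usage note and the helper names `parent_pid_from_env` and `proc_entry_exists` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Reads a PID from the environment variable `name`.
 * Returns -1 if the variable is unset or does not hold a positive number. */
static pid_t parent_pid_from_env(const char *name)
{
    const char *value = getenv(name);
    char *end = NULL;
    long pid;

    if (value == NULL || *value == '\0')
        return -1;
    pid = strtol(value, &end, 10);
    if (*end != '\0' || pid <= 0)
        return -1;
    return (pid_t)pid;
}

/* Returns 1 if /proc/<pid> exists, 0 otherwise (assumes procfs is mounted). */
static int proc_entry_exists(pid_t pid)
{
    char path[64];
    struct stat st;

    snprintf(path, sizeof path, "/proc/%ld", (long)pid);
    return stat(path, &st) == 0;
}
```

The child would call `parent_pid_from_env("PARENT_PID")` once at startup and then check `proc_entry_exists()` at whatever interval suits it, exiting when the entry disappears. One caveat: if the PID is recycled for an unrelated process, the entry exists again and the check is fooled.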

-1

If the parent dies, the PPID of its orphans changes to 1, so you only need to check your own PPID. In a way, this is the polling mentioned above. Here is a shell snippet for that:

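check_parent () {
      # Look up this script's current parent PID (column 3 of `ps -f`)
      parent=$(ps -f | awk -v pid="$PID" '$2 == pid { print $3 }')
      echo "parent:$parent"
      # A parent PID of 1 means we have been adopted by init
      if [[ $parent -eq 1 ]]; then
        echo "parent is dead, exiting"
        exit
      fi
}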


PID=$$
cnt=0
while [[ 1 = 1 ]]; do
  check_parent
  ... something
done
mymedia
alex K