I am not sure if this question is specific to Perl, but it is the language I am using. Say I launch a background process to save a web page to a local file like this:

system("curl http://google.com > output_file.html &");

I know this will launch a background process, though I'm not totally sure of the details (for example does it get its own PID?). But what's particularly important to me is, what happens if the process that launched it terminates before curl finishes downloading? Will curl be allowed to continue, or will it terminate too?

Is there any reason the solution wouldn't be to prepend the above command with nohup (nohup curl ...)? See http://linux.101hacks.com/unix/nohup-command/

Stephen
  • According to this doc: https://perldoc.perl.org/functions/fork.html, you may end up with zombies if you don't wait for the children. So the parent zombie will wait :). – Mark Mucha Nov 02 '17 at 03:05
  • @MarkMucha Not if the parent exits first, since then `init` re-parents the child. – zdim Nov 03 '17 at 18:25

1 Answer

Yes, your backgrounded process should complete even if the script exits first.

The system function forks, which means that at that point a new, independent process is created as a near clone of the parent. That process is then replaced by the command to run, or by a shell that will run the command. Meanwhile system, in the parent, waits for that child process to complete.

The & in the command is a shell metacharacter (as is the >), so what system runs is a shell, which then executes the command. The shell itself forks a process (a subshell) in which the command is executed, and returns right away without waiting for it.

At that point system's job is done and it returns control to the script.
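The sequence above can be sketched in Perl itself. This is only a rough emulation under stated assumptions, not how Perl implements system internally: my_system is a hypothetical helper name, and the shell is assumed to live at /bin/sh.

```perl
use strict;
use warnings;

# Hypothetical helper: a rough emulation of system($cmd) for a command
# that needs a shell. Assumes a Unix-like system with /bin/sh.
sub my_system {
    my ($cmd) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: turn into a shell, which parses and runs the command.
        exec '/bin/sh', '-c', $cmd or die "exec failed: $!";
    }

    # Parent: block until the shell exits. A job that the shell put in
    # the background with & is not waited for.
    waitpid($pid, 0);
    return $?;
}

my $start  = time;
my $status = my_system('sleep 2 &');
printf "shell exited with status %d after about %d second(s)\n",
    $status >> 8, time - $start;
```

The call returns as soon as the shell is done, well before the two seconds of the backgrounded sleep are up, which mirrors what happens with the curl command in the question.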

The fate of the process forked by the shell has nothing more to do with the shell, or with your script, and the process will run to its completion. The parent may well exit right away. See this with

use warnings;
use strict;
use feature 'say';

system("(sleep 5; echo hi) &");

say "Parent exiting.";

or, from a terminal

perl -wE'system("(sleep 3; echo me)&"); say "done"'

Once in the shell, the () starts a sub-shell, used here to put multiple commands in the background for this example (and representing your command). Keep that in mind when tracking process IDs via the bash internal variables $BASHPID, $$, and $PPID (here $$ differs from $BASHPID):

perl -wE'say $$; system("
    ( sleep 30; echo \$BASHPID; echo \$PPID; echo \$\$ ) &
"); say "done"'

Then view processes while this sleeps (by ps aux on my system, with | tail -n 10).

Most of the time the PID of a system-run command will be two greater than that of the script, as there is a shell between them (for a backgrounded process as well, on my system). In the example above it should be greater by three, because of the additional () subshell holding multiple commands.

This assumes that /bin/sh, which system uses, is in fact bash (or a link to it), since $BASHPID is specific to bash.

Note: when the parent exits first the child is re-parented by init and all is well (no zombies).
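Re-parenting can be observed with a small experiment. The following is a sketch only: orphan_report is a hypothetical helper, and a "middle" process stands in for the exiting parent so that the main script stays alive to print the result. The new parent is often PID 1, but some systems hand orphans to another designated process instead, so the code only checks that the parent changed.

```perl
use strict;
use warnings;

$| = 1;    # flush STDOUT at once so forked children inherit no buffered output

# Hypothetical helper: fork a "middle" process that forks a grandchild and
# exits immediately. The grandchild reports who its parent is afterwards.
sub orphan_report {
    pipe(my $reader, my $writer) or die "pipe failed: $!";

    my $middle = fork();
    die "fork failed: $!" unless defined $middle;

    if ($middle == 0) {
        close $reader;
        my $parent = $$;                 # PID of the middle process
        my $grandchild = fork();
        die "fork failed: $!" unless defined $grandchild;
        exit 0 if $grandchild;           # middle process exits right away

        # Grandchild: wait until the middle process is gone, then report.
        sleep 1 while getppid() == $parent;
        print {$writer} getppid(), "\n";
        close $writer;
        exit 0;
    }

    close $writer;
    waitpid($middle, 0);                 # reap the middle process
    my $new_parent = <$reader>;
    close $reader;
    chomp $new_parent;
    return ($middle, $new_parent);
}

my ($middle, $new_parent) = orphan_report();
print "original parent: $middle, parent after it exited: $new_parent\n";
```

The grandchild keeps running after its original parent is gone, and nothing is left behind as a zombie because its new parent reaps it when it finishes.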


  From system

Does exactly the same thing as exec, except that a fork is done first and the parent process waits for the child process to exit. Note that argument processing varies depending on the number of arguments. If there is more than one argument in LIST, or if LIST is an array with more than one value, starts the program given by the first element of the list with arguments given by the rest of the list. If there is only one scalar argument, the argument is checked for shell metacharacters, and if there are any, the entire argument is passed to the system's command shell for parsing (this is /bin/sh -c on Unix platforms, but varies on other platforms). If there are no shell metacharacters in the argument, it is split into words and passed directly to execvp, which is more efficient. ...

The "... starts the program given by the first element ..." also means by execvp, see exec.
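The difference between the two calling forms described in the quote can be seen with a short example. It is a sketch that assumes a Unix-like system with an external echo program on the PATH.

```perl
use strict;
use warnings;

$| = 1;    # keep Perl's own output in order with that of the child processes

# Single string containing a metacharacter (>): the whole string goes to
# the shell, the redirection takes effect, and nothing appears on screen.
system("echo one > /dev/null");

# List form: no shell is involved, so ">" and "/dev/null" are ordinary
# arguments and echo prints them literally.
system("echo", "two", ">", "/dev/null");
```

Only the second call prints anything, namely the line `two > /dev/null`. The same reasoning applies to &: in the list form it would be passed to the program as a plain argument rather than putting anything in the background.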

zdim
  • So just to confirm one detail, when I am using ‘&’, Perl would still call wait but the wait would be immediately satisfied by the first shell returning? And in this case, is the creation of the first shell done *instead* of fork? Or is it something like fork->exec shell? – Stephen Nov 02 '17 at 17:16
  • @Stephen Correct, `system` always `wait`s but the shell that it created returns right after forking off a subshell so the `system` is all done right away (milliseconds?). When the `system` call `fork`s, the new process (child) `exec`s the external program to be run -- or it `exec`s a shell (which will then run that program) -- it turns into it. In this case it's a shell so the first shell replaces the child that `system` forked. – zdim Nov 02 '17 at 17:23
  • @Stephen A _very_ crude sketch: `system($cmd)` ==> `my $pid = fork // die $!; if ($pid==0) { exec $cmd }; wait;`, except that here it is the shell that is `exec`-ed. A shell is used to run a command whenever there are shell metacharacters in it; in your case it's both `>` and `&`. Those are shell things so Perl will start a shell and pass the command to it. If you use a list, `system(@cmd)`, then it won't -- it will directly run that program (using `execvp`). But then you can't have `>` and the like in it. – zdim Nov 02 '17 at 18:42
  • So I guess there must be two forks and two execs in total, right? Clearly we'd need two forks if there is a child and then a grandchild. And then the two execs would be for replacing the forked program with the new program in each case? – Stephen Nov 02 '17 at 21:23
  • @Stephen I'd imagine that, yes -- except that I'm not sure what exactly the shell does, and that depends on the system as well. But conceptually (and perhaps literally), yes. I edited the answer, hopefully it's better. – zdim Nov 02 '17 at 21:27