Updated:

I have a unit test (call it X) and a subprocess (Y). Y does not explicitly fork; X forks so that it can exec Y. Here is X:

warn("Starting Y...");
my $pid;

# fork returns undef on failure and 0 in the child, so test definedness.
die "failed to fork: $!" unless defined($pid = fork);
unless($pid) {
    { exec "exec /usr/bin/Y"; };
    warn "failed: $!";
    _exit(0);
}

# Do something to test Y.

warn("Stopping Y...");
my $st;
do {
    kill(15, $pid);
    $st = waitpid($pid, WNOHANG);
} while ($st > 0);

warn("Y has stopped");

The output I get from X is:

Starting Y... at ...
Some stuff.
Stopping Y... at ...
Y has stopped at ...

This would suggest Y has received the signal and stopped. But Y doesn't stop; it goes into a defunct state (what I previously referred to as "zombie like"). As far as I can tell, it is a finished process, not available to kill(0, $pid) but still visible through ps and /proc.
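For reference, a defunct entry is a process that has exited but whose exit status has not yet been collected by its parent. Note also that waitpid with WNOHANG returns 0 while the child is still running, so a loop conditioned on `$st > 0` ends after its first pass in that case. The following is only a minimal sketch of signalling a child and then reaping it with a blocking wait. The helper name `stop_and_reap` is hypothetical, and a sleeping child stands in for exec'ing the real /usr/bin/Y.

```perl
use strict;
use warnings;
use POSIX qw(_exit);

# Hypothetical helper: fork a stand-in child, send it SIGTERM, and reap it.
# Returns the forked pid, the pid waitpid() reported, and the number of the
# signal that terminated the child.
sub stop_and_reap {
    my $pid = fork;
    die "failed to fork: $!" unless defined $pid;   # undef, not 0, means failure

    if ($pid == 0) {
        # Child: placeholder for exec'ing the real server.
        sleep 60;
        _exit(0);
    }

    kill('TERM', $pid) or die "kill failed: $!";

    # Blocking wait: returns only once the child has exited. Collecting the
    # status is what removes the <defunct> entry from the process table.
    my $reaped = waitpid($pid, 0);
    return ($pid, $reaped, $? & 127);
}

my ($pid, $reaped, $signal) = stop_and_reap();
print "forked $pid, reaped $reaped, terminated by signal $signal\n";
```

A blocking wait like this will hang if the child never exits, so a real test might prefer polling with WNOHANG until a deadline and then escalating to SIGKILL.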

In the actual process Y, it is a different story:

sub goodbye {
    warn("Received signal");
    exit(0);
}

$SIG{TERM} = \&goodbye;
$SIG{INT} = \&goodbye;

warn("Starting server...");
my $d = HTTP::Daemon->new(
    LocalAddr => "127.0.0.1",
    LocalPort => 81,
    Reuse => 1
);

while (my $c = $d->accept) {
    # Do some stuff

    $c->close;
}

warn("Exiting...");

I never see either the Exiting... or the Received signal message in the output for Y, only the Starting server... message. Y is functioning and accepts all connections from X, and so passes the tests; it just fails to stop.

This is the output from some debug before and after X signals Y.

DEBUG before kill:
root      2843  2795  0 11:23 pts/4    00:00:00 /usr/bin/perl /bin/Y
root      2844  2843  4 11:23 pts/4    00:00:00 /usr/bin/perl /bin/Y

DEBUG after kill:
root      2843  2795  0 11:23 pts/4    00:00:00 [Y] <defunct>
root      2844     1  4 11:23 pts/4    00:00:00 /usr/bin/perl /bin/Y

Note the defunct state of Y, and notice that there are two processes. I didn't start two and Y doesn't fork, so I am assuming HTTP::Daemon forked. I then modified X to send a different signal, SIGINT, which is what I send when I hit Ctrl-C (and Ctrl-C actually stops the hung X and both Ys). This time I get the Received signal message from Y, but it still goes into a defunct state and there are still two processes.
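One difference between the two cases is worth spelling out: Ctrl-C makes the terminal deliver SIGINT to every process in the foreground process group, whereas kill(15, $pid) reaches a single PID. The sketch below shows one way a parent could signal a whole group instead. It is an illustration under stated assumptions, not a diagnosis of HTTP::Daemon: the helper name `stop_process_group` is hypothetical, and an extra fork inside the child stands in for the unexpected second Y.

```perl
use strict;
use warnings;

# Hypothetical helper: start a child in its own process group, have it fork
# once (standing in for the second Y), then signal the whole group.
# Returns the child's pid, the pid waitpid() reported, and the terminating signal.
sub stop_process_group {
    pipe(my $rd, my $wr) or die "pipe failed: $!";

    my $pid = fork;
    die "failed to fork: $!" unless defined $pid;

    if ($pid == 0) {
        close $rd;
        setpgrp(0, 0);                      # child leads a new process group
        my $gpid = fork;
        die "failed to fork: $!" unless defined $gpid;
        print {$wr} "ready\n" if $gpid;     # only the direct child reports in
        close $wr;
        sleep 60 while 1;                   # both processes idle until signalled
    }

    close $wr;
    my $ready = <$rd>;                      # wait until the group and grandchild exist
    close $rd;
    die "child never reported in" unless defined $ready;

    # A negative signal name tells Perl's kill to signal the process group.
    kill('-TERM', $pid) or die "kill failed: $!";
    my $reaped = waitpid($pid, 0);
    return ($pid, $reaped, $? & 127);
}

my ($pid, $reaped, $signal) = stop_process_group();
print "group leader $pid reaped as $reaped, terminated by signal $signal\n";
```

The grandchild is not a child of the caller, so the caller cannot waitpid on it; once its own parent exits it is reparented, as the PPID of 1 in the listing above shows, and is normally reaped from there.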

My question is targeted at HTTP::Daemon as opposed to Perl. What on earth is HTTP::Daemon (derived from IO::Socket::INET) actually doing to cause this chaos and why? Secondly, how do I adapt Y to cope with what HTTP::Daemon is doing?

Craig
  • Why don't you use the typical way of establishing a reaper proc for `$SIG{CHLD}`? E.g. as shown in http://perldoc.perl.org/perlipc.html (look for `sub REAPER`). – Moritz Bunkus Sep 05 '13 at 11:47
  • Your explanation makes no sense to me. A zombie process is one whose exit code hasn't been collected with `wait` or `waitpid`. To my knowledge, it has nothing to do with whether it has a child or not. In fact, 2844 is no longer a child of 2843 as you can see in the second screenshot. Did you use `ps` before or after the `waitpid`? – ikegami Sep 05 '13 at 13:44
  • Your explanation is rather poor. Are you saying process X is creating process Y (2843) which creates process Z (2844), and that the problem is that Z doesn't get killed when X kills Y? You could kill Z in Y's signal handler, or X could send the signal to Y's [process group](http://en.wikipedia.org/wiki/Process_group) – ikegami Sep 05 '13 at 13:50
  • X is running as a unittest for Y. X starts Y, sends some requests, closes the connection and signals Y. Y doesn't receive the signal, so X hangs. The process information I posted with the question is the state of the processes for Y before and after X sends SIGTERM. If I press Ctrl-C (send SIGINT) to X (the unittest that's hung and certainly did a waitpid), Y receives the SIGINT signal. There is no difference in how the signals are handled, so I am assuming HTTP::Daemon is setting its own handlers. I'll update the question. – Craig Sep 05 '13 at 18:09
  • Okay, I think I know what is happening. There is a bug in HTTP::Daemon where it is performing a `sysread()` on the socket. It requests more data than is available, so causes a select statement to hang. This is preventing any signals from being handled and prevents Y from being reaped. – Craig Sep 05 '13 at 18:33
  • @Moritz Bunkus: Yes, thanks, I am already aware of `$SIG{CHLD}`. This doesn't help at all, since the child would be occupying the port preventing the next batch of tests from running. It is better that it hangs and waits for the subprocess to finish. – Craig Sep 05 '13 at 18:37

1 Answer

Okay, I may not have asked this question in the best possible way the first time round. That doesn't mean it wasn't a valid one, and I knew that someone out there might hold the answer. But yet again, here I am, answering my own question. Rather than regurgitate the answer I found, here it is in all its glory:

Strange blocking issue with HTTP::Daemon

Craig