The function that's taking the time is written in C, which precludes you from using a custom signal handler safely.
You don't appear to be worried about terminating your program forcefully, so I suggest using alarm without a signal handler to kill the program if it takes too long to run, and using a wrapper to provide the correct response to Nagios.
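To illustrate what "alarm without a signal handler" does, here is a minimal sketch. The sub name run_child_with_alarm and the 10-second select call (a stand-in for the slow C function) are made up for this example. The child arms an alarm and blocks; because no handler is installed, the default action for SIGALRM terminates it, and the parent sees that in the low 7 bits of the wait status.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: fork a child that arms an alarm and then blocks,
# with no custom $SIG{ALRM} handler. Returns the child's wait status ($?).
sub run_child_with_alarm {
    my ($seconds) = @_;

    defined( my $pid = fork() )
        or die("fork failed: $!");

    if (!$pid) {
        $SIG{ALRM} = 'DEFAULT';          # Make sure the default action applies.
        alarm($seconds);                 # SIGALRM will terminate this process.
        select(undef, undef, undef, 10); # Stand-in for the slow C function.
        exit(0);                         # Not reached if the alarm fires first.
    }

    waitpid($pid, 0);
    return $?;
}

my $status = run_child_with_alarm(1);
my $signal = $status & 0x7F;  # Non-zero if the child was killed by a signal.
print("child was killed by signal $signal\n") if $signal;
```

Since nothing runs in the child in response to the signal, there is no handler that could interrupt the C code at an unsafe point.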
Change
/path/to/program some args
to
/path/to/timeout_wrapper 30 /path/to/program some args
The following is timeout_wrapper:
#!/usr/bin/perl
use strict;
use warnings;

use POSIX       qw( WNOHANG );
use Time::HiRes qw( sleep time );

# Polls the child until it exits or $timeout seconds have elapsed.
# Returns the child's wait status ($?), or undef on timeout.
sub wait_for_child_to_complete {
    my ($pid, $timeout) = @_;
    my $wait_until = time + $timeout;
    while (time < $wait_until) {
        waitpid($pid, WNOHANG)
            and return $?;
        sleep(0.5);
    }

    return undef;
}

{
    my $timeout = shift(@ARGV);

    defined( my $pid = fork() )
        or exit(3);

    if (!$pid) {
        # Child: the alarm survives exec, and with no handler installed,
        # SIGALRM terminates the program.
        alarm($timeout);  # Optional. The parent will handle this anyway.
        exec(@ARGV)
            or exit(3);
    }

    my $timed_out = 0;
    my $rv = wait_for_child_to_complete($pid, $timeout);
    if (!defined($rv)) {
        $timed_out = 1;
        # Send SIGALRM first; if the child is still around 5 seconds later,
        # send SIGKILL.
        if (kill(ALRM => $pid)) {
            $rv = wait_for_child_to_complete($pid, 5);
            if (!defined($rv)) {
                kill(KILL => $pid);
            }
        }
    }

    exit(2) if $timed_out;
    exit(3) if $rv & 0x7F;  # Killed by some signal.
    exit($rv >> 8);         # Expect the exit code to comply with the spec.
}
The wrapper uses the Nagios plugin return codes. Timeouts should return 2 (CRITICAL).
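The mapping from the child's wait status to a Nagios return code can be sketched on its own as follows. The sub name nagios_code and the constant names are made up for this illustration and are not part of the wrapper. It assumes the usual layout of `$?`: the terminating signal in the low 7 bits and the exit code in the high byte.

```perl
use strict;
use warnings;

# Nagios plugin return codes (constant names chosen for this example).
use constant { OK => 0, WARNING => 1, CRITICAL => 2, UNKNOWN => 3 };

# Hypothetical helper: map a wait status (the value of $? after waitpid)
# to a Nagios return code. $timed_out is true when the wrapper gave up
# waiting for the child.
sub nagios_code {
    my ($status, $timed_out) = @_;
    return CRITICAL if $timed_out;         # Timeouts are reported as CRITICAL.
    return UNKNOWN  if !defined($status);  # No status available.
    return UNKNOWN  if $status & 0x7F;     # Low 7 bits: terminating signal.
    return $status >> 8;                   # High byte: the child's exit code.
}

printf("%d\n", nagios_code(0,   0));  # Clean exit: prints 0
printf("%d\n", nagios_code(256, 0));  # exit(1) in the child: prints 1
printf("%d\n", nagios_code(14,  0));  # Killed by signal 14: prints 3
printf("%d\n", nagios_code(0,   1));  # Timed out: prints 2
```

Passing the exit code through unchanged only makes sense if the wrapped program already follows the same return code convention.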