I am using the Nagios check_logwarn plugin to capture changes to log files.

To test my setup, I have been manually adding the following line to the log file in question -

[Mon Mar 20 14:24:31 2017] [hphp] [12082:7f238d3ff700:32:000001] []
\nFatal error: entire web request took longer than 10 seconds and timed out in /var/cake_1.2.0.6311-beta
app/webroot/openx/www/delivery/postGetAd.php on line 483

The above should get caught by the following Nagios command, because it contains the keyword "Fatal" -

/usr/local/nagios/libexec/check_logwarn -d /tmp/logwarn_hiphop_error -p /mnt/log/hiphop/error_`(date +'%Y%m%d')`.log "^.*Fatal*"
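A side note on the pattern itself: in an extended regular expression, `Fatal*` means "Fata" followed by zero or more "l" characters, and the leading `^.*` adds nothing when the match is unanchored at the end. The sketch below uses `grep -E` as a stand-in for the plugin's regex engine (an assumption; it does not call check_logwarn), and `count_matches` is a hypothetical helper name.

```shell
# count_matches PATTERN LINE: print how many lines of input match PATTERN (0 or 1 here).
# grep -E is used as a stand-in for the plugin's regex engine (assumption).
count_matches() { printf '%s\n' "$2" | grep -Ec -- "$1" || true; }

count_matches '^.*Fatal*' 'Fatal error: entire web request took longer than 10 seconds'  # prints 1
count_matches '^.*Fatal*' 'Fata'   # prints 1: "l*" also accepts zero "l" characters
count_matches 'Fatal'     'Fata'   # prints 0: the plain keyword is stricter
```

For the log line in the question both forms match, so this does not explain the difference between the two cases; the plain keyword `Fatal` is simply the less surprising pattern.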

Output (as expected) -

Log errors: \nFatal error: entire web request took longer than 10 seconds and timed out in /var/cake_1.2.
0.6311-beta
\nFatal error: entire web request took longer than 10 seconds and timed out in /var/cake_1.2.0.6311-beta

Running this command directly works (case 1). However, when the same command is invoked through PHP's exec() in a script triggered by a Jenkins project, the error is not caught (case 2).

Following is the PHP code of case 2 -

$errorLogCommand = '/usr/local/nagios/libexec/check_logwarn -d /tmp/logwarn_hiphop_error -p /mnt/log/hiphop/error_'.$date.'.log "^.*Fatal*"';
$output = exec($errorLogCommand);
file_put_contents('/var/cake_1.2.0.6311-beta/deployment/deployment.log', "\n ".date("Y-m-d H:i:s")." Checked error key words in error_".$date.".log. command -> ".$errorLogCommand, FILE_APPEND);
if($output!="OK: No log errors found")
{
    file_put_contents('/var/cake_1.2.0.6311-beta/deployment/deployment.log', "\n ".date("Y-m-d H:i:s")." - Hiphop errors -> ".$output, FILE_APPEND);

    $failure=true;
    break;
}
else
{
    file_put_contents('/var/cake_1.2.0.6311-beta/deployment/deployment.log', "\n ".date("Y-m-d H:i:s")." - No Error found -> ".$output, FILE_APPEND);
}

Following is the output -

 2017-03-20 14:16:45 Checked error key words in error_20170320.log. command -> /usr/local/nagios/libexec/check_logwarn -d /tmp/logwarn_hiphop_error -p /mnt/log/hiphop/error_20170320.log "Fatal"
 2017-03-20 14:16:45 - No Error found -> OK: No log errors found

Note that although the same Nagios command (/usr/local/nagios/libexec/check_logwarn) is used as in case 1, the log error is unexpectedly not detected in this case.
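Separately from the detection problem, the PHP above compares the return value of exec(), which is only the last line of the command's output, against a fixed string. Checking the exit status is usually less fragile (in PHP, exec() can report it through its third argument). The sketch below shows the idea in shell, assuming check_logwarn follows the standard Nagios plugin exit-code convention (0 OK, 1 WARNING, 2 CRITICAL, anything else UNKNOWN); `nagios_status` is a hypothetical helper, and `sh -c 'exit N'` stands in for the real plugin call.

```shell
# nagios_status CMD [ARGS...]: run a command, discard its output, and
# translate its exit status using the standard Nagios plugin convention.
nagios_status() {
    "$@" >/dev/null 2>&1 && rc=0 || rc=$?
    case $rc in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Stand-ins for the real plugin invocation:
nagios_status sh -c 'exit 0'   # prints OK
nagios_status sh -c 'exit 2'   # prints CRITICAL
```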

Following are my observations of the contents of the internal state file which check_logwarn maintains - /tmp/logwarn_hiphop_error/mnt_log_hiphop_error_20170320.log -

When error is detected in case 1, following are the changes in the file -

Before running command

# logwarn 1.0.10 state for "/mnt/log/hiphop/error_20170320.log"
INODENUM="1208110246"
LINENUM="110"
POSITION="111627"
MATCHING="true"

After running command

# logwarn 1.0.10 state for "/mnt/log/hiphop/error_20170320.log"
INODENUM="1208110246"
LINENUM="116"
POSITION="112087"
MATCHING="false"

Also, following are the changes to the same file in case 2 -

Before running php file

# logwarn 1.0.10 state for "/mnt/log/hiphop/error_20170320.log"
INODENUM="1208110246"
LINENUM="102"
POSITION="109329"
MATCHING="true"

After

# logwarn 1.0.10 state for "/mnt/log/hiphop/error_20170320.log"
INODENUM="1208110246"
LINENUM="110"
POSITION="111627"
MATCHING="true"

I am not sure why the MATCHING parameter remains true in case 2 but becomes false in case 1, even though the error was actually matched in case 1.
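The state files shown above are plain `NAME="value"` lines, so they can be read from a shell to see how far each run advanced. This is a minimal sketch that copies the case 2 values from the question into temporary files (the file names `before` and `after` are made up for the illustration) and computes the difference.

```shell
tmp=$(mktemp -d)

# Case 2 state before running the PHP file (values copied from the question).
cat > "$tmp/before" <<'EOF'
# logwarn 1.0.10 state for "/mnt/log/hiphop/error_20170320.log"
INODENUM="1208110246"
LINENUM="102"
POSITION="109329"
MATCHING="true"
EOF

# Case 2 state after running the PHP file.
cat > "$tmp/after" <<'EOF'
# logwarn 1.0.10 state for "/mnt/log/hiphop/error_20170320.log"
INODENUM="1208110246"
LINENUM="110"
POSITION="111627"
MATCHING="true"
EOF

# Source each file inside a command substitution so the variables do not leak.
b_line=$(. "$tmp/before"; echo "$LINENUM")
b_pos=$(. "$tmp/before"; echo "$POSITION")
a_line=$(. "$tmp/after"; echo "$LINENUM")
a_pos=$(. "$tmp/after"; echo "$POSITION")

lines_advanced=$((a_line - b_line))
bytes_advanced=$((a_pos - b_pos))
echo "advanced $lines_advanced lines, $bytes_advanced bytes"   # advanced 8 lines, 2298 bytes
```

One detail visible in the numbers: the case 2 "after" state (line 110, position 111627) is exactly the case 1 "before" state, so the scanner did advance over new lines in case 2 even though nothing was reported.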

Update

I tried wrapping the command in escapeshellcmd() to ensure that the regex is not being stripped out -

$output = exec(escapeshellcmd($errorLogCommand)); 

but still no change in output.
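A caveat about this attempt: escapeshellcmd() backslash-escapes shell metacharacters such as `^` and `*`, and inside double quotes the shell keeps those backslashes, so the plugin would receive a different pattern. The sketch below illustrates the effect, again using `grep -E` as a stand-in for the plugin's regex engine (an assumption); `count_matches` is a hypothetical helper.

```shell
# count_matches PATTERN LINE: print how many lines of input match PATTERN.
count_matches() { printf '%s\n' "$2" | grep -Ec -- "$1" || true; }

original_pattern="^.*Fatal*"
# What the shell passes on after ^ and * have been backslash-escaped:
# within double quotes, a backslash before ^ or * stays in the string.
escaped_pattern="\^.\*Fatal\*"

line='Fatal error: entire web request took longer than 10 seconds'

count_matches "$original_pattern" "$line"   # prints 1
count_matches "$escaped_pattern" "$line"    # prints 0: now requires literal ^ and * characters
```

If escaping is wanted, applying escapeshellarg() to the pattern argument alone, as suggested in the comments, avoids altering the regex.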

Update 2

Found that I had line breaks in the log line I was manually adding. Removing them fixed the problem consistently when running the PHP file from the command line. However, the problem is still consistently reproducible in case 2, where I trigger the project via Jenkins and this file is called from one of the AWS CodeDeploy hooks.

Sandeepan Nath
  • If you look at the output in the line that puts the command into the logfile, with this: Checked error key words.... it looks like the regex is being stripped out. Maybe you should try http://php.net/manual/en/function.escapeshellarg.php – Nagios Support Mar 20 '17 at 15:24
  • Ok I tried with `escapeshellcmd` instead - `$output = exec(escapeshellcmd($errorLogCommand));` but still no change in output. – Sandeepan Nath Mar 21 '17 at 07:12
  • Got it! It was happening because I had line breaks in the log line I was manually adding. However, not sure how the direct command is working with those line breaks present. – Sandeepan Nath Mar 22 '17 at 07:37
  • Well, it seems this is not going to get solved so easily. The problem got fixed for manual invocation of the PHP file, but on invocation via Jenkins, I am still getting the same problem consistently. – Sandeepan Nath Mar 22 '17 at 14:07
  • Actually, I realized that the same check_logwarn command was also being run at the beginning of the afterInstall step. That run brings the internal tracker up to the current instant, so that only the log lines written during afterInstall are scanned by the check at the end of the step, as a sanity check. I think the information provided in the question may not have been sufficient to find this out. When I was manually running one file, only the final check_logwarn command was being run, which rightly captured the error log. – Sandeepan Nath Mar 30 '17 at 09:51
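The explanation in the last comment can be reproduced with a toy scanner: any tool that remembers its position reports a given line only once, so an earlier run "uses up" the test line. This is a hypothetical stand-in written for illustration (`scan_new` is not logwarn and only tracks a line count), not the plugin's actual implementation.

```shell
# scan_new LOG STATE PATTERN: print lines matching PATTERN that were appended
# to LOG since the previous call, then record the new line count in STATE.
scan_new() {
    log=$1 state=$2 pattern=$3
    last=0
    if [ -f "$state" ]; then last=$(cat "$state"); fi
    total=$(( $(wc -l < "$log") ))
    tail -n +"$((last + 1))" "$log" | grep -E -- "$pattern" || true
    echo "$total" > "$state"
}

tmp=$(mktemp -d)
log="$tmp/error.log"
state="$tmp/state"

echo 'startup ok' > "$log"
echo 'Fatal error: request timed out' >> "$log"   # test line added by hand

first=$(scan_new "$log" "$state" 'Fatal')    # run at the start of the step: sees the line
second=$(scan_new "$log" "$state" 'Fatal')   # run at the end of the step: nothing new

echo "first run:  ${first:-<no match>}"
echo "second run: ${second:-<no match>}"
```

To test the end-of-step check in such a setup, the test line has to be appended after the initial run, not before it.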

1 Answer

The logwarn documentation mentions support for a negative checking expression.

Please try prepending an exclamation mark (!) to the pattern string to exclude, rather than include, these matches.

Otávio Barreto
  • I found the issue - http://stackoverflow.com/questions/42906177/unable-to-capture-changes-to-log-file-via-nagios-check-logwarn-plugin-command-in#comment73308229_42906177. Not sure how prepending an exclamation mark is going to help here.. – Sandeepan Nath Mar 30 '17 at 10:04