
I wrote a custom check in bash and run it on 4 different servers. On two of them it works fine; on the other two it fails when checking whether a file exists with:

if [ ! -f $LOGFILE ]

By "fail" I mean that on those two servers the script decides that $LOGFILE doesn't exist, even though it does.

All four servers have the same configuration, permissions, etc. The file does exist on all servers. When run manually, there is no error. When run manually as the nagios or nrpe users, there is no error. It only fails when run remotely via nagios with check_nrpe -H ... -c ...
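To compare what the check sees when run manually and when run through NRPE, something like the following could be dropped into the check temporarily. This is only a minimal sketch: the function name dump_context, the default log path and the debug output path are all hypothetical, not taken from my real script.

```shell
#!/bin/bash
# Debugging sketch: record the context the check runs in, so the output
# of a manual run can be diffed against a run triggered through NRPE.
# All names and paths below are hypothetical placeholders.

LOGFILE="${LOGFILE:-/var/log/myapp.log}"          # assumed log path
DEBUG_OUT="${DEBUG_OUT:-/tmp/check_context.txt}"  # assumed debug file

dump_context() {
    # $1 = file the check tests for, $2 = where to write the report
    {
        echo "user=$(id -un) uid=$(id -u)"
        echo "cwd=$(pwd)"
        echo "PATH=$PATH"
        # Brackets make an empty or whitespace-padded value visible.
        echo "LOGFILE=[$1]"
        if [ -f "$1" ]; then
            echo "logfile_exists=yes"
        else
            echo "logfile_exists=no"
        fi
        # Show permissions, or the error message if the file is unreachable.
        ls -ld -- "$1" 2>&1 || true
    } > "$2"
}

dump_context "$LOGFILE" "$DEBUG_OUT"
echo "OK - context written to $DEBUG_OUT"
```

Running this once by hand and once via check_nrpe on a failing server, then diffing the two reports, should show whether the user, working directory, PATH or the value of the variable differs between the two situations.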

I thought that perhaps my bash skills were a bit rusty, so I rewrote the check in Python. Now the same two servers still fail, but the error is:

NRPE: Unable to read output

Again, the Python version is the same on all servers. However, I found that the servers with the error log this message:

$ sudo grep nagios /var/log/messages
Jul 19 11:09:15 app-a abrt: detected unhandled Python exception in '/usr/local/nagios/libexec/check_redirects'

As I said, I have already checked for differences in nagios configuration (both on the nagios master and the clients), in permissions, in python versions... Everything seems the same.

I found lots of questions about different checks working/failing on the same server. This is the exact same check working on some servers but not others.

Any thoughts would be much appreciated. Thanks.

ilvidel