I have a problem similar to the one in the question "Memcached status returning 'memcached dead but pid file exists'", but I can't solve it.

My Nagios client runs CentOS 6.4, and NRPE has not worked since a power outage forced an unclean shutdown. When I check the service status, I get the following:

/sbin/service nrpe status
nrpe dead but pid file exists

Also, the file /var/run/nrpe.pid contains a process ID, but when I try to kill that process, I get:

-bash: kill: (3879) - No such process

In /var/lock/subsys/ there is an empty nrpe file. Please help me solve this issue.

Zoran

2 Answers

When NRPE shuts down cleanly, it removes its .pid file. That is why the init script treats an existing pid file as a sign that NRPE should still be running.

In your case the machine simply lost power, so NRPE never got the chance to clean up its pid file. You can just remove the stale .pid file and start NRPE again.
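A minimal sketch of that cleanup, assuming the paths mentioned in the question (/var/run/nrpe.pid and /var/lock/subsys/nrpe) are the ones your init script really uses. The function names `is_stale` and `clean_stale` are made up for illustration, and the commands need to run as root:

```shell
#!/bin/sh
# Hypothetical helpers; the default paths come from the question and may differ on your system.

# is_stale PIDFILE: succeeds if PIDFILE exists but names no running process.
is_stale() {
    pidfile=$1
    [ -f "$pidfile" ] || return 1          # no pid file, nothing is stale
    pid=$(cat "$pidfile" 2>/dev/null)
    case $pid in
        ''|*[!0-9]*) return 0 ;;           # empty or non-numeric content: treat as stale
    esac
    # kill -0 sends no signal; it only checks that the process can be signalled.
    if kill -0 "$pid" 2>/dev/null; then
        return 1                           # process is alive, leave the files alone
    fi
    return 0
}

# clean_stale PIDFILE LOCKFILE: remove both files only when the pid file is stale.
clean_stale() {
    pidfile=$1
    lockfile=$2
    if is_stale "$pidfile"; then
        rm -f "$pidfile" "$lockfile"
        echo "removed stale files"
    else
        echo "nothing to clean"
    fi
}

# Example (run as root):
#   clean_stale /var/run/nrpe.pid /var/lock/subsys/nrpe && /sbin/service nrpe start
```

Checking the PID before deleting anything avoids removing the pid file of a daemon that is in fact still running.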

replay
  • Sorry, I didn't specify that I had already done that: I deleted nrpe.pid and then restarted the service, but without any success. The problem remains – Zoran Aug 21 '13 at 12:55
  • Are you sure you are deleting the correct pid file? Maybe you should verify by looking at the init script. – replay Aug 21 '13 at 12:57
  • I am positive - /var/run/nrpe.pid - that is the file – Zoran Aug 21 '13 at 13:01
  • Please tell me how to check it in the init script. I'm not sure where to look or what to look for – Zoran Aug 21 '13 at 13:03
  • Basically, you have to find where the path to the .pid file is defined. Somewhere in the init script there should be a line that sets the location of the pid file, and that is probably the one it's using. – replay Aug 21 '13 at 13:16
  • @Zoran You open the init script with a text editor, and look for the line that specifies what PID file it checks against. – voretaq7 Aug 21 '13 at 13:17
  • @voretaq7 - It checks /var/lock/subsys/nrpe, which is an empty file, so I am back at the beginning – Zoran Aug 21 '13 at 13:23
  • @Zoran Have you tried just deleting `/var/lock/subsys/nrpe` and then starting the service? – replay Aug 21 '13 at 13:26
  • @mauro.stettler - Yes, I did, and I did it again a few minutes ago. Same result, unfortunately – Zoran Aug 21 '13 at 13:28
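
The init script check discussed in the comments above could be sketched as follows. The helper name `find_pid_refs` is made up for illustration, and /etc/init.d/nrpe is assumed to be the script that `service nrpe` runs:

```shell
#!/bin/sh
# Hypothetical helper: list the lines of an init script that mention a PID or lock file.
find_pid_refs() {
    # -n prints line numbers, -i ignores case, -E enables the alternation pattern.
    grep -n -i -E 'pidfile|\.pid|lock/subsys' "$1"
}

# Example (script path is an assumption):
#   find_pid_refs /etc/init.d/nrpe
```

If the paths it prints differ from /var/run/nrpe.pid or /var/lock/subsys/nrpe, those are the files to inspect instead.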

Can you delete the pid file (likely in /var/run)?

TheFiddlerWins
  • Yes, I can, and I did that already, but the problem remains – Zoran Aug 21 '13 at 12:56
  • Does "service nrpe stop" work? You may want to try running the command with strace so you can see what's happening. – TheFiddlerWins Aug 21 '13 at 16:47
  • Thanks for the tip, but strace -p reports that no such process exists (even though the nrpe.pid file contains a process number). – Zoran Aug 22 '13 at 08:28
  • I am a little confused: does nrpe.pid exist or not? Can you update the locate database and make sure that there are no other pid files? Perhaps it's not in the default directory. – TheFiddlerWins Aug 22 '13 at 12:47
  • The .pid file does exist, but nrpe is "dead" according to the message I receive when I ask for the status of the nrpe process. Deleting, updating, restarting: nothing helps. I had already tried everything (including reinstalling nrpe) before I asked for help here – Zoran Aug 23 '13 at 06:27
  • Please change nrpe to not auto-start via chkconfig, make sure it's stopped (service nrpe stop), and then do "locate nrpe.pid" and delete the results. Restart your box, then run "service nrpe status" and post the results. – TheFiddlerWins Aug 23 '13 at 13:47