Nagios retry interval when they have OK or UP state

Question

I configured a Linux host on a Nagios monitoring server using the NRPE plugin. For this I followed the guide at the URL below:

http://www.tecmint.com/how-to-add-linux-host-to-nagios-monitoring-server/

I need to check some services on the Linux host. To monitor the host and its services, I am using the Nagios log (/usr/local/nagios/var/nagios.log).

At first everything is fine, and the Nagios log shows the following status:

SERVICE ALERT: test.testing.local;Service Tomcat;OK;SOFT;6;TOMCAT OK

When the service status changes to a non-OK state, the log shows:

SERVICE ALERT: test.testing.local;Service Tomcat;CRITICAL;SOFT;4;TOMCAT CRITICAL

But if the service status does not change to a non-OK state, I want the log to show the following again after 1 minute:

SERVICE ALERT: test.testing.local;Service Tomcat;OK;SOFT;6;TOMCAT OK

and currently that is not happening.

The contents of my services.cfg file are given below:

define service {
    host_name                       test.testing.local
    service_description             Service Tomcat
    check_command                   check_nrpe!check_service_tomcat
    max_check_attempts              10
    check_interval                  1
    retry_interval                  1
    active_checks_enabled           1
    check_period                    24x7
    register                        1
}

I am using Nagios 4.2.2 and CentOS 7.

Did the answer help get it working? Let me know if you need more help — lgroschen, Dec 22 '16 at 20:21

score 2 · Answer 1 · edited Jun 20 '20 at 09:12

I think what you are after is described in the Nagios 4 Core docs:

check_interval: This directive is used to define the number of "time units" between regularly scheduled checks of the host. Unless you've changed the interval_length directive from the default value of 60, this number will mean minutes. More information on this value can be found in the check scheduling documentation.

retry_interval: This directive is used to define the number of "time units" to wait before scheduling a re-check of the hosts. Hosts are rescheduled at the retry interval when they have changed to a non-UP state. Once the host has been retried max_check_attempts times without a change in its status, it will revert to being scheduled at its "normal" rate as defined by the check_interval value. Unless you've changed the interval_length directive from the default value of 60, this number will mean minutes. More information on this value can be found in the check scheduling documentation.
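To make the "time units" wording concrete, here is a minimal Python sketch of the arithmetic. The helper name `units_to_seconds` is hypothetical and not part of Nagios; the only fact it relies on is the default interval_length of 60 stated in the docs quoted above.

```python
# Hypothetical helper for illustration only; Nagios does this conversion internally.
DEFAULT_INTERVAL_LENGTH = 60  # seconds per "time unit" (the documented default)


def units_to_seconds(units, interval_length=DEFAULT_INTERVAL_LENGTH):
    """Convert a check_interval or retry_interval value to seconds."""
    return units * interval_length


# Values taken from the service definition in the question.
check_interval = 1
retry_interval = 1

print(units_to_seconds(check_interval))  # seconds between regular checks
print(units_to_seconds(retry_interval))  # seconds between re-checks
```

With the default interval_length, both values in the question work out to one minute.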

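With check_interval set to 1, the service is checked every minute while it is OK (which is pretty frequent; the 60 mentioned above is the default interval_length in seconds, so one time unit is one minute). Once the service changes to a non-OK state, it is re-checked every retry_interval (also 1 minute in your config) up to 10 times (max_check_attempts in your config). If the status has not changed by then, it goes back to being scheduled at the normal check_interval rate.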
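As a rough sketch of the rule described in the docs quoted above, the choice between the two intervals could be modelled like this in Python. This is a simplified model under my own assumptions, not the actual Nagios scheduler, and the function name `next_check_delay` is hypothetical.

```python
def next_check_delay(state_ok, attempt, max_check_attempts,
                     check_interval, retry_interval):
    """Return the delay, in time units, until the next check.

    Simplified model: a non-OK service is re-checked at retry_interval
    until max_check_attempts is reached; otherwise check_interval applies.
    """
    if state_ok:
        return check_interval
    if attempt < max_check_attempts:
        return retry_interval
    return check_interval


# Values from the question: both intervals are 1, max_check_attempts is 10.
print(next_check_delay(True, 1, 10, 1, 1))    # service OK
print(next_check_delay(False, 4, 10, 1, 1))   # non-OK, still retrying
print(next_check_delay(False, 10, 10, 1, 1))  # non-OK, attempts exhausted
```

Because check_interval and retry_interval are both 1 in the question's config, this model gives a one-minute delay in every case; the two intervals only behave differently when their values differ.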
