Munin Disk Latency Alerts

Question

I have set up my Munin server and alerts and have tested them as well. I have set the alerts for disk usage as follows:

df._dev_mapper_centos_root.warning 90
df._dev_md126p2.warning 90
df._dev_md126p1.warning 90
df._dev_mapper_centos_home.warning 90

I received the alert for the above in my email (for testing I used lower values):

>  sha :: Server2 :: Disk usage in percent
>         WARNINGs: /boot is 33.48 (outside range [:33]), / is 17.95 (outside range [:17]), /boot/efi is 4.73 (outside range [:4]).
> 
> sha :: Server1 :: Disk usage in percent
>         OKs: /boot is 33.48, / is 17.95, /boot/efi is 4.73

The problem I am facing now is that I am getting Disk Latency alerts, and I cannot find any settings with which to change their thresholds. Here are a couple of alerts triggered by Munin:

> sha :: Server1 :: Disk latency per device :: Average latency
> for /dev/centos/swap
>         WARNINGs: Write IO Wait time is 4.89 (outside range [0:3]).
> 
> sha :: Server1 :: Disk latency per device :: Average latency
> for /dev/centos/home
>         WARNINGs: Write IO Wait time is 10.64 (outside range [0:3])


Even though the graph for Disk latency per device is present for this server, when I telnet to the node I don't see any plugin in the list that would fetch this value:

telnet 192.168.10.252 4949
Trying 192.168.10.252...
Connected to 192.168.10.252.
Escape character is '^]'.
# munin node at localhost.localdomain
list
acpi cpu df df_inode entropy exim_mailqueue forks fw_conntrack 
fw_forwarded_local fw_packets hddtemp_smartctl if_enp2s0 if_err_enp2s0 
interrupts irqstats load memory netstat open_files open_inodes 
postfix_mailqueue proc_pri processes swap threads uptime users vmstat

I am not sure whether I have explained this properly, and sorry if you think it's a silly question. I just want to either stop these alerts altogether or raise the threshold. I hope I will get some help here.

H4R0 · Accepted Answer · 2017-06-20T07:23:32.577


It's probably the diskstats_latency plugin. Try the following:

diskstats_latency.centos_home.avgwrwait.warning 0:15
diskstats_latency.centos_home.avgrdwait.warning 0:15
diskstats_latency.centos_swap.avgwrwait.warning 0:15
diskstats_latency.centos_swap.avgrdwait.warning 0:15

Please note that this is for both write (avgwrwait) and read (avgrdwait) latency.

I set the range to 0:15, which raises the upper limit from 3 to 15 and should almost completely disable the warnings, as you wanted.

Don't forget to restart the Munin daemon:

systemctl restart munin-node
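To see which of the values from the alerts a given range would still flag, here is a minimal sketch. It assumes thresholds are read as min:max with either side optional, and that a bare number is an upper limit (which is what the `[:33]` in the question's disk usage alert suggests). `in_range` is a hypothetical helper for illustration, not part of Munin.

```shell
#!/bin/sh
# in_range VALUE RANGE
# Exit status 0 if VALUE lies inside RANGE, 1 otherwise.
# RANGE is "min:max"; either side may be empty. A bare number is
# treated as an upper limit (assumption based on the "[:33]" alert).
in_range() {
  awk -v v="$1" -v r="$2" 'BEGIN {
    n = split(r, b, ":")
    if (n == 1) { lo = ""; hi = b[1] }   # "90" behaves like ":90"
    else        { lo = b[1]; hi = b[2] }
    if (lo != "" && v + 0 < lo + 0) exit 1   # below the minimum
    if (hi != "" && v + 0 > hi + 0) exit 1   # above the maximum
    exit 0
  }'
}

# Values taken from the alerts in the question:
for value in 4.89 10.64; do
  for range in 0:3 0:15; do
    if in_range "$value" "$range"; then
      echo "$value within [$range]: ok"
    else
      echo "$value outside [$range]: warning"
    fi
  done
done
```

With 0:15, both write wait values from the alert mails fall inside the range, so they would no longer raise a warning.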

edited Jun 20 '17 at 07:23

answered Jun 20 '17 at 07:12


Where should I write these settings? I tried a couple of files but had no luck. – Henrik Heino Oct 19 '18 at 09:49

@HenrikHeino Set it in the Munin master config, not in the node config. On Debian (Ubuntu etc.) it's /etc/munin/munin.conf (you could also create your own config under /etc/munin/munin-conf.d/example.conf). Keep in mind that you have to change "centos_home" and "centos_swap" to match your own devices. – H4R0 Oct 20 '18 at 11:42
@H4R0 Do you know if it's possible to do this change on the monitored machine? We have a situation where admin of monitored server might not have permissions to the Munin master server. – Henrik Heino Oct 22 '18 at 05:47
@HenrikHeino What do you want to accomplish? Do you want to hide the warnings for a single drive or for all of them? Maybe it works if you create the munin.conf on the node; otherwise you could always change the default values within the plugin on the node. It's located at /etc/munin/plugins/diskstats: line 890 has "avgrdwait.warning 0:3" and line 896 has "avgwrwait.warning 0:3". – H4R0 Oct 27 '18 at 19:26
@H4R0 We have several servers, and on some servers, for some mount points, the disk latency warning limits should be increased. It would also be nice if I could modify the limit by creating a file in `/etc/munin/plugin-conf.d/`. That would be easy to document and maintain. Modifying plugin files that come with the operating system sounds like a bad idea IMHO. I was able to modify limits easily with the disk usage plugin, so I wonder why it's so hard with the disk latency plugin. – Henrik Heino Oct 29 '18 at 06:04
@HenrikHeino I tried several options but had no success. You can fetch the node configuration using 'echo "config diskstats" | nc localhost 4949 | less' and check whether the config got applied. – H4R0 Oct 29 '18 at 20:18
@H4R0 - How do I know / work out what to change 'centos_home' to (on Ubuntu)? – Vaughany May 17 '19 at 11:03

@Vaughany It's the partition name. It should be in the alert mail and on the generated web page, and it can be listed with the config command. Execute the following: echo "config diskstats" | nc localhost 4949 | grep "avgwait" | less – H4R0 May 17 '19 at 13:09
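Building on the config command from the comments, the sketch below turns the node's output into the key names used in the answer (for example `diskstats_latency.centos_home.avgwrwait.warning`). It assumes the plugin prints `multigraph <graph>` header lines followed by `<field>.warning <range>` lines; `thresholds` is a hypothetical helper name, and the sample input is made up to match the devices in the question.

```shell
#!/bin/sh
# thresholds: read plugin config output on stdin and print one
# "<graph>.<field>.warning <range>" line per threshold found.
# Assumption: "multigraph <graph>" headers precede the field lines.
thresholds() {
  awk '
    $1 == "multigraph" { graph = $2; next }        # remember the current graph
    $1 ~ /\.(warning|critical)$/ && graph != "" {  # threshold line within a graph
      print graph "." $1, $2
    }
  '
}

# Illustrative sample input (not captured from a real node):
printf '%s\n' \
  'multigraph diskstats_latency.centos_home' \
  'graph_title Average latency for /dev/centos/home' \
  'avgrdwait.warning 0:3' \
  'avgwrwait.warning 0:3' | thresholds

# Against a live node, something like the following should work. Sending
# "cap multigraph" first is, as far as I know, what makes multigraph
# plugins such as diskstats available to a plain client:
# printf 'cap multigraph\nconfig diskstats\nquit\n' | nc localhost 4949 | thresholds
```

The printed keys can then be copied into the master config with a wider range, replacing 0:3 with whatever limit suits the device.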
