
I can't seem to get watchdog's 'max-load-*' features working on Ubuntu 14.04 (in a Vagrant VM w/ Virtualbox).

From my understanding, it should be fairly straightforward. Here's how I'm testing it:

  • Set 'max-load-1' to 0.5 for testing purposes (and added -v -f to the start params) [1]
  • Restarted the watchdog service (service watchdog restart)
  • Ran burnP6 to simulate a heavily loaded server (the test VM only has one core, so the load spikes pretty fast)
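One quick sanity check for the first step is to confirm that the max-load-* lines are really active (not commented out) in the file the daemon reads. The following is only a minimal sketch: `show_max_load` is a hypothetical helper, not part of watchdog, and the path /etc/watchdog.conf is an assumption about where the config lives on this box.

```shell
#!/bin/sh
# Hypothetical helper (not part of watchdog): print the uncommented
# max-load-* lines of a watchdog.conf-style file.
show_max_load() {
    # $1: path to the config file
    grep -E '^[[:space:]]*max-load-(1|5|15)[[:space:]]*=' "$1"
}

# Assumed config location; adjust if the daemon is started with another file.
if [ -r /etc/watchdog.conf ]; then
    show_max_load /etc/watchdog.conf || echo "no active max-load-* lines found"
fi
```

If nothing is printed for max-load-1, the daemon would not be checking the 1-minute load at all.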

The expected result (to me) would be that the box reboots once the load average reaches 0.5, but it just keeps running and the load plateaus at around 1.
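To confirm from the outside that the 1-minute load really is above the threshold while burnP6 runs, the value can be read straight from /proc/loadavg. This is a rough sketch only: `load_exceeds` is a hypothetical helper, and it illustrates the comparison I expect, not how the watchdog daemon actually implements its check.

```shell
#!/bin/sh
# Hypothetical helper (not part of watchdog): print "yes" if the 1-minute
# load average in a /proc/loadavg-style line is above the threshold.
load_exceeds() {
    # $1: a line in /proc/loadavg format, $2: threshold (may be fractional)
    printf '%s\n' "$1" |
        awk -v max="$2" '{ if ($1 + 0 > max + 0) print "yes"; else print "no" }'
}

# Check the live value (Linux only); 0.5 is the test threshold from above.
if [ -r /proc/loadavg ]; then
    echo "1-minute load over 0.5: $(load_exceeds "$(cat /proc/loadavg)" 0.5)"
fi
```

Running this in a loop alongside `watchdog -v` output would show whether the daemon and the kernel agree on the load figure.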

I've also experimented with the 'ping' feature, and that one works like a charm. If I set it and then remove the default gateway, the VM reboots as expected.

I've also tried loading the 'softdog' module and explicitly setting 'watchdog-device = /dev/watchdog', but it doesn't seem to make any difference.
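For reference, this is roughly how the softdog module and the device node can be verified. It is a sketch under assumptions: `module_listed` is a hypothetical helper that scans input in /proc/modules format, and the device test only checks that the node exists as a character device without opening it.

```shell
#!/bin/sh
# Hypothetical helper: succeed if module $1 appears in the first column of
# /proc/modules-style input read from stdin.
module_listed() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

if [ -r /proc/modules ]; then
    if module_listed softdog < /proc/modules; then
        echo "softdog is loaded"
    else
        echo "softdog is not listed"
    fi
fi

# Only tests for a character device node; it does not open the device.
if [ -c /dev/watchdog ]; then
    echo "/dev/watchdog is present"
fi
```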

Am I missing something?

[1] watchdog.conf: https://gist.github.com/vpetersson/8487f45826216f556e89

vpetersson
  • Did you enable the watchdog in the kernel? Ex. `CONFIG_WATCHDOG=y` and pals. That is what actually reboots the server - the watchdog daemon just periodically tells the kernel watchdog everything is ok. – Brian Jul 19 '15 at 13:46
  • Never re-compiled the kernel (stock Ubuntu kernel), but since it worked with the `ping` functionality, I'd say it's safe to assume that it is working on a kernel level. Also, I have the 'softdog' module loaded, and '/dev/watchdog' is present. – vpetersson Jul 19 '15 at 13:48
  • It might be seeing the leading 0 and turning it off. Would be a bug but very few would ever want to reboot a server for a load average less than 1. Try a higher value. – Brian Jul 19 '15 at 13:54
  • Another reference lists the max-load-* options as integers. So a 0.5 would likely end up as 0 disabling the check. – Brian Jul 19 '15 at 13:59
  • That's a fair point, but unfortunately doesn't seem to do the trick either. https://gist.github.com/vpetersson/ec59baaa583e7e25e86b – vpetersson Jul 19 '15 at 13:59
  • Does the server reboot if the watchdog daemon is killed off? `pkill -9 watchdog` – Brian Jul 19 '15 at 14:09
  • Yup, if I `pkill -9 watchdog` the VM does reboot: https://gist.github.com/vpetersson/dfd17083c39fb37ad80b – vpetersson Jul 20 '15 at 08:49

0 Answers