Why doesn't keepalived track_script restart keepalived when HAProxy drops out?

Question

UPDATE: it seems a custom track script is needed to actually fail over and restart when HAProxy dies. Posted as an answer.

I have keepalived (plus VIP) + haproxy + galera_node set up on the same host (config below), using the same track script found in dozens of examples around the internet. What I don't get is this: when I kill the haproxy process running on a given node, the following shows up in /var/log/syslog:

Keepalived_vrrp[29230]: VRRP_Instance(250) Entering MASTER STATE
Keepalived_vrrp[29230]: VRRP_Script(check_haproxy) failed

Makes sense. It did fail. But the odd thing is that keepalived neither relinquishes the VIP nor enters BACKUP state, which is the desired behavior (below, the VIP is still there several minutes after haproxy went away). Am I misunderstanding how keepalived is designed to work, or is there some other apparent error in my config? (keepalived version is 1.2.7 on Ubuntu 14.04)

# ip a
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 0e:02:c6:83:82:74 brd ff:ff:ff:ff:ff:ff 
    inet 10.10.10.202/24 brd 10.20.18.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 10.10.10.250/32 scope global eth0
       valid_lft forever preferred_lft forever
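
For repeated checks on each node, the same test can be scripted. This is only a minimal sketch, assuming the VIP and interface shown above; `has_vip` is a hypothetical helper name, not part of keepalived or iproute2.

```sh
#!/bin/sh
# Hypothetical helper (illustrative name): reads `ip addr` output on stdin and
# succeeds if the given IPv4 address is configured.
has_vip() {
    # -F treats the pattern as a fixed string, so the dots are not regex
    # wildcards; the trailing "/" keeps 10.10.10.25 from matching 10.10.10.250.
    grep -qF "inet $1/"
}
```

Usage on a node would look like `ip addr show dev eth0 | has_vip 10.10.10.250 && echo "VIP is on this node"`.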

keepalived.conf

global_defs
{
  router_id    mynode
}

vrrp_script check_haproxy
{
   script      "killall -0 haproxy"
   interval    1
   fall        2
   weight      2
}

vrrp_instance 200
{
   virtual_router_id 200
   advert_int   1
   nopreempt
   priority     90
   state        BACKUP
   interface    eth0
   notify       /etc/keepalived/log_status.sh

   virtual_ipaddress
   {
     10.10.10.250 dev   eth0
   }

   track_script
   {
     check_haproxy
   }
}

Are you just trying to monitor the `haproxy` service on a single node and restart it on failure? — kaizenCoder, May 17 '19 at 02:23
It's been a while (and this project has since been shelved), but IIRC, when the HAProxy process was killed, the VIP was still sticking to that node. I wanted the VIP to be taken over by one of the other 2 galera nodes. It was like `keepalived` didn't know HAProxy was gone, so the VIP wouldn't switch. What did solve it (for this specific stack and version anyway) was the script in my answer below. I should probably accept it so it doesn't show up in the "unanswered" list. — Server Fault, May 17 '19 at 14:34

score 0 · Accepted Answer · answered Feb 02 '18 at 15:30

I'm now using a custom track_script, which fixes this: it restarts keepalived if haproxy isn't running. It seems that, with the default track script, keepalived only enters BACKUP state on startup/restart, not while it's running ¯\_(ツ)_/¯

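#!/bin/sh
# Restart keepalived if haproxy has died.
# pidof prints nothing when no haproxy process exists.
pid=$(/bin/pidof haproxy)
# Use ">/dev/null 2>&1", not "&>": plain sh (dash on Ubuntu) parses "&>" as "run in background".
test -z "$pid" && { service keepalived restart >/dev/null 2>&1; exit 1; }
exit 0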
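
To try the logic without root or a real haproxy, the check can be wrapped in a function whose commands are overridable. This is a rough sketch only: `check_and_restart`, `PIDOF_CMD` and `RESTART_CMD` are hypothetical names introduced here for illustration, and the defaults simply mirror the commands in the script above.

```sh
#!/bin/sh
# Hypothetical wrapper around the check above; all names are illustrative.
# The defaults mirror the original script and can be overridden for a dry run.
PIDOF_CMD=${PIDOF_CMD:-"/bin/pidof haproxy"}
RESTART_CMD=${RESTART_CMD:-"service keepalived restart"}

check_and_restart() {
    # The variables are deliberately unquoted so they split into command + args.
    # Non-empty output from the pidof command means haproxy is alive.
    if [ -n "$($PIDOF_CMD 2>/dev/null)" ]; then
        return 0
    fi
    # haproxy is gone: restart keepalived and report failure to the caller.
    $RESTART_CMD >/dev/null 2>&1
    return 1
}
```

A dry run could look like `PIDOF_CMD="true" RESTART_CMD="echo restart"` followed by `check_and_restart; echo $?`, which takes the "haproxy is gone" path without touching any service. In a real track script the file would end with a bare `check_and_restart` call, so that the function's return value becomes the script's exit status.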
