0

I'm running 2 instances of HAProxy 1.6.9 with PCS on CentOS 7.

HAProxy on my main node suddenly failed during the weekend. A systemctl status of haproxy gave us a failed status. Restarting haproxy worked just fine and the service came back again.

I'm now trying to understand what went wrong and to make sure that it didn't happen again.

That's the result of journalctl:

[root@main-lb ~]# journalctl --utc -u haproxy --since yesterday
-- Logs begin at Thu 2017-07-06 05:39:42 GMT, end at Sun 2017-08-20 17:36:05 GMT. --
Aug 20 07:20:01 main-lb systemd[1]: Reloaded SYSV: HA-Proxy is a TCP/HTTP reverse proxy which is particularly suited for high availability environments..
Aug 20 07:20:01 main-lb systemd[1]: haproxy.service: main process exited, code=killed, status=9/KILL
Aug 20 07:20:01 main-lb haproxy[14935]: Shutting down haproxy: [FAILED]
Aug 20 07:20:01 main-lb systemd[1]: Unit haproxy.service entered failed state.
Aug 20 07:20:01 main-lb systemd[1]: haproxy.service failed.

I can see a Reloaded SYSV and I'm wondering why it has been triggered and also I'm guessing it's the part of the issue here.

maxime_039
  • 243
  • 5
  • 14
  • have you try to check your cluster log? – c4f4t0r Aug 20 '17 at 17:52
  • The cluster looks fine. I didn't saw any errors or warning from `/var/log/messages`. And it would not impact HAProxy anyway I guess? – maxime_039 Aug 20 '17 at 18:04
  • if you are using centos 7 with pacemaker, the cluster logs aren't stored in /var/log/messages – c4f4t0r Aug 20 '17 at 18:30
  • @c4f4t0r yes my mistake. I couldn't find anything specific in the log that could explain the issue. I got some warning/info/notice that are recurrent on my log but nothing unusual and nothing at 07:20:01 UTC. And I still don't see how the clustering could impact haproxy as it's separate services. – maxime_039 Aug 21 '17 at 08:31

1 Answers1

0

I was able to gather more logs at the time that the issue occurred and found the following:

Aug 20 07:20:01 main-lb anacron[14184]: Job `cron.daily' started
Aug 20 07:20:01 main-lb run-parts(/etc/cron.daily)[14912]: starting logrotate

So logrotate was started just before haproxy failed. And looking at my logrotate script for haproxy.log (/etc/logrotate.d/haproxy), I found the following:

/etc/init.d/haproxy reload > /dev/null

So haproxy was reloaded after the logrotation. However, it seems that there is a known issue with that, as explained on the following topic: https://discourse.haproxy.org/t/v1-6-10-soft-reload-not-working-under-centos-7-2/994

What I have done is removing the reload command from logrotate and restarting rsyslog.

maxime_039
  • 243
  • 5
  • 14