17

We've set up three servers:

  • Server A with Nginx + HAProxy to perform load balancing
  • backend server B
  • backend server C

Here is our /etc/haproxy/haproxy.cfg:

global
        log /dev/log   local0
        log 127.0.0.1   local1 notice
        maxconn 40096
        user haproxy
        group haproxy
        daemon

defaults
        log     global
        mode    http
        option  httplog
        option  dontlognull
        retries 3
        option redispatch
        maxconn 2000
        contimeout      50000
        clitimeout      50000
        srvtimeout      50000
        stats enable
        stats uri /lb?stats
        stats realm Haproxy\ Statistics
        stats auth admin:admin

listen statslb :5054 # choose different names for the 2 nodes
        mode http
        stats enable
        stats hide-version
        stats realm Haproxy\ Statistics
        stats uri /
        stats auth admin:admin

listen  Server-A 0.0.0.0:80    
        mode http
        balance roundrobin
        cookie JSESSIONID prefix
        option httpchk HEAD /check.txt HTTP/1.0
        server  Server-B <server.ip>:80 cookie app1inst2 check inter 1000 rise 2 fall 2
        server  Server-C <server.ip>:80 cookie app1inst2 check inter 1000 rise 2 fall 3

All three servers have plenty of RAM and CPU cores to handle requests.

Random HTTP 503 errors are shown when browsing: 503 Service Unavailable - No server is available to handle this request.

The following also appears on the server's console:

Message from syslogd@server-a at Dec 21 18:27:20 ...
 haproxy[1650]: proxy Server-A has no server available!

Note that 90% of the time there are no errors. The errors happen randomly.

slm
BnW

7 Answers

30

I had the same issue. After days of pulling my hair out, I found the cause.

I had two HAProxy instances running. One was a stale process (a "zombie" of sorts) that somehow never got killed, perhaps during an update or an HAProxy restart. I noticed this when refreshing the /haproxy stats page: the PID kept switching between two different numbers, and the page for one of those PIDs showed absurd connection stats. To confirm, I ran

sudo netstat -tulpn | grep ':80 '

Or

sudo lsof -i:80

and saw two haproxy processes listening on port 80.

To fix the issue, I ran "kill xxxx", where xxxx is the PID of the process with the suspicious statistics.

Quynh Nguyen
Matthew Jones
9

I'm adding my answer for anyone who encounters this same problem and finds that none of the solutions listed above apply. Please note that my answer does not apply to the original config listed above.

If you have this problem, check whether you have mistakenly put the same "bind" line in multiple sections of your config. HAProxy does not check for this during startup, and I plan to suggest it to the developers as a validation check. In my case, the config has three sections, and I had mistakenly put the same IP binding in two of them. Each request had roughly a 50/50 chance of being handled by the correct section or the incorrect one. Even when the correct section was used, about half of the requests still got a 503.

CyberInferno
  • You just saved my day! This was exactly my problem! :O Thanks a lot! – Thomas Vuillaume Apr 10 '18 at 08:56
  • This is probably one of the more common causes. It saved me more headache after 3 days of research. I was using Docker Flow Proxy, which bundles HAProxy, and the config it generated for services had duplicated binds that caused every other request to fail. I set the env var -e DEFAULT_PORTS=81,444 so that the default ports and the service ports were no longer duplicated, and everything works like a charm. Whew. – wholenewstrain Aug 31 '18 at 03:06
  • That's especially hard to find if you have some chaotic management around your infrastructure... I just discovered someone had messed with a configuration I had previously fixed. Thanks a lot for your comment. – Matheus Mohr Jan 08 '19 at 11:59
  • Making this mistake is easy if you're new to HAProxy, like I am. I would not have known that having multiple bind lines was wrong if I hadn't hit this problem and found this answer. – mhucka Feb 15 '21 at 21:32
1

It is possible that your servers share a common resource that times out at certain moments, and that your health checks hit both servers at the same time (thus pulling both backend servers out of rotation at once).

You can try the HAProxy option spread-checks to randomize the timing of health checks.
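
As a rough sketch of that suggestion: spread-checks takes a percentage from 0 to 50 and belongs in the global section. The value 5 below is an assumption chosen for illustration, not something taken from the question.

```
global
        # Add up to +/- 5% random jitter to each health-check interval so
        # that checks against Server-B and Server-C stop firing in lockstep.
        # The value 5 is illustrative only.
        spread-checks 5
```

With the question's "inter 1000", this would let each check interval drift by roughly 50 ms in either direction.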

Matt Beckman
1

I had the same issue, caused by two HAProxy services running on the same Linux box under different names/PIDs/resources. Until I stopped the unwanted one, the instance I needed threw 503 errors randomly, roughly 1 in 5 times.

I was trying to use a single Linux box to route multiple URLs, but this looks like a limitation either in HAProxy or in the config file I had defined.

0

Hard to say without more details, but is it possible you are exceeding the configured maxconn for each backend? The Stats UI shows these stats on both the frontend and on individual backends.
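
One way this can produce a 503, sketched under assumptions: when a per-server maxconn is set, requests beyond that limit wait in a queue, and a request that waits longer than "timeout queue" is answered with a 503. The config in the question sets maxconn only in the global and defaults sections, so the backend name, addresses, and numbers below are hypothetical.

```
backend app_servers                    # hypothetical backend name
        mode http
        balance roundrobin
        # Requests beyond a server's maxconn are queued, not rejected outright.
        # A request still queued after "timeout queue" is answered with a 503.
        timeout queue 5s               # illustrative value
        server app1 192.0.2.11:80 check maxconn 100   # illustrative limit
        server app2 192.0.2.12:80 check maxconn 100
```

If the stats page shows the current sessions of a server sitting at its limit while the queue counters grow, this is worth a closer look.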

0

I resolved my intermittent 503s with HAProxy by adding option http-server-close to the backend. It looks like uWSGI (the upstream) does not handle keep-alive well. I'm not sure what is really behind the problem, but since adding this option I haven't seen a single 503.
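
A minimal sketch of where that option goes, assuming a backend that fronts a uWSGI HTTP listener. The backend name, server name, and address are hypothetical.

```
backend uwsgi_app                      # hypothetical backend name
        mode http
        # Close the server-side connection after each response while still
        # allowing keep-alive on the client side.
        option http-server-close
        server uwsgi1 127.0.0.1:8000 check   # hypothetical address
```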

vvucetic
0

Don't use the same "bind" line in multiple sections of your haproxy.cfg. For example, this would be wrong:

frontend stats
        bind *:443 ssl crt /etc/ssl/certs/your.pem
frontend Main
        bind *:443 ssl crt /etc/ssl/certs/your.pem

Fix it like this, giving each section its own port:

frontend stats
        bind *:8443 ssl crt /etc/ssl/certs/your.pem
frontend Main
        bind *:443 ssl crt /etc/ssl/certs/your.pem

javidasd