
I'm not managing to get more than 11K requests per second per haproxy instance.

I have two haproxy instances on Amazon EC2, both on c4.xlarge instances. I tried configuring the maxconn parameter, the CPU mapping, and the Linux limits, without any luck.

I'm using JMeter for the tests. If I run two JMeter instances in parallel, each attacking one of the haproxy instances, I manage to get about 22K req/s in total, but if I run the same configuration with both attacking a single haproxy instance, the maximum throughput is 11K.

My haproxy configuration is:

global
    nbproc 4
    cpu-map 1 0
    cpu-map 2 1
    cpu-map 3 2
    cpu-map 4 3
    maxconn 150000
    log /dev/log    local0
    log /dev/log    local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

    # Default SSL material locations
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private

    # Default ciphers to use on SSL-enabled listening sockets.
    # For more information, see ciphers(1SSL). This list is from:
    #  https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
    # An alternative list with additional directives can be obtained from
    #  https://mozilla.github.io/server-side-tls/ssl-config-generator/?server=haproxy
    ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS
    ssl-default-bind-options no-sslv3

defaults
    log global
    mode    http
    option  httplog
    option  dontlognull
    option      http-server-close
    retries     2
    timeout     http-request    10s
    timeout     queue           1m
    timeout     connect         5s
    timeout     client          1m
    timeout     server          1m
    timeout     http-keep-alive 20s
    timeout     check           15s
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http



frontend DSP_FRONT
    bind *:80
    maxconn 300000
    default_backend DSP_BACK

backend DSP_BACK
    balance hdr(device)
    mode http
    server dsp1 172.31.3.141:80 check
    server dsp2 172.31.8.195:80 check
    server dsp3 172.31.8.186:80 check


listen stats 
    bind :9000
    mode http
    stats enable
    stats hide-version
    stats realm HAproxy-Statistics
    stats uri /haproxy_stats
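To double-check that these values are actually applied at runtime, the admin stats socket declared in the global section can be queried. A minimal sketch, assuming socat is installed (note that with nbproc 4 a single stats socket only talks to one of the four processes, so the figures reflect that process only):

# Example only: ask the running process for its effective limits and connection counts
echo "show info" | sudo socat stdio /run/haproxy/admin.sock | grep -E 'Nbproc|Process_num|Maxconn|CurrConns'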

The backends should be very fast and the responses are quite small (0.5-1 KB).

I also tried tweaking the system limits (sysctl):

fs.file-max = 10000000 
fs.nr_open = 10000000
net.ipv4.tcp_mem = 786432 1697152 1945728
net.ipv4.tcp_rmem = 4096 4096 16777216
net.ipv4.tcp_wmem = 4096 4096 16777216
net.ipv4.ip_local_port_range = 1000 65535
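For reference, a sketch of how these can be applied persistently (the drop-in file name below is just an example):

# Example only: persist the sysctl values in a drop-in file and reload them
cat <<'EOF' | sudo tee /etc/sysctl.d/90-haproxy-tuning.conf
fs.file-max = 10000000
fs.nr_open = 10000000
net.ipv4.tcp_mem = 786432 1697152 1945728
net.ipv4.tcp_rmem = 4096 4096 16777216
net.ipv4.tcp_wmem = 4096 4096 16777216
net.ipv4.ip_local_port_range = 1000 65535
EOF
sudo sysctl --system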

I also added the file limit to the haproxy systemd service:

LimitNOFILE=300000
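One way to set that without editing the packaged unit file is a systemd drop-in, roughly like this (the override path is the standard systemd location):

# Example only: raise the open-files limit for the haproxy unit via a drop-in
sudo mkdir -p /etc/systemd/system/haproxy.service.d
printf '[Service]\nLimitNOFILE=300000\n' | sudo tee /etc/systemd/system/haproxy.service.d/limits.conf
sudo systemctl daemon-reload
sudo systemctl restart haproxy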

But it seems to make no difference.

I'm running haproxy on Ubuntu 16.04.

Later edit:

Output of cat /proc/[haproxyprocid]/limits

ubuntu@ip-172-31-1-115:~$ ps ax| grep ha
 1214 ?        Ss     0:00 /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
 1217 ?        S      0:00 /usr/sbin/haproxy-master
 1218 ?        Ss     0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
 1219 ?        Ss     0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
 1220 ?        Ss     0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
 1221 ?        Ss     0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
 1393 pts/0    S+     0:00 grep --color=auto ha
ubuntu@ip-172-31-1-115:~$ cat /proc/1217/limits 
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             29852                29852                processes 
Max open files            300035               300035               files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       29852                29852                signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us  
Sinjuice
  • Just to be sure please post the output of `cat /proc//limits` – 13dimitar Jun 08 '17 at 15:08
  • which one? the master or any child? – Sinjuice Jun 08 '17 at 15:09
  • The master pid will do – 13dimitar Jun 08 '17 at 15:11
  • I've added it to the post. – Sinjuice Jun 08 '17 at 15:12
  • I guess I need to know why you need to surpass 11k RPS on a single instance. At Stack Overflow we're only doing 5k RPS over two haproxy instances. By the time you're doing 10k RPS you're probably looking at multiple LB's and ECMP (which you won't be doing in the cloud). – Mark Henderson Jun 08 '17 at 15:38
  • Well, right now we are receiving about 50K RPS on other servers (we are using ELB), and those servers will make a request to this new service for each request they receive, so it will need to handle that amount of requests per second. Also, I find 10K RPS per instance a little low, since CPU is only about 40-50% on the haproxy instance and I know that in TCP mode it can handle way more requests. I was expecting at least 30-50K per instance. – Sinjuice Jun 08 '17 at 15:51
  • @MarkHenderson Looks like it has to do with the instance capacity. I moved from a c4.xlarge to an m3.medium (which costs 3 times less) and achieved about 5.2K RPS. Looks like haproxy works better on a single CPU. – Sinjuice Jun 08 '17 at 16:18
  • This is just my experience, but I had to recompile haproxy to get optimal performance on VMs. In my case, I had to use `TARGET=custom CPU=generic`. Prior to that, my CPU usage was really high and performance was really low. What target/CPU profile was used in your build? You can see this with `haproxy -vv`. Haproxy can use any number of CPUs, but you have to map them in the config using `cpu-map` after setting `nbproc` (which you did). – Aaron Jun 08 '17 at 17:43
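For anyone wanting to try the rebuild Aaron describes, a rough sketch; the exact make flags are assumptions and depend on the HAProxy version and the features you need, so check the Makefile of your source tree:

# Example only: rebuild haproxy with a generic CPU profile, as suggested above
make clean
make TARGET=custom CPU=generic USE_EPOLL=1 USE_OPENSSL=1 USE_ZLIB=1 USE_PCRE=1
sudo make install
haproxy -vv    # shows the TARGET/CPU options the binary was built with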

1 Answer


You don't mention what the limiting factor is. First, you're running on shared VMs, so only the hosting provider knows whether they deliver you real CPU or not. Second, it's possible that you're maxing out the CPU if you're stressing SSL: 11K req/s could more or less match what to expect from resumed TLS connections on a moderate machine. In that case you'll see 100% CPU used by haproxy, mostly in userland (typically 60% user / 40% sys). If you're doing 11K RSA-2048 handshakes per second, then congratulations, as that's huge! If you're doing it on clear connections, it's low but could be entirely caused by the VM environment. If it's on keep-alive connections, it really is too low and could even be caused by huge network latency (also common with extremely overbooked VMs).

Willy Tarreau
  • Well, I know the VM is a limiting factor, but on an instance of the same type running Tomcat + Spring Boot I can process roughly the same number of requests per second (using JMeter attacking one backend directly) as haproxy can balance. I have to say Java is using >100% of the CPU, while I was expecting at least 100% from haproxy and I'm only getting 60%. Also, I'm not using SSL; as you can see, the frontend is on port 80. – Sinjuice Jun 09 '17 at 18:56
  • So what you've found is your VM's limit in terms of req/s, and that this limit is so low that it cannot even deliver enough work to haproxy to make it use more CPU. So in short, to achieve the same level of performance, you can remove some CPUs or use them to do extra work. – Willy Tarreau Jul 13 '17 at 09:34
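One way to see where the CPU time actually goes during a test run, per haproxy process and per core, is sketched below (assuming the sysstat package is installed, which provides pidstat and mpstat):

# Example only: sample user vs system CPU per haproxy process, once per second
pidstat -u -p "$(pgrep -d, -x haproxy)" 1 10

# Per-core view; with nbproc and cpu-map this shows whether a single core is saturated
mpstat -P ALL 1 5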