1

I'm currently testing haproxy to load balance our newsletter generation. We build personalized newsletters for our customers.

To do this we use two webservers (identical machines) and one "mail engine". The mail engine makes calls to the webservers, which then return a personalized HTML newsletter.

The problem is that one webserver has a CPU load of about 75% while the other is only running at 15%. Looking at "Session rate" while testing, both servers have "Session rate -> Cur" between 3 and 4 the whole time.

But under "Sessions", "Sessions -> Cur" shows a total of 10: web server 1 has a "Cur" of 8 and the other web server has between 0 and 2.

Why would there be 8 sessions on the first web server and 0-2 on the other?

Here is my config:

defaults
        log     global
        mode    http
        option  httplog
        option  dontlognull
        retries 3
        option redispatch
        maxconn 2000
        timeout connect 5000
        timeout client  50000
        timeout server  50000

listen mailgenerator 10.46.70.75:80
        mode http
        stats enable
        balance roundrobin
        option httpclose
        option forwardfor
        option httpchk HEAD /robots.txt
        server mail1 192.168.70.11:80 check weight 100
        server mail2 192.168.70.12:80 check weight 100
dmourati
Hans Hersbak

2 Answers

2

If you use round robin, connections are distributed evenly as they arrive, regardless of load. So if you have a slower server or a slow process, a queue can build up on one server while the other is free.

You can get a much more even distribution if you use leastconn balancing and set a low maxconn per server, so that connections queue in haproxy rather than on each server.
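
As a rough sketch of that suggestion applied to the listen section from the question: the balance algorithm is switched to leastconn and each server line gets a per-server maxconn. The value 4 is only an assumed starting point, not a recommendation; it would need tuning to however many concurrent requests one webserver handles comfortably.

```
listen mailgenerator 10.46.70.75:80
        mode http
        stats enable
        # send each new connection to the server with the fewest active connections
        balance leastconn
        option httpclose
        option forwardfor
        option httpchk HEAD /robots.txt
        # maxconn 4 is an assumed value; excess connections wait in haproxy's queue
        server mail1 192.168.70.11:80 check maxconn 4
        server mail2 192.168.70.12:80 check maxconn 4
```

With a per-server limit in place, a slow server simply drains its share of the queue more slowly, so the faster server picks up more of the waiting requests.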

JamesRyan
  • For short, stateless connections, using roundrobin and varying the server weights is still better than leastconn when you need to balance asymmetrically. Leastconn works better for long-lived TCP connections, where the number of current sessions is more relevant. – JimB Jun 17 '11 at 15:05
  • No, because when you get one stalled or slow connection it causes problems, as demonstrated in the Q. And with fast or few connections leastconn ends up alternating in the same way as roundrobin anyway. – JamesRyan Jun 17 '11 at 15:26
  • @JamesRyan - hmm, I was just restating a recommendation from the author of haproxy, but I do agree (and usually use leastconn myself when there's any variability in session time). I wonder if there really is a drawback to leastconn with a high session rate... – JimB Jun 17 '11 at 16:02
  • Another point to ponder: depending on the application, the number of connections does not necessarily correlate to load. In situations where session setup is expensive, you want to distribute new connections evenly, not based on sessions that haven't closed. – JimB Jun 17 '11 at 16:10
  • I think this would sort itself out though because on servers where session setup took longer the queue would grow quicker. – JamesRyan Jun 17 '11 at 16:24
  • To reply to my own comment, leastconn doesn't account for weight when balancing equally connected servers. It also can't account for session affinity (an unloaded server could have many persistent clients that aren't connected at that moment), and therefore may not distribute users evenly. – JimB Jun 17 '11 at 18:49
  • If you set a low maxconn your server weighting is not required because the speed that servers deal with their queue is already accounted for. You will get the affinity problem no matter which balancing method you use. – JamesRyan Jun 17 '11 at 20:34
  • @JamesRyan - under low load conditions (e.g. 0 or 1 connections), leastconn won't distribute load evenly, and often hits only 1 server. – JimB Jun 17 '11 at 20:36
  • As I said before the aim is not to work your servers evenly, it is to answer requests. If single connections are going to the same server all the time then it doesn't matter because 1 server can handle that low load all by itself! – JamesRyan Jun 17 '11 at 20:40
  • If you rely on roundrobin so that you can use sticky sessions you will find that in practice the distribution does not end up even at all. You may as well be using round robin DNS than add the complexity of a load balancer. Not only that but when a problem does occur some of your users are stuck to the slow server. It's far better to have some sort of shared session and balance properly. – JamesRyan Jun 17 '11 at 20:46
  • @JamesRyan- if you're trying to distribute client sessions, roundrobin is the best way to ensure even distribution. Since affinity is decided by first connection, you want to distribute those initial sessions throughout the backend pool. This isn't relevant though to the OP, since he's not using any session affinity. – JimB Jun 17 '11 at 20:46
  • In practice this is not the case. Because some sessions are long and others are short it is entirely random which server they build up on. With leastconn if one server is busier then more initial connections get sent to the quieter server, it tends towards a more even distribution. It would only work the way you suggest if people first visit while it is quiet and then all pile on at once, with most session timeouts being relatively short the chances of that are slim. – JamesRyan Jun 17 '11 at 20:53
  • @JamesRyan - I think we're just coming from different backgrounds here ;) I tend to use haproxy in complex, stateful scenarios (e.g. anything *but* plain http). Your recommendations aren't bad, there are just other use cases that fall into the exception. Without real numbers we're both talking out of our a$$es, but my experience has mirrored the recommendation of the haproxy author and documentation. – JimB Jun 17 '11 at 21:00
  • It seems to me that you are just quoting a single comment that happened to be the top of a google search. My experience of both balancing methods explains the OP's problem. – JamesRyan Jun 18 '11 at 12:08
  • really? the implied ad hominem is uncalled for, and your experience doesn't cover all cases. – JimB Jun 19 '11 at 00:30
  • Calling me an ass because you talked yourself into a hole was uncalled for. The obvious mistakes/assumptions you have made in your comments and your own answer to this question show that you could have learned something if you weren't so busy arguing. – JamesRyan Jun 19 '11 at 10:51
  • I didn't call you an ass, I said that we both shouldn't be making claims without real numbers. I now have a test cluster where one server gets about 50% more load when using leastconn over roundrobin. As I said before, roundrobin is the only way to evenly distribute long sessions requiring affinity. This of course isn't relevant to the OP, so I say we call a truce and end this, and probably take these comments off here, instead of polluting this question. – JimB Jun 19 '11 at 15:12
0

For http, the current sessions count doesn't mean much. The sessions Total and LbTot better represent how the servers are being balanced. If those numbers are fairly even, something on one server may be causing it to process its requests more slowly, therefore pushing up the load.

JimB
  • The point of balancing is not to ensure that your servers get an even amount of the work. Even distribution is only important under load and then is secondary to making sure requests are fulfilled. If one server is slow or stalled you WANT more requests to go to the other one. – JamesRyan Jun 17 '11 at 15:29
  • @JamesRyan - Agreed. The OP though said he has identical machines fielding identical requests. I was trying to show that the load balancing may still be even, and the skewed load caused by another problem (maybe my answer should have been a comment). You're trying to solve the problem (which is good), whereas I'm trying to diagnose the symptoms. – JimB Jun 17 '11 at 15:57