
I'm currently trying to set up a load balancer for a bunch of download mirrors. While reading up on the subject, I saw that nginx lends itself perfectly as a load balancer. Great! But when looking at the different configurations, I got kind of confused.

One can decide to either redirect or proxy to the back-end servers. Redirect is pretty clear: you tell the client to go somewhere else instead, the request is passed on and handled there, and the load balancer is out of the picture.

However, if you choose to use a proxy, doesn't that basically cripple the whole idea of having multiple download mirrors? After all, nginx will forward the request to the backend server, download the file, and pass it on to the client.

So to visualize how I think it works (stream of packets):

Redirect: Client => Load Balancer => Backend => Client

Proxy: Client => Load Balancer => Backend => Load Balancer => Client

Or will the proxy do some magic and tell the client to connect to the backend directly to download the file?

In case proxying indeed defeats the purpose of having multiple download mirrors in order to get more throughput, is redirect the only alternative?

EDIT: Or am I confusing the workings of a proxy with those of a rewrite? Does a proxy actually pass on the request like a redirect does, while still using the same URL?

  • The answer to the redirect question you linked will actually tell the client to download the file directly from box1X.example.com. – fholzer May 06 '15 at 08:11
  • So, I was correct that a proxy will only cripple the throughput? – Lennard Fonteijn May 06 '15 at 10:11
  • With the redirect scenario, no, it won't affect throughput. It does add overhead in the sense that the client makes two requests: 1) to nginx, which tells the client to go look for the file in a different place, and 2) the actual file download from box1X.example.com – fholzer May 06 '15 at 10:48

1 Answer


If you use nginx as a load balancer, the traffic flow will be:

Redirect:

Step 1 : Client => LB      (HTTP request) 
Step 2 : LB => Client      (HTTP reply)
Step 3 : Client => Backend (HTTP request)
Step 4 : Backend => Client (HTTP reply)
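
A redirect-based setup could be sketched roughly like this (the mirror hostnames and the `split_clients` distribution are made-up examples, not your actual setup):

```nginx
# Hypothetical sketch: hash the client address to pick a mirror,
# then send an HTTP redirect so the client fetches the file directly.
split_clients "${remote_addr}" $mirror {
    50%     box1.example.com;
    50%     box2.example.com;
}

server {
    listen 80;
    server_name download.example.com;

    location /files/ {
        # 302: the client re-requests the same URI from the chosen mirror,
        # and the load balancer is out of the picture for the transfer.
        return 302 "http://$mirror$request_uri";
    }
}
```
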

Proxy:

Step 1 : Client => LB      (HTTP request) 
Step 2 : LB => Backend     (HTTP request) 
Step 3 : Backend => LB     (HTTP reply) 
Step 4 : LB => Client      (HTTP reply) 
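
For comparison, a minimal proxy sketch, assuming the same hypothetical mirror hostnames; here every byte of the download flows back through nginx:

```nginx
# Hypothetical sketch: nginx relays the whole file to the client itself.
upstream mirrors {
    server box1.example.com;
    server box2.example.com;
}

server {
    listen 80;
    server_name download.example.com;

    location /files/ {
        proxy_pass http://mirrors;
        # Without this, large replies may be buffered to temporary files
        # on the load balancer before being streamed to the client.
        proxy_buffering off;
    }
}
```
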

So in the first case the load balancer acts, as you suspected, as a simple HTTP server, and the backends reply directly to the client. In the second case, every reply travels back to the client through nginx. Since nginx won't necessarily wait for the full reply body to be available before it starts transferring data to the client, it streams the response back using buffers or temporary files, depending on the configuration. You will, however, see higher packet round-trip times, since there is one more hop during the actual data transfer.

So that's the big picture for OSI layer-7 load balancing when HTTP is in use. Network load balancing is not limited to layer 7 and HTTP, though; there are other ways.

In particular, if you are looking for a way to spread traffic across backend servers hosting essentially static content, you can instead use keepalived as your load-balancing solution in Direct Routing mode. The backend servers then reply directly to the client, while requests still come in through the load balancer. This works at OSI layer 4, so it knows nothing about what you are doing above it: it simply mounts a virtual IP and pushes the TCP stream to the real servers, where the same virtual IP is mounted on the loopback interface. Keepalived also handles HA by using VRRP (a master/backup model).
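A keepalived Direct Routing setup could look roughly like this sketch (the virtual IP 192.0.2.10 and the real-server addresses are made-up examples; the backends additionally need the virtual IP on their loopback interface with ARP replies suppressed):

```
# Hypothetical keepalived sketch for LVS Direct Routing mode.
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    virtual_ipaddress {
        192.0.2.10
    }
}

virtual_server 192.0.2.10 80 {
    delay_loop 6
    lb_algo rr          # round-robin between real servers
    lb_kind DR          # Direct Routing: backends answer the client directly
    protocol TCP

    real_server 192.0.2.21 80 {
        TCP_CHECK {
            connect_timeout 3
        }
    }
    real_server 192.0.2.22 80 {
        TCP_CHECK {
            connect_timeout 3
        }
    }
}
```
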

If you absolutely want to stick with nginx, there's something "similar" called the stream module (it appeared in nginx 1.9.0, which is not a "stable" release), but you will need to recompile nginx yourself, and it won't avoid the hop on the way back, even though it also works at OSI layer 4.
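A minimal sketch of the stream module, again with hypothetical mirror hostnames; note that replies still pass back through nginx:

```nginx
# Hypothetical sketch: layer-4 TCP proxying with the stream module
# (nginx >= 1.9.0). Goes in the top-level configuration, not inside http {}.
stream {
    upstream mirrors {
        server box1.example.com:80;
        server box2.example.com:80;
    }

    server {
        listen 80;
        proxy_pass mirrors;
    }
}
```
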

Xavier Lucas
    Thank you for confirming my suspicion. Since I don't care how the URL looks, as long as the client actually downloads directly from the mirror with no middleman like a load balancer, I'm happy. The download mirrors are used by an application after communicating with an API. I could technically add some form of load balancing to the API and make it aware of all available mirrors. I could then do every trick in the book (round-robin, health checks, whatever) before dispatching the request. – Lennard Fonteijn May 06 '15 at 18:38