TCP Load balancing

Question

I need to implement an architecture that can serve 3 million GPS Devices sending Location Updates/Alerts (every 10 secs) to our system for processing.

Features: 1) TCP communication. 2) Long-lived connections (~12 hrs a day), which change only when the device's IP address changes. 3) GPRS communication. 4) Simple parsing of data and storage in the database.

Currently we have a basic system (Active-Passive) handling ~50K connections on a single server using the Netty framework for Java NIO.

I thought about increasing the number of server nodes, say one for every ~100k connections. The problem is that I can only have a few public IPs or a single hostname for these clients to connect to, so I need a proxy to manage the requests.
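As a rough sizing sketch using only the figures above (3 million devices, one update every 10 seconds, and my assumed ~100k connections per node), the arithmetic works out as follows. The per-node limit is an assumption, not a measured value.

```python
# Back-of-the-envelope sizing from the numbers in the question.
DEVICES = 3_000_000             # GPS devices holding one TCP connection each
UPDATE_INTERVAL_S = 10          # each device sends one update every 10 seconds
CONNECTIONS_PER_NODE = 100_000  # assumed capacity of one application node

# Aggregate message rate across the whole fleet.
updates_per_second = DEVICES // UPDATE_INTERVAL_S

# Ceiling division: number of nodes needed to hold all connections.
nodes_needed = -(-DEVICES // CONNECTIONS_PER_NODE)

# Message rate each node has to parse and store.
updates_per_node = updates_per_second // nodes_needed

print(updates_per_second, nodes_needed, updates_per_node)
```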

Can HAProxy manage a load of 3 million TCP sessions in some configuration, or do I need a hardware load balancer or a combination of both?

Also, is Netty a good choice, or can I handle more load per application server with some other framework/technology?

Can you rework the broken architecture that keeps a TCP connection alive when it does not need it? — TomTom, Nov 19 '14 at 10:46
@TomTom are you talking about states like TIME_WAIT, CLOSE_WAIT? — gladiator, Nov 19 '14 at 10:47
No. I am talking about keeping a TCP connection open for something every 3 seconds. This is the issue that actually creates the whole mess to start with. — TomTom, Nov 19 '14 at 10:48
@TomTom There is nothing broken about using long lived TCP connections. TCP is designed for connections to live potentially indefinitely. — kasperd, Nov 19 '14 at 10:55
How many TCP connections a solution can scale to depends more on the design of the solution than whether the specific implementation is done in hardware or software. I don't know the design details of the two specific solutions you have in mind, so I cannot say how far each can scale. But I do know how one could design a solution, that could scale to millions of TCP connections on a single IP without needing specialized load balancing hardware. — kasperd, Nov 19 '14 at 11:03
One IP can be anycast to a pool of machines handling incoming packets. This pool can share a distributed hash table mapping client IP/port to which backend was chosen (choice of backend is done by the machine handling the SYN packet). All incoming packets are tunneled by whatever machine receives it to the proper backend. Return traffic is sent directly from the backend to the client (DSR). — kasperd, Nov 19 '14 at 11:13
@kasperd can you provide an answer explaining this architecture "know how one could design a solution, that could scale to millions of TCP connections on a single IP without needing specialized load balancing hardware." thanks in advance. — gladiator, Nov 19 '14 at 11:26

score 0 · Answer 1 · answered Nov 19 '14 at 12:30

I can't speak for the specific designs of either of the two solutions you are considering. I can, however, explain a design that could be implemented in software and could scale to millions of TCP connections on a single IP if deployed on a sufficient number of standard machines.

First of all, the public IP address can be anycast to a pool of machines responsible for handling incoming packets.

In order for this to work, these machines must be able to communicate with each other, so each must also have a unicast IP address. Since those unicast addresses are not used to communicate with your clients, they can be RFC 1918 addresses or IPv6 addresses, which gives you enough addresses for communication between your machines.

When a packet is received, the client IP and port are looked up in a table on the machine that received the packet. If an entry is found, it indicates which backend is to handle the packet. The receiving machine will then encapsulate the packet in a tunnel to that backend.

If no entry was found, then the client IP and port are hashed in order to produce two (for redundancy) indexes into a distributed hash table. All the machines must use the same hash function in order for this to work.
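A minimal sketch of that hashing step could look like the following. The function name `dht_owners`, the choice of SHA-256, and the way the second index is derived are all assumptions made for illustration. Any deterministic hash shared by every machine in the pool would do.

```python
import hashlib
from typing import Tuple


def dht_owners(client_ip: str, client_port: int, pool_size: int) -> Tuple[int, int]:
    """Map a client IP/port to two distinct machine indexes in the pool.

    Hypothetical helper: every machine must run the same function so that
    they all agree on which two machines hold the entry for a given client.
    """
    if pool_size < 2:
        raise ValueError("need at least two machines for redundancy")

    key = f"{client_ip}:{client_port}".encode()
    digest = hashlib.sha256(key).digest()

    # First index comes straight from the hash.
    first = int.from_bytes(digest[:8], "big") % pool_size

    # An offset in 1..pool_size-1 guarantees the second index differs
    # from the first, so the entry really is stored on two machines.
    offset = 1 + int.from_bytes(digest[8:16], "big") % (pool_size - 1)
    second = (first + offset) % pool_size
    return first, second


owners = dht_owners("203.0.113.7", 40000, pool_size=8)
print(owners)
```

Because the result depends only on the client IP, the port, and the pool size, any machine that receives a packet for this client computes the same pair. Note that this simple modulo scheme reshuffles entries when the pool size changes; handling that is outside the scope of the sketch.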

If the packet was a SYN packet, then the receiving machine picks a backend, sends that information to the two machines chosen by the hash, and stores it in its own table. This happens in parallel with forwarding the packet to the backend.

If the packet was not a SYN packet, the packet is stored on the receiving machine while asking the two machines (in parallel) which backend should process it. Once the first reply comes back indicating where the packet is to be sent, the packet is forwarded to the backend, and the mapping to a backend is stored in the local table and sent to the other of the two machines picked by the hash.
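The SYN and non-SYN cases can be sketched as a small in-process simulation. Everything here is illustrative: `PoolMachine`, `owners`, `handle_packet` and the backend names are hypothetical, the two lookups are done one after the other instead of in parallel, and returning `None` for a flow nobody knows about is my assumption, since the description above does not cover that case.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class PoolMachine:
    index: int
    # Local table: (client_ip, client_port) -> backend name.
    table: dict = field(default_factory=dict)


def owners(flow, pool_size):
    """Two distinct pool indexes responsible for this flow (same on every machine)."""
    digest = hashlib.sha256(f"{flow[0]}:{flow[1]}".encode()).digest()
    first = int.from_bytes(digest[:8], "big") % pool_size
    offset = 1 + int.from_bytes(digest[8:16], "big") % (pool_size - 1)
    return first, (first + offset) % pool_size


def handle_packet(pool, receiver, flow, is_syn, pick_backend):
    """Return the backend the packet should be tunneled to, or None if unknown."""
    machine = pool[receiver]

    # Fast path: the receiving machine already knows this flow.
    backend = machine.table.get(flow)
    if backend is not None:
        return backend

    first, second = owners(flow, len(pool))

    if is_syn:
        # New connection: pick a backend, record it locally and on both owners.
        backend = pick_backend(flow)
        for m in (machine, pool[first], pool[second]):
            m.table[flow] = backend
        return backend

    # Existing connection: ask the two owners and use the first answer found.
    for idx in (first, second):
        backend = pool[idx].table.get(flow)
        if backend is not None:
            machine.table[flow] = backend              # cache locally
            other = second if idx == first else first
            pool[other].table[flow] = backend          # repair the other owner
            return backend
    return None  # assumption: unknown non-SYN flows are not forwarded


pool = [PoolMachine(i) for i in range(4)]
flow = ("203.0.113.7", 40000)

# The SYN arrives at machine 0, which picks a backend.
chosen = handle_packet(pool, 0, flow, True, lambda f: "backend-3")

# A later packet of the same connection arrives at machine 2 instead.
later = handle_packet(pool, 2, flow, False, lambda f: "backend-9")
print(chosen, later)
```

The second call ignores its `pick_backend` argument, because only the machine that handles the SYN chooses a backend. Every other machine learns the mapping from its own table or from the two owners.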

The backend must be configured such that the public IP is assigned to a dummy interface (Linux has a dummy network driver for this sort of purpose). Interfaces that are actually used to route packets through must have unicast addresses. This way the TCP stack on the backend will happily accept TCP connections to the public address, but it won't use it for connections that it initiated itself.

Replies from the backend are simply sent with the public IP as the source address, without going through the anycast pool (this approach is known as Direct Server Return, or DSR).
