You're really looking at two different requirements:
- Load balancing: expose a single network address for multiple Web (or other protocol) servers.
- Communicating state (the messages) between multiple servers.
The first requirement is straightforward: use a hardware or software load balancer, or use a single Apache web server in front of multiple Java servers.
The second requirement is the issue.
Let's think about a hypothetical chat server on a single system. When a message is received, the request is parsed and the new message is placed in memory for the recipient. You'll need to handle common situations: a user logging off in the middle of a session, for example. You'll also need to figure out how to pass received messages back to the users' browsers: the browser can poll ("send me all messages after #N for user X"), or the server may be able to push messages using one of several techniques. If you have a chat server running on top of a Web server, this should all be familiar.
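To make the single-server picture concrete, here's a minimal sketch of the in-memory mailbox with the "send me all messages after #N" polling call. All names are hypothetical, and a real server would bound mailbox growth:

```java
import java.util.*;
import java.util.concurrent.*;

// Minimal single-server mailbox sketch (hypothetical names).
// Each recipient gets an append-only message list; the browser polls
// with the last sequence number it has already seen.
public class ChatMailbox {
    // recipient -> ordered list of messages (index + 1 = sequence number)
    private final ConcurrentMap<String, List<String>> boxes = new ConcurrentHashMap<>();

    // Store a new message for a recipient; returns its sequence number.
    public synchronized int deliver(String recipient, String text) {
        List<String> box = boxes.computeIfAbsent(recipient, k -> new ArrayList<>());
        box.add(text);
        return box.size();
    }

    // "Send me all messages after #n for user X" -- the polling call.
    public synchronized List<String> messagesAfter(String recipient, int n) {
        List<String> box = boxes.getOrDefault(recipient, Collections.emptyList());
        return new ArrayList<>(box.subList(Math.min(n, box.size()), box.size()));
    }

    // One of the "common situations": the user logs off mid-session.
    public synchronized void logOff(String user) {
        boxes.remove(user);
    }
}
```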
The sticky part is: how do you do this over multiple machines? Off the top of my head, I can think of a couple of ways that will scale OK:
- Keep track of which server the recipient is on. Use another transport mechanism to send the message to that server so it can be shoved into memory as if the sender had been local. See "message queuing" or "enterprise service bus."
- Decouple message handling from communication: designate one or more servers to hold the active conversations. Have the server that receives a message forward it to one of those conversation servers; use a notification mechanism (good) or polling (not so good) to alert the recipient's server that there's a chat message waiting to be sent out. Special feature: use a distributed hash table to spread the message mailboxes across the pool of servers, so that if one or more servers fail, the DHT can automatically rebalance.
- Use broadcast: each server broadcasts to all other servers if the recipient is not local. Every server receives the notification; only the one with the recipient does anything with it.
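The first two approaches both need a way to answer "which server owns this user's mailbox?" A consistent-hash ring, the usual DHT building block, has the property mentioned above: when a server fails, only that server's share of mailboxes moves. This is a sketch with hypothetical names and a deliberately cheap hash:

```java
import java.util.*;

// Sketch of the DHT idea: map each user's mailbox to a server via a
// consistent-hash ring, so losing a server only remaps that server's
// share of users. Hypothetical names; real code would use a better hash.
public class MailboxRing {
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    public void addServer(String server) {
        // Multiple replicas per server smooth out the distribution.
        for (int replica = 0; replica < 100; replica++) {
            ring.put(hash(server + "#" + replica), server);
        }
    }

    public void removeServer(String server) {
        ring.values().removeIf(s -> s.equals(server));
    }

    // The server that owns this user's mailbox: the first ring entry at
    // or after the user's hash, wrapping around past the end.
    public String serverFor(String user) {
        Map.Entry<Integer, String> e = ring.ceilingEntry(hash(user));
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    private static int hash(String s) {
        return s.hashCode() * 0x9E3779B9; // cheap integer mix, sketch only
    }
}
```

Lookups cost O(log n) via `TreeMap.ceilingEntry`, and every server can run the same ring locally, so no central routing table is needed.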
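The broadcast approach trades bandwidth for simplicity: no routing state at all. A sketch of both sides, the sending server and the filter every server applies, with hypothetical names and a `Consumer` standing in for the real broadcast transport:

```java
import java.util.*;
import java.util.function.Consumer;

// Sketch of the broadcast approach (hypothetical names): if the recipient
// isn't local, hand the message to a broadcaster that reaches every server;
// each server ignores anything whose recipient it doesn't hold.
public class BroadcastChat {
    private final Set<String> localUsers;
    private final Consumer<String[]> broadcaster; // fans {recipient, text} out to all servers
    final List<String> delivered = new ArrayList<>(); // stand-in for the in-memory mailbox

    public BroadcastChat(Set<String> localUsers, Consumer<String[]> broadcaster) {
        this.localUsers = localUsers;
        this.broadcaster = broadcaster;
    }

    // Called on the server that received the message from the sender.
    public void send(String recipient, String text) {
        if (localUsers.contains(recipient)) {
            deliverLocally(recipient, text);
        } else {
            broadcaster.accept(new String[]{recipient, text});
        }
    }

    // Called on every server for every broadcast; only the server
    // holding the recipient does anything with it.
    public void onBroadcast(String recipient, String text) {
        if (localUsers.contains(recipient)) {
            deliverLocally(recipient, text);
        }
    }

    private void deliverLocally(String recipient, String text) {
        delivered.add(recipient + ": " + text);
    }
}
```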
The key here is that you can no longer make use of shared memory between multiple machines; you have to use some mechanism to move the message between servers. You're unlikely to use a general-purpose, relatively high-overhead protocol (like HTTP) for this; there are lots of more efficient tools, and you can implement it at several levels of abstraction: a shared-cache tool like Terracotta, a peer-to-peer network protocol like JXTA, an enterprise service bus like ActiveMQ, and so on. Depending on how much you want to put on the user's browser, you can even run some message queuing software directly on the client system -- then the notification that there's a new message can go directly to the user instead of to an intermediate mailbox.
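Whichever transport you pick, the receiving server's shape is the same: a consumer loop drains the transport and delivers into the local in-memory mailbox as if the sender had been local. Here a `BlockingQueue` stands in for the real bus (ActiveMQ, JXTA, etc.); everything else is a hypothetical sketch:

```java
import java.util.concurrent.*;

// Sketch of the per-server consumer loop. The BlockingQueue stands in for
// whatever inter-server transport you choose (JMS/ActiveMQ, JXTA, ...).
public class InboundPump implements Runnable {
    private final BlockingQueue<String[]> bus; // {recipient, text} pairs off the wire
    private final ConcurrentMap<String, BlockingQueue<String>> localBoxes =
            new ConcurrentHashMap<>();

    public InboundPump(BlockingQueue<String[]> bus) {
        this.bus = bus;
    }

    public void run() {
        try {
            while (true) {
                String[] msg = bus.take(); // block until a message arrives
                boxFor(msg[0]).add(msg[1]); // deliver as if the sender were local
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // shut down cleanly
        }
    }

    // The recipient's local in-memory mailbox.
    public BlockingQueue<String> boxFor(String user) {
        return localBoxes.computeIfAbsent(user, k -> new LinkedBlockingQueue<>());
    }
}
```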
The clear optimization is to move users with active conversations onto the same server, but that won't work with most load-balancing mechanisms. There ought to be some way to force affinity between a user and a particular server, but I can't think of an easy one.