Instant Messaging Server Design

Question

Let's suppose we have an instant messaging application, client-server based, not p2p. The actual protocol doesn't matter; what matters is the server architecture. The server can be coded to operate in single-threaded, non-parallel mode using non-blocking sockets, which by definition let us perform operations like read and write effectively immediately. This very feature of non-blocking sockets allows us to put some sort of select/poll function at the core of the server and waste next to no time in the actual socket read/write operations, spending the time on processing all this information instead. Properly coded, this can be very fast, as far as I understand. But there is a second approach, and that is to multithread aggressively: creating a new thread per client (obviously using some sort of thread pool, because thread creation can be very slow on some platforms and under some circumstances) and letting those threads work in parallel, while the main background thread handles accept() and stuff. I've seen this approach explained in various places over the Net, so it obviously does exist.

Now the question is: if we have non-blocking sockets, immediate read/write operations, and a simple, easily coded design, why does the second variant even exist? What problems are we trying to overcome with the second design, i.e. threads? AFAIK threads are usually used to work around slow and possibly blocking operations, but no such operations seem to be present here!
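
For reference, the single-threaded design described above can be sketched roughly as follows. This is a minimal Python illustration, not a production server: `register_client`, `run_once`, and `handle_message` are names made up for the example, and it assumes replies are small enough that a single `send()` always succeeds.

```python
import selectors
import socket

def register_client(sel, conn):
    """Put a connected socket into non-blocking mode and watch it for input."""
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, data="client")

def run_once(sel, handle_message, timeout=0.1):
    """One pass of the event loop: service every socket that is ready right now."""
    for key, _events in sel.select(timeout):
        sock = key.fileobj
        if key.data == "listener":          # a listening socket has a new connection
            conn, _addr = sock.accept()
            register_client(sel, conn)
            continue
        try:
            data = sock.recv(4096)          # ready socket, so this returns at once
        except BlockingIOError:
            continue                        # spurious wakeup; retry on the next pass
        if not data:                        # peer closed the connection
            sel.unregister(sock)
            sock.close()
            continue
        # All message processing runs here, on the one and only thread.
        # A real server would buffer unsent bytes and wait for EVENT_WRITE.
        sock.send(handle_message(data))
```

A server built this way would register its listening socket with `data="listener"` and call `run_once` in a loop. Note that every call to `handle_message` delays all other clients for as long as it runs.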

nos · Accepted Answer · 2011-01-13T19:40:46.937

I'm assuming you're not talking about having a thread per client, as such a design usually exists for completely different reasons, but rather about a pool of threads, each handling several concurrent clients.

The reason for that architecture vs. a single-threaded server is simply to take advantage of multiple processors. You're doing more work than just I/O. You have to parse the messages, do various work, maybe even run some more heavyweight crypto algorithms. All of this takes CPU. If you want to scale, taking advantage of multiple processors will allow you to scale even more, and/or keep the per-client latency even lower.

Some of the gain in such a design can be offset a bit by the fact that you might need more locking in a multithreaded environment, but done right, and certainly depending on what you're doing, it can be a huge win - at the expense of more complexity.

Also, this might help overcome OS limitations. The I/O paths in the kernel might get better distributed among the processors. Not all operating systems can fully thread the I/O of a single-threaded application. Back in the old days there weren't all the great alternatives to the old *nix select(), which usually had a file descriptor limit of 1024, and similar APIs started degrading severely once you told them to monitor too many sockets. Spreading all those clients across multiple threads or processes helped overcome that limit.
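
The pool-of-threads design, where each thread runs its own event loop over a subset of the clients, might look something like the sketch below. It is only an illustration of the structure: `Worker`, `assign`, and `handle_message` are hypothetical names, new sockets are handed over through a queue that each worker polls every 50 ms (a real server would more likely use a wake-up pipe), and in standard CPython builds the GIL keeps pure-Python handlers from actually running in parallel, so the CPU gain described above would need a language with true native threads or a pool of processes.

```python
import queue
import selectors
import socket
import threading

class Worker(threading.Thread):
    """One event loop per thread; each worker owns a disjoint set of clients."""

    def __init__(self, handle_message):
        super().__init__(daemon=True)
        self.handle_message = handle_message
        self.pending = queue.Queue()        # sockets handed over by the acceptor
        self.stop_flag = threading.Event()

    def run(self):
        sel = selectors.DefaultSelector()
        while not self.stop_flag.is_set():
            while True:                     # adopt newly assigned clients
                try:
                    conn = self.pending.get_nowait()
                except queue.Empty:
                    break
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ)
            for key, _events in sel.select(timeout=0.05):
                conn = key.fileobj
                try:
                    data = conn.recv(4096)
                except BlockingIOError:
                    continue
                if not data:                # peer closed the connection
                    sel.unregister(conn)
                    conn.close()
                    continue
                conn.send(self.handle_message(data))
        sel.close()

def assign(workers, client_index, conn):
    """Round-robin dispatch: the n-th accepted client goes to worker n mod N."""
    workers[client_index % len(workers)].pending.put(conn)
```

Because each worker watches only its share of the sockets, no single selector has to monitor every descriptor, which is the point made above about the old select() limit.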

As for a 1:1 mapping between threads and clients, there are several reasons to implement that architecture:

Easier programming model, which might lead to fewer hard-to-find bugs and a faster implementation.
Support for blocking APIs. These are all over the place. Having a thread handle many/all of the clients and then go on to make a blocking call to a database is going to stall everyone. Even reading files can block your application, and you usually can't monitor regular file handles/descriptors for I/O events - or when you can, the programming model is often exceptionally complicated.

The drawback here is that it won't scale, at least not with the most widely used languages/frameworks. Having thousands of native threads will hurt performance. Some languages do provide a much more lightweight approach here, though, such as Erlang and to some extent Go.
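
To make the "easier programming model" point concrete, a 1:1 thread-per-client handler can be as short as the sketch below. The names `handle_client`, `spawn_client_thread`, and `handle_message` are invented for the example; the sockets are ordinary blocking ones, and one native thread is created per connection, which is exactly what stops scaling at thousands of clients.

```python
import socket
import threading

def handle_client(conn, handle_message):
    """Serve a single client with plain blocking calls, top to bottom."""
    with conn:
        while True:
            data = conn.recv(4096)          # blocks, but only this client's thread
            if not data:                    # peer closed the connection
                break
            # handle_message may itself block (database, file I/O);
            # other clients keep running on their own threads.
            conn.sendall(handle_message(data))

def spawn_client_thread(conn, handle_message):
    """Start one dedicated thread for a newly accepted connection."""
    thread = threading.Thread(
        target=handle_client, args=(conn, handle_message), daemon=True
    )
    thread.start()
    return thread
```

An accept loop would simply call `spawn_client_thread` for every connection it accepts.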

Actually, I was referring to EXACTLY that 1:1 client:thread mapping design. The hybrid design you mentioned does make sense, and I actually am familiar with it as well. Could you please elaborate a bit more on the 1:1 sort of design? — iksemyonov, Jan 13 '11 at 18:30
Thanks!!! I might only add "Win32 fibers" here, no? That's the only lightweight (as in really lightweight) thread API out there that I know of. But in general, yes, what you say makes sense. Just how do you avoid blocking database calls (the most obvious thing that comes to mind in the context of this discussion, and of threads in particular) in the single-thread model? — iksemyonov, Jan 13 '11 at 19:53
Fibers might be lightweight, but they don't save you if you want to make a blocking call. Avoiding blocking calls is usually practically impossible, so that's why the 1:1 model is so common. Some DB APIs do provide non-blocking calls, most don't. In the latter case you can implement an async pattern: schedule the blocking call on a thread (or a pool of threads) and get notified when it completes. This means you'll map many clients onto fewer threads handling the blocking calls; whether that works well depends on the rate of blocking calls and the average time each takes. — nos, Jan 13 '11 at 21:58
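
The async pattern from the last comment, where a single-threaded event loop hands blocking calls to a small thread pool and is notified on completion, could be sketched like this in Python. `blocking_db_query` is a hypothetical stand-in for a database call that only offers a blocking API (here it just sleeps), and `handle_request`, `main`, and the pool size of 4 are likewise assumptions made for the illustration.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

def blocking_db_query(user):
    """Hypothetical blocking database call; the sleep simulates the wait."""
    time.sleep(0.2)
    return f"profile-of-{user}"

async def handle_request(pool, user):
    loop = asyncio.get_running_loop()
    # The blocking call runs on a pool thread. The event loop stays free to
    # serve other clients and resumes this coroutine when the call completes.
    return await loop.run_in_executor(pool, blocking_db_query, user)

async def main(users, pool_size=4):
    # Many clients are mapped onto a few threads that do the blocking work.
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        return await asyncio.gather(*(handle_request(pool, u) for u in users))
```

Running `asyncio.run(main(["ann", "bob"]))` returns the results in request order. If requests arrive faster than `pool_size` threads can finish them, they queue up, which is the dependence on call rate and duration mentioned in the comment.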
