2

I'm working on a project using the TCP protocol that may have to handle several hundred or more connections at once.

As such, I am uncertain what method I should use to collect and send this data.

I was wondering whether the principle of "more threads = more performance" applies here.

My reason for doubt is that all data still has to be fed through the network connection, of which most devices only have one active at a time. In addition, I know that repeated context switching can reduce performance as well.

However, I've seen other sources suggest that multithreading does indeed scale network performance up to a point, and if that's true, why?

Currently, I'm using the Non-Boost variant of ASIO to handle networking.

Thanks in advance for any assistance.

Icedude_907
  • 172
  • 3
  • 11
  • 1
    Whether or not using multiple threads (or processes) speeds things up is entirely determined by whether the overhead of creating the thread is less than the work the thread needs to do, and whether or not that thread needs to synchronise with other threads regularly. Threads can be amazing *sometimes* but they are *not* a silver bullet and in some situations they slow you down. – Jesper Juhl Jul 29 '20 at 20:37
  • If you're CPU-bound you'll need to use threads, but otherwise you can get fantastic performance using a non-blocking library. – tadman Jul 29 '20 at 20:44
  • A bit dated, but still worth reading on this topic: http://www.kegel.com/c10k.html – Jeremy Friesner Jul 29 '20 at 21:07

3 Answers

2

ASIO is a wrapper around epoll/IOCP, and as such is optimized for high-performance non-blocking I/O. It's possible to achieve hundreds of thousands of simultaneous connections with this setup on a single thread. Indeed, the old-fashioned "a thread per client" setup could never reach this level of performance due to the context switching overhead.
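For illustration, a minimal single-threaded asynchronous echo server in non-Boost ASIO might look something like the sketch below (the port number and buffer size are arbitrary choices). One io_context, one thread, yet it can multiplex a large number of sockets because nothing ever blocks:

    #include <asio.hpp>
    #include <array>
    #include <memory>

    using asio::ip::tcp;

    // Each connection owns its socket and buffer; lifetime is kept alive by the
    // shared_ptr captured in the completion handlers.
    struct session : std::enable_shared_from_this<session> {
        explicit session(tcp::socket s) : socket(std::move(s)) {}

        void start() { do_read(); }

        void do_read() {
            auto self = shared_from_this();
            socket.async_read_some(asio::buffer(data),
                [this, self](std::error_code ec, std::size_t n) {
                    if (!ec) do_write(n);   // echo back what was received
                });
        }

        void do_write(std::size_t n) {
            auto self = shared_from_this();
            asio::async_write(socket, asio::buffer(data, n),
                [this, self](std::error_code ec, std::size_t) {
                    if (!ec) do_read();     // wait for the next chunk
                });
        }

        tcp::socket socket;
        std::array<char, 4096> data;
    };

    void do_accept(tcp::acceptor& acceptor) {
        acceptor.async_accept([&acceptor](std::error_code ec, tcp::socket s) {
            if (!ec) std::make_shared<session>(std::move(s))->start();
            do_accept(acceptor);            // keep accepting new clients
        });
    }

    int main() {
        asio::io_context io;
        tcp::acceptor acceptor(io, tcp::endpoint(tcp::v4(), 9000));
        do_accept(acceptor);
        io.run();                           // a single thread drives every connection
    }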

With that said, depending on the protocol used, handling network requests and replies takes some CPU time, and on a high-rate network it might saturate the single CPU core on which the io_service is running. In that case it is possible to parallelize the io_service so that completion routines can run on more than one core. Still, no context switching will take place as long as the number of threads doesn't exceed the number of available CPU cores/hardware threads. Context switching occurs when the same core needs to handle multiple threads, and also when switching between user and kernel mode (i.e. twice for each system call).
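To give an idea, parallelizing is mostly a matter of calling run() from several threads. The sketch below (not production code) uses hardware_concurrency() as the thread count; per-connection handlers would be bound to a strand so they don't race each other:

    #include <asio.hpp>
    #include <algorithm>
    #include <thread>
    #include <vector>

    int main() {
        asio::io_context io;                    // io_service in older ASIO versions

        // Set up acceptors/sockets here. Bind each connection's handlers to
        // asio::make_strand(io) so they never run concurrently with each other
        // even though several threads call io.run().

        auto guard = asio::make_work_guard(io); // keep run() from returning early

        unsigned n = std::max(1u, std::thread::hardware_concurrency());
        std::vector<std::thread> workers;
        for (unsigned i = 0; i < n; ++i)
            workers.emplace_back([&io] { io.run(); }); // handlers may run on any of these threads

        for (auto& t : workers) t.join();
    }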

Benchmark your server to see how many clients it can handle on a single thread. Chances are it will be enough. Parallelizing io_service comes at a cost of having to deal with completion routines running in parallel, which almost always requires additional synchronization, which means additional overhead.

rustyx
  • 80,671
  • 25
  • 200
  • 267
  • Even if a single core was saturated, would going multicore make any difference? After all, wouldn't the OS still have to feed the data to the network card using only 1 thread? – Icedude_907 Jul 30 '20 at 19:51
  • Yes if a single core is saturated, then going multicore will help. When the CPU can't handle incoming packets in time, initially the packets are buffered, but once the buffers are full, new packets will be dropped. To the clients it will appear as "connection reset" and "connection timeout" for new connections and retransmissions and general slowness for existing connections. But again, in most cases a single core is enough. Computation-intensive tasks should be delegated to a helper thread pool, so that completion handlers are fast and lightweight and the io_service thread remains responsive. – rustyx Jul 30 '20 at 20:21
0

You want about the same number of threads as you have CPU cores, including hyper-threaded ones. Not more.

Each thread deals with a subset of the sockets. That way, you maximize CPU parallelism, but minimize overhead.
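For example, something like the "io_context per core" pattern sketched below would do this with non-Boost ASIO (a rough sketch; how accepted sockets get distributed across contexts is only hinted at in a comment):

    #include <asio.hpp>
    #include <algorithm>
    #include <memory>
    #include <thread>
    #include <vector>

    int main() {
        unsigned cores = std::max(1u, std::thread::hardware_concurrency());

        // One io_context (and later one thread) per core; each owns its own
        // sockets, so handlers on different threads never need to synchronize.
        std::vector<std::unique_ptr<asio::io_context>> contexts;
        std::vector<asio::executor_work_guard<asio::io_context::executor_type>> guards;
        for (unsigned i = 0; i < cores; ++i) {
            contexts.push_back(std::make_unique<asio::io_context>());
            guards.push_back(asio::make_work_guard(*contexts.back()));
        }

        // Accepted sockets would be handed out round-robin, e.g. to
        // *contexts[next++ % cores], and all their handlers stay on that context.

        std::vector<std::thread> threads;
        for (auto& ctx : contexts)
            threads.emplace_back([&ctx] { ctx->run(); });
        for (auto& t : threads) t.join();
    }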

If you truly need hundreds of connections and require low latency, you should consider UDP, where a single socket can receive from many remote addresses. But you then have to implement reliability yourself. Still, that's how multiplayer AAA game servers are typically run, and there are good reasons for it.
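To illustrate the single-socket point (port and buffer size are arbitrary), a UDP receive loop sees the sender's endpoint with every datagram, so one socket serves any number of peers; reliability and ordering would have to be layered on top by the application:

    #include <asio.hpp>
    #include <array>
    #include <functional>

    using asio::ip::udp;

    int main() {
        asio::io_context io;
        udp::socket socket(io, udp::endpoint(udp::v4(), 9001));

        std::array<char, 1500> buf;
        udp::endpoint sender;                   // filled in for each datagram

        // Re-arm the receive after every datagram; `sender` tells us which
        // remote peer it came from, so we can dispatch by address.
        std::function<void()> receive = [&] {
            socket.async_receive_from(asio::buffer(buf), sender,
                [&](std::error_code ec, std::size_t n) {
                    if (!ec) {
                        // handle n bytes from `sender` here
                    }
                    receive();
                });
        };
        receive();
        io.run();
    }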

Jeffrey
  • 11,063
  • 1
  • 21
  • 42
  • 2
    I don't think the choice of UDP vs TCP should be based on how many connections you are about to handle, but on whether you are OK with an unreliable transport. Web servers can handle thousands of TCP connections with no problems. – Pablo Yaggi Jul 29 '20 at 20:49
  • I never worked on web servers, so I wouldn't know. But I worked on tens of AAA game servers and I would never use TCP for low-latency high-connection-count servers. Your mileage will vary, obviously. – Jeffrey Jul 29 '20 at 20:54
  • 1
    The main consideration for a TCP-vs-UDP decision would be "how do I want the system to handle a dropped packet?" If you want the system to automatically retransmit the dropped packet so that the payload bytes still get presented to the receiving application in the same order they were sent, TCP is the thing to use. OTOH if you don't want to wait around for packet-retransmits and would rather just receive whatever data makes it through, verbatim (and are willing to accept missing/out-of-order/duplicate packets as the price of getting lower latency), then UDP is a good option for that. – Jeremy Friesner Jul 29 '20 at 21:11
-2

Multithreading vs. single-threading is a hard topic, and I think it all depends on how your implementation is structured.
If you have a good event-driven system on one thread, then using a single thread for the low-level network I/O will probably be better. Spawning threads carries a performance penalty of its own, since the system has to service them. Of course, threads are helpful for using the extra processors, but as you said, once you get down to the low level all threads will need some kind of synchronization, which is another penalty, unless you are using one socket per thread.

One major drawback of multithreading (one socket per thread) on networks is that most of the time your system will be vulnerable to "slowloris" attacks (see the Wikipedia article on Slowloris and the Computerphile video on it).

So I think you are better off using multithreading for other long-waiting or time-consuming tasks. Of course, you should use non-blocking I/O.
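As a rough sketch of that split (the names and pool size here are just illustrative), a completion handler can hand the heavy part to a thread pool and post the reply back onto the I/O thread when it is done:

    #include <asio.hpp>
    #include <cstddef>
    #include <thread>
    #include <vector>

    // Keep network I/O event-driven on the io_context thread and push
    // long-running work onto a separate pool so the I/O thread stays responsive.
    void on_request_received(asio::io_context& io, asio::thread_pool& workers,
                             std::vector<char> payload) {
        asio::post(workers, [&io, payload = std::move(payload)] {
            // Expensive parsing / computation happens here, off the I/O thread.
            std::size_t result = payload.size();   // placeholder for real work

            // Marshal the result back onto the I/O thread, e.g. to async_write a reply.
            asio::post(io, [result] { (void)result; });
        });
    }

    int main() {
        asio::io_context io;                    // runs the non-blocking network I/O
        asio::thread_pool workers(4);           // handles the CPU-heavy tasks

        auto guard = asio::make_work_guard(io); // keep io.run() alive for the demo
        std::thread io_thread([&io] { io.run(); });

        // In real code this would be called from a read handler.
        on_request_received(io, workers, std::vector<char>(1024, 'x'));

        workers.join();                         // wait for the offloaded work to finish
        guard.reset();                          // let io.run() return once its queue drains
        io_thread.join();
    }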

Pablo Yaggi
  • 1,061
  • 5
  • 14