I'm building a TCP-based daemon for pre-/post-processing of HTTP requests. Clients will connect to Apache HTTPD (or IIS), and a custom Apache/IIS module will forward requests to my TCP daemon for further processing. My daemon will need to scale up (but not out) to handle significant traffic, and most requests will be small and short-lived. The daemon will be built in C++, and must be cross-platform.
I'm currently looking at the Boost.Asio library, which seems like a natural fit. However, I'm having trouble understanding the merits of the stackless coroutine pattern vs. the thread pool pattern. Specifically, I'm looking at HTTP server example #3 and HTTP server example #4 here: http://www.boost.org/doc/libs/1_49_0/doc/html/boost_asio/examples.html
Despite all of my googling, I'm unable to fully grasp the merits of the stackless coroutine server, or how it would perform relative to the thread pool server on a multi-core system.
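For reference, here's my current mental model of what the stackless coroutine boils down to once the reenter/yield macros are unwrapped: each connection is just a hand-written state machine that gets resumed by completion handlers on the one thread running io_service::run(). This is my own simplified echo-server sketch, not the code from example #4 (class names, buffer size, and the port are all mine, and it's untested), so please correct me if I've got the idea wrong:

```cpp
#include <boost/array.hpp>
#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <boost/enable_shared_from_this.hpp>
#include <boost/shared_ptr.hpp>

using boost::asio::ip::tcp;

// Each connection is a little state machine. asio calls operator() whenever
// an async operation finishes, and the switch jumps back to where we "left
// off". No per-connection stack, no extra threads -- just saved state.
class connection : public boost::enable_shared_from_this<connection>
{
public:
  explicit connection(boost::asio::io_service& io)
    : socket_(io), state_(0) {}

  tcp::socket& socket() { return socket_; }

  void operator()(boost::system::error_code ec = boost::system::error_code(),
                  std::size_t bytes = 0)
  {
    if (ec) return; // on any error, just let the connection die

    switch (state_)
    {
    case 0: // "yield": start a read, return to the event loop immediately
      state_ = 1;
      socket_.async_read_some(boost::asio::buffer(buffer_),
          boost::bind(&connection::operator(), shared_from_this(), _1, _2));
      return;

    case 1: // resumed when the read completes; "yield" again on the write
      state_ = 2;
      boost::asio::async_write(socket_, boost::asio::buffer(buffer_, bytes),
          boost::bind(&connection::operator(), shared_from_this(), _1, _2));
      return;

    case 2: // write done; object goes away with the last shared_ptr
      return;
    }
  }

private:
  tcp::socket socket_;
  int state_;
  boost::array<char, 4096> buffer_;
};

void start_accept(tcp::acceptor& acceptor);

void handle_accept(tcp::acceptor& acceptor, boost::shared_ptr<connection> conn,
                   const boost::system::error_code& ec)
{
  if (!ec)
    (*conn)();            // kick off this connection's state machine
  start_accept(acceptor); // and keep accepting
}

void start_accept(tcp::acceptor& acceptor)
{
  boost::shared_ptr<connection> conn(
      new connection(acceptor.get_io_service()));
  acceptor.async_accept(conn->socket(),
      boost::bind(handle_accept, boost::ref(acceptor), conn, _1));
}

int main()
{
  boost::asio::io_service io;
  tcp::acceptor acceptor(io, tcp::endpoint(tcp::v4(), 12345)); // arbitrary port
  start_accept(acceptor);
  io.run(); // one thread runs every handler, one after another
  return 0;
}
```

If that's roughly right, I can see that it avoids per-connection threads and stacks, but I still don't see where the multi-core win would come from compared with example #3's thread pool.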
Which of the two is most appropriate given my requirements, and why? Please feel free to 'dumb down' your answers regarding the stackless coroutine idea; I'm still on shaky ground here. Thanks!
Edit: Another random thought/concern for discussion: Boost HTTP server example #4 is described as "a single-threaded HTTP server implemented using stackless coroutines". OK, so it's entirely single-threaded (right? even after the parent process 'forks' to a child? see server.cpp in example #4). Will that single thread become a bottleneck on a multi-core system? I'm assuming that any blocking operation will prevent all other requests from executing. If that's indeed the case, then to maximize throughput I'm thinking of combining an async, coroutine-style receive, a thread pool for my internal blocking tasks (to leverage multiple cores), and then an async send-and-close step (rough sketch below). Again, scalability is critical. Any thoughts?
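To make that concrete, here's the kind of hybrid I have in mind: one thread runs the network io_service and does all the async reads/writes, while a second io_service backed by a pool of worker threads handles the blocking processing, posting back to the network thread to send the reply and close. All the names (session, process, send_reply, the port, the fake "processing") are placeholders of mine and this is untested, so please treat it as pseudocode for the architecture rather than anything polished:

```cpp
#include <boost/array.hpp>
#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <boost/enable_shared_from_this.hpp>
#include <boost/shared_ptr.hpp>
#include <boost/thread.hpp>
#include <string>

using boost::asio::ip::tcp;

// One session per connection: read on the network thread, process on a
// worker thread, then hop back to the network thread to send and close.
class session : public boost::enable_shared_from_this<session>
{
public:
  session(boost::asio::io_service& net_io, boost::asio::io_service& workers)
    : socket_(net_io), net_io_(net_io), workers_(workers) {}

  tcp::socket& socket() { return socket_; }

  void start() // runs on the single network thread
  {
    socket_.async_read_some(boost::asio::buffer(buffer_),
        boost::bind(&session::handle_read, shared_from_this(), _1, _2));
  }

private:
  void handle_read(const boost::system::error_code& ec, std::size_t bytes)
  {
    if (ec) return;
    // Hand the blocking work to the pool so the network thread stays free.
    workers_.post(boost::bind(&session::process, shared_from_this(),
                              std::string(buffer_.data(), bytes)));
  }

  void process(std::string request) // runs on a worker thread
  {
    std::string reply = "processed: " + request; // pretend this blocks a while
    // Hop back to the network thread before touching the socket again.
    net_io_.post(boost::bind(&session::send_reply, shared_from_this(), reply));
  }

  void send_reply(const std::string& reply) // back on the network thread
  {
    boost::shared_ptr<std::string> msg(new std::string(reply));
    boost::asio::async_write(socket_, boost::asio::buffer(*msg),
        boost::bind(&session::handle_write, shared_from_this(), msg, _1));
  }

  void handle_write(boost::shared_ptr<std::string>,
                    const boost::system::error_code&)
  {
    boost::system::error_code ignored;
    socket_.close(ignored); // async send, then close the connection
  }

  tcp::socket socket_;
  boost::asio::io_service& net_io_;
  boost::asio::io_service& workers_;
  boost::array<char, 4096> buffer_;
};

void start_accept(tcp::acceptor& acceptor, boost::asio::io_service& workers);

void handle_accept(tcp::acceptor& acceptor, boost::asio::io_service& workers,
    boost::shared_ptr<session> s, const boost::system::error_code& ec)
{
  if (!ec) s->start();
  start_accept(acceptor, workers); // keep accepting
}

void start_accept(tcp::acceptor& acceptor, boost::asio::io_service& workers)
{
  boost::shared_ptr<session> s(new session(acceptor.get_io_service(), workers));
  acceptor.async_accept(s->socket(),
      boost::bind(handle_accept, boost::ref(acceptor), boost::ref(workers), s, _1));
}

int main()
{
  boost::asio::io_service net_io;    // run by the main thread only
  boost::asio::io_service worker_io; // run by the worker pool below
  boost::asio::io_service::work keep_alive(worker_io);

  boost::thread_group pool;
  for (unsigned i = 0; i < boost::thread::hardware_concurrency(); ++i)
    pool.create_thread(boost::bind(&boost::asio::io_service::run, &worker_io));

  tcp::acceptor acceptor(net_io, tcp::endpoint(tcp::v4(), 12345)); // arbitrary port
  start_accept(acceptor, worker_io);
  net_io.run();

  worker_io.stop();
  pool.join_all();
  return 0;
}
```

The part I'm least sure about is whether the io_service::post() hop between the two io_services is the right way to get back onto the network thread, or whether I should instead run a single io_service on multiple threads and use a strand per connection.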