I'm implementing an API. The front-end will likely be REST/HTTP, back-end MSSQL, with a lightweight middle-tier in between. Probably IIS hosted.

Each incoming request will have a non-unique Id attached. Requests that share the same Id must be processed serially and in FIFO order, whereas requests with different Ids can (and should) be processed concurrently for performance and efficiency.

  1. Assume that when clients call my API, a new thread is created to process the request (a thread-per-call model).
  2. Assume that every request is roughly the same size and involves roughly the same amount of computational work.

Current Design

My current design and implementation is very simple and straightforward. A Monitor is created on a per-Id basis, and all requests are processed through a critical section of code that enforces serialisation. In other words, the Monitor for Id X will block all threads carrying a request with Id X until the current thread carrying Id X has completed its work.
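
In simplified form, the idea looks something like this (the class and method names are just illustrative, and cleanup of old gate entries is omitted):

    using System;
    using System.Collections.Concurrent;

    public class SerialisingDispatcher
    {
        // One gate object per Id. Entries are never evicted in this
        // sketch; the real code would need some cleanup strategy.
        private readonly ConcurrentDictionary<string, object> _gates =
            new ConcurrentDictionary<string, object>();

        // Called on the request's own thread (thread-per-call model).
        public TResponse Dispatch<TResponse>(string id, Func<TResponse> work)
        {
            object gate = _gates.GetOrAdd(id, _ => new object());

            // Threads carrying the same Id serialise here; threads with
            // different Ids take different gates and run concurrently.
            lock (gate)
            {
                return work();
            }
        }
    }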

One advantage of the current design is its simplicity (it was easy to implement). Another advantage is that there is no cost of handing data off between threads: each thread carries its request all the way from initial arrival at the API through to sending the response back to the client.

One possible disadvantage is that where many requests arrive sharing the same Id, there will be a lot of threads blocking (and, ultimately, unblocking again). Another possible disadvantage is that this design does not lend itself easily to asynchronous code, so any potential scalability gains would likely have to be realised in other ways.

Alternate Design

Another design might consist of a more complex arrangement (sketched in code after the list):

  • Create a dedicated BlockingCollection for each Id encountered
  • Create a single, dedicated, long-running consumer thread for each BlockingCollection
  • Each thread that processes a request acts as a producer by enqueuing the request to the relevant BlockingCollection
  • The producing thread then waits (async style) until a response is ready to be collected
  • Each consumer thread processes items from its BlockingCollection in a serialised (FIFO) manner, and signals the awaiting thread once the response is ready
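
A rough sketch of that arrangement (again with illustrative names; queue shutdown and termination of idle consumers are omitted):

    using System;
    using System.Collections.Concurrent;
    using System.Threading;
    using System.Threading.Tasks;

    public class QueueingDispatcher
    {
        private sealed class WorkItem
        {
            public Func<object> Work;
            public TaskCompletionSource<object> Completion;
        }

        // One queue, and one dedicated consumer thread, per Id. Lazy<>
        // ensures a racing GetOrAdd cannot spawn two consumers for one Id.
        private readonly ConcurrentDictionary<string, Lazy<BlockingCollection<WorkItem>>> _queues =
            new ConcurrentDictionary<string, Lazy<BlockingCollection<WorkItem>>>();

        public Task<object> DispatchAsync(string id, Func<object> work)
        {
            var item = new WorkItem
            {
                Work = work,
                Completion = new TaskCompletionSource<object>(
                    TaskCreationOptions.RunContinuationsAsynchronously)
            };

            var queue = _queues.GetOrAdd(id, _ => new Lazy<BlockingCollection<WorkItem>>(() =>
            {
                var q = new BlockingCollection<WorkItem>();

                // Dedicated long-running consumer for this Id: items are
                // taken and completed strictly one at a time, in FIFO order.
                var consumer = new Thread(() =>
                {
                    foreach (var wi in q.GetConsumingEnumerable())
                    {
                        try { wi.Completion.SetResult(wi.Work()); }
                        catch (Exception ex) { wi.Completion.SetException(ex); }
                    }
                });
                consumer.IsBackground = true;
                consumer.Start();
                return q;
            })).Value;

            queue.Add(item);             // produce
            return item.Completion.Task; // the producing thread awaits this
        }
    }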

A disadvantage of this design is its complexity.

A second disadvantage is the overhead of handing data between threads (at least twice per request).

However, I think that on a reasonably busy system, where lots of requests are coming in, this design might be better at reducing the number of blocked threads, and possibly fewer threads would be required overall.

It also lends itself better to asynchronous code than the original design, which might make it scale better.

Questions

Given the effort and complexity of a re-implementation using the Alternate Design, is it likely to be worthwhile? (At the moment I am leaning towards "no" and sticking with my current design, but any views or general thoughts would be much appreciated.)

If there is no straightforward answer to the above question, then what are the key considerations that I need to factor into my decision?

  • You might be able to make your current design async without much effort by using various async constructs that are very similar to their analogous thread synchronization constructs (a rough sketch appears below the comments). See: https://github.com/StephenCleary/AsyncEx/wiki/AsyncLock – Dave M Nov 24 '18 at 00:34
  • This is an interesting problem, and you can solve it (as you suggest) in more than one way. The first way is not scalable, and the second way is not the most efficient design either. Give me a sec, I think I might have something for you – TheGeneral Nov 24 '18 at 02:57
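
For illustration, the AsyncLock approach suggested in the first comment might look roughly like the following sketch, which swaps the per-Id Monitor for a per-Id Nito.AsyncEx AsyncLock (names are illustrative; not production code):

    using System;
    using System.Collections.Concurrent;
    using System.Threading.Tasks;
    using Nito.AsyncEx;

    public class AsyncSerialisingDispatcher
    {
        // One AsyncLock per Id, mirroring the per-Id Monitor design,
        // except that waiters are queued without blocking threads.
        private readonly ConcurrentDictionary<string, AsyncLock> _locks =
            new ConcurrentDictionary<string, AsyncLock>();

        public async Task<TResponse> DispatchAsync<TResponse>(
            string id, Func<Task<TResponse>> work)
        {
            AsyncLock mutex = _locks.GetOrAdd(id, _ => new AsyncLock());

            // LockAsync() parks waiters asynchronously instead of
            // blocking the calling thread as Monitor.Enter would.
            using (await mutex.LockAsync())
            {
                return await work();
            }
        }
    }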

1 Answer

Your current solution will scale terribly if the number of requests gets too high (requests start queuing). Each request spawns a new thread and allocates the resources that go with it.
Have a look at the Actor Model.
You would spawn a thread per request ID and just push the requests to the actor as "messages".
Consider using lazy initialization for the actors, meaning you only spawn a thread when there is actually a request for that ID in flight. If an actor's message queue is empty you can terminate it, and only spawn it again when a new request with its ID comes in.
An implementation built on the ThreadPool should also help with performance in the future.
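
A minimal sketch of that shape, using TPL Dataflow's ActionBlock as a stand-in for a full actor framework (lazy creation is shown, but termination of idle actors is left out, and the names are illustrative):

    using System;
    using System.Collections.Concurrent;
    using System.Threading.Tasks;
    using System.Threading.Tasks.Dataflow;

    public class ActorDispatcher
    {
        // One "actor" (mailbox) per Id, created lazily on first use.
        // MaxDegreeOfParallelism = 1 means each mailbox is drained
        // serially, in FIFO order, on ThreadPool threads.
        private readonly ConcurrentDictionary<string, ActionBlock<Func<Task>>> _actors =
            new ConcurrentDictionary<string, ActionBlock<Func<Task>>>();

        public Task<TResponse> SendAsync<TResponse>(string id, Func<Task<TResponse>> work)
        {
            var tcs = new TaskCompletionSource<TResponse>(
                TaskCreationOptions.RunContinuationsAsynchronously);

            var actor = _actors.GetOrAdd(id, _ =>
                new ActionBlock<Func<Task>>(
                    message => message(),
                    new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 1 }));

            // Push the request to the actor as a "message".
            actor.Post(async () =>
            {
                try { tcs.SetResult(await work()); }
                catch (Exception ex) { tcs.SetException(ex); }
            });

            return tcs.Task;
        }
    }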

Oachkatzl
  • "Your current solution will scale terribly if the amount of requests gets too high (requests start queuing)" - on what basis? As more requests (with the same request ID) arrive, more threads will have to block whilst waiting for their turn. But why will this arrangement in turn, "scale terribly"? I'd understood that the blocking and unblocking of threads was cheap (relative, say, to switching data between threads). Is that not so? – user3561406 Nov 29 '18 at 14:47
  • First of all, keeping threads in a blocked state is basically free from a processing point of view; however, unblocking and especially spawning them is not free. Secondly, it uses up your memory, because every thread needs a context (data the thread uses for execution, plus general thread bookkeeping). If your server only ever gets a limited number of requests this might never become a problem, but personally I prefer good design in the first place over patching things up afterwards. – Oachkatzl Nov 29 '18 at 20:06