I'm implementing an API. The front-end will likely be REST/HTTP, back-end MSSQL, with a lightweight middle-tier in between. Probably IIS hosted.
Each incoming request will have a non-unique Id attached. Any requests that share the same Id must be processed serially (and in FIFO order), whereas requests with different Ids can (and should) be processed concurrently for performance and efficiency.
- Assume that when clients call my API a new thread is created to process the request (thread per call model).
- Assume that every request is about the same size and requires about the same amount of computational work.
Current Design
My current design and implementation is very simple and straightforward. A Monitor is created on a per-Id basis, and all requests are processed through a critical section of code that enforces serialisation. In other words, the Monitor for Id X will block all threads carrying a request with Id X until the current thread carrying Id X has completed its work.
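For concreteness, here is a minimal sketch of the per-Id locking idea (type and method names such as PerIdSerialiser and ProcessRequest are illustrative, not taken from the real code):

```csharp
using System.Collections.Concurrent;

public class Request { }
public class Response { }

public class PerIdSerialiser
{
    // One lock object per Id, created lazily on first use.
    private readonly ConcurrentDictionary<string, object> _locks =
        new ConcurrentDictionary<string, object>();

    public Response Process(string id, Request request)
    {
        object gate = _locks.GetOrAdd(id, _ => new object());

        // Threads carrying the same Id queue up here; threads with different Ids
        // proceed in parallel. Note that Monitor does not strictly guarantee
        // FIFO wake-up order for waiting threads.
        lock (gate)
        {
            return ProcessRequest(request); // the serialised critical section
        }
    }

    // Placeholder for the actual per-request work.
    private Response ProcessRequest(Request request) => new Response();
}
```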
An advantage of this current design is simplicity (it was easy to implement). A second advantage is that there is none of the cost that comes with handing data off between threads: each thread carries a request all the way from its initial arrival at the API through to sending a response back to the client.
One possible disadvantage is that where many requests arrive sharing the same Id, there will be lots of threads blocking (and, ultimately, unblocking again). Another possible disadvantage is that this design does not lend itself easily to asynchronous code that could potentially increase scalability (scalability would likely have to be achieved in other ways).
Alternate Design
Another design might consist of a more complex arrangement:
- Create a dedicated BlockingCollection for each Id encountered
- Create a single, dedicated long-running consumer thread for each BlockingCollection
- Each thread that processes a request acts as a producer by enqueueing the request onto the relevant BlockingCollection
- The producing thread then waits (asynchronously) until a response is ready to be collected
- Each consumer thread processes items from its BlockingCollection in a serialised manner, and signals the thread that is awaiting a response once the response is ready (a rough sketch follows this list)
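For illustration, here is a rough sketch of how the per-Id queue plus dedicated consumer could be wired up, assuming a TaskCompletionSource is used for the asynchronous hand-back (names such as PerIdQueueProcessor and WorkItem are hypothetical):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public class Request { }
public class Response { }

public class PerIdQueueProcessor
{
    // A request paired with the completion source its producer is awaiting.
    private sealed class WorkItem
    {
        public Request Request { get; init; }
        public TaskCompletionSource<Response> Completion { get; init; }
    }

    // One queue (and one long-running consumer thread) per Id.
    private readonly ConcurrentDictionary<string, BlockingCollection<WorkItem>> _queues =
        new ConcurrentDictionary<string, BlockingCollection<WorkItem>>();

    // Producer side: enqueue the request and await the response without blocking a thread.
    public Task<Response> ProcessAsync(string id, Request request)
    {
        var queue = _queues.GetOrAdd(id, StartConsumer);
        var item = new WorkItem
        {
            Request = request,
            Completion = new TaskCompletionSource<Response>(
                TaskCreationOptions.RunContinuationsAsynchronously)
        };
        queue.Add(item);              // FIFO per Id
        return item.Completion.Task;  // the caller awaits this
    }

    // Consumer side: one dedicated thread per Id drains its queue serially.
    private BlockingCollection<WorkItem> StartConsumer(string id)
    {
        var queue = new BlockingCollection<WorkItem>();
        var consumer = new Thread(() =>
        {
            foreach (var item in queue.GetConsumingEnumerable())
            {
                try { item.Completion.SetResult(ProcessRequest(item.Request)); }
                catch (Exception ex) { item.Completion.SetException(ex); }
            }
        })
        { IsBackground = true };
        consumer.Start();
        return queue;
    }

    // Placeholder for the actual per-request work.
    private Response ProcessRequest(Request request) => new Response();
}
```

(One caveat with this sketch: GetOrAdd may invoke the factory more than once under a race, which could start a stray consumer thread; wrapping the value in Lazy<> avoids that. Also, neither design as sketched removes per-Id state for Ids that are no longer active.)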
A disadvantage of this design is complexity.
A second disadvantage is that there will be overhead due to handing data off between threads (at least twice per request).
However, I think that on a reasonably busy system, where lots of requests are coming in, this might be better at reducing the number of blocked threads, and possibly the number of threads required overall will be fewer.
It also lends itself better to asynchronous code than the original design, which might make it scale better.
Questions
Given the effort and complexity of a re-implementation using the Alternate Design, is it likely to be worthwhile? (At the moment I am leaning towards "no" and sticking with my current design, but any views or general thoughts would be much appreciated.)
If there is no straightforward answer to the above question, then what are the key considerations that I need to factor in to my decision?