
I understand that questions like this can be a matter of opinion, which is why I tried to provide more details.

I'm developing a WebSocket server app with the following 'features':

  • 2-3 connections can point to the same User
  • Users can occasionally interact with each other, either in bursts or in a continuous exchange of data
  • A portion of global data should always be synced between all users
  • A portion of global data should be lazily (on demand) synced between some users

2 possible structures come to mind:

  1. Worker threads. Main thread handles all network IO, connection->User bundling and stores/interacts with global data. It also makes lots of postMessage calls to worker threads. Worker threads occasionally talk to each other when 2 players from different threads need to interact. Possible issues:
    • postMessage overhead (mostly from serialization from what I understand). Theoretically for most messages I could try to use SharedArrayBuffer. I don't think it could be optimized beyond that.
    • Main thread would handle both network IO and global data interactions. Maybe this could become a bottleneck?
  2. Stateless NodeJS cluster that stores all data in Redis and also utilizes occasional locking. Issues/questions:
    • Would use more RAM
    • No matter how fast Redis is, communicating with it will most likely be slower than using postMessage.
    • Will multiple servers actually be beneficial? I've heard that splitting the same type of IO across multiple processes is a bad idea, or at least not as beneficial as using Node's native asynchronous IO.
    • I don't think there is a way to bundle connections based on something like User id.
    • Getting and setting Redis data on request is fine. But how would one process know if some key was modified by another process?
user64675
  • What problem are you most trying to solve with the workers/clusters? You seem to have left out the defining motivation that could guide which way to go. Ability to handle more CPU-intensive requests? Ability to scale to more webSocket connections? If scaling, how big do you need to get, because larger scale will involve not just multiple clustered processes on the same system, but multiple systems, which will work better with a centralized shared store like Redis. Also, how much memory do you have on your server and how many additional processes do you think you need? – jfriend00 Feb 11 '22 at 06:55
  • I'm trying to use extra threads for CPU-intensive requests. I don't think that I will scale beyond one machine. I have no info on memory on process atm. – user64675 Feb 11 '22 at 15:02

1 Answer


It sounds like you're perhaps in the design phase and don't yet have running code that you can measure. If that's the case, don't assume you know where your bottlenecks are, and don't spend time designing and implementing for a problem that hasn't yet been measured. Performance bottlenecks are often not where you think they will be. Spending lots of time implementing a more complex architecture to address where you think a bottleneck might be can easily turn into wasted development time if you're wrong.

Instead, implement the simpler architecture or even a prototype, then test and measure at scale. Find out where your real bottlenecks are. Then you can be sure you are focusing on the right problem, and you will have a test engine you can measure against when you implement an alternative to see how well you did.

If the main focus of your question is addressing the CPU intensive operations in your code, I would probably first explore using worker threads and a job queue. Still have one main process that the webSockets connect to and where you keep all your state.

When a request comes in, create a job and put it in the job queue (the job queue can just be an array of job objects waiting to be processed). If there's an idle thread, grab the oldest job from the job queue and send that thread the job to do. It will crunch on it in whatever CPU intensive way it needs to. When it finishes, it messages the result back to your main thread and your main thread sends it the next job and does whatever it needs to with the result.

You can have as many worker threads as you have CPU cores (more than that loses efficiency).

This architecture tries to localize the CPU intensive operations and farm them off to worker threads while keeping the rest of the architecture (your webSocket connections and state management) the same to keep things as simple as possible.

If large amounts of state need to be accessed by the CPU-intensive operations, then you probably want to move the state to a database where multi-threaded access can be managed by the database (that's one of the helpful things a database does). If the state is not persistent, the database could be Redis. This is generally preferred over a synchronizing architecture because managing concurrency when updating and synchronizing state in any general way is a hard problem that you probably don't want to implement yourself.

jfriend00