2

When you fork, or start multiple workers using something like Cluster:

  1. Are multiple threads or instances of the Node process being created? Does this break Node's single-thread concept?

  2. How are requests handled between the workers? Does Cluster provide some intelligent mechanism to load-balance all requests across the multiple workers?

S.D.

2 Answers

6

Cluster uses fork, and yes, it gets balanced automatically:

The worker processes are spawned using the child_process.fork method, so that they can communicate with the parent via IPC and pass server handles back and forth.

[...]

When multiple processes are all accept()ing on the same underlying resource, the operating system load-balances across them very efficiently. There is no routing logic in Node.js, or in your program, and no shared state between the workers. Therefore, it is important to design your program such that it does not rely too heavily on in-memory data objects for things like sessions and login.

You might think that this breaks node.js's single-thread concept if you count each new node.js instance as another thread. However, keep in mind that all callbacks for a given request are handled by the same node.js instance that accepted the original request. There are no race conditions and no shared data, only fairly safe interprocess communication.
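To make this concrete, here is a minimal sketch of the usual pattern; the port 8000 and the one-worker-per-core count are arbitrary choices for illustration, not anything from the question:

```js
// Minimal sketch: one worker per CPU core, all sharing the same port.
const cluster = require('cluster');
const http = require('http');
const numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  // Master: fork one worker per core; it does not handle requests itself.
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
  cluster.on('exit', (worker) => {
    console.log(`worker ${worker.process.pid} exited`);
  });
} else {
  // Worker: each worker is a full, single-threaded Node process.
  // They all listen on the same port; incoming connections are
  // distributed among them without any routing logic in your code.
  http.createServer((req, res) => {
    res.end(`handled by worker ${process.pid}\n`);
  }).listen(8000);
}
```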

See the Cluster documentation for more information.

Zeta
  • What about shared resources? Say two forks, each serving its own requests, call something like `fs.readdirAsync()` on the same resource. Both calls will return immediately, but I think there is contention for the resource internally, isn't there? – S.D. Feb 25 '13 at 18:35
  • @User117: I don't know the actual implementation, but reading a directory usually implies a call to operating system facilities. This call _might_ prefer one of the processes; however, it is the job of the operating system to schedule tasks with the same priority accordingly. But after all, the great strength of node.js is its asynchrony: do something else while you wait for the callback. – Zeta Feb 25 '13 at 20:01
  • Thanks. I thought the same but was not sure. – S.D. Feb 26 '13 at 05:28
4

Cluster was developed to compensate for node.js's single-threaded architecture. Modern processors have multiple cores, and a single-threaded process cannot take advantage of the available cores. Cluster does deviate from the single-thread architecture, but sticking to a single thread was never the plan; the main concept was asynchronous, event-driven execution.

Cluster uses fork to create processes. A forked process really is its own process with its own address space: there is nothing the child can (normally) do to affect its parent's or siblings' address space (unlike a thread). In addition to having all the methods of a normal ChildProcess instance, the returned object has a communication channel built in, and all forked processes can communicate over this channel.
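A tiny sketch of that built-in channel; the message shapes (`{ cmd: ... }`) are invented for illustration:

```js
// Sketch: master and worker exchanging messages over the IPC channel
// that cluster.fork() sets up automatically.
const cluster = require('cluster');

if (cluster.isMaster) {
  const worker = cluster.fork();

  // The master can send messages to the worker...
  worker.send({ cmd: 'greet', from: process.pid });

  // ...and receive replies back over the same channel.
  worker.on('message', (msg) => {
    console.log('master got:', msg);
    worker.kill();
  });
} else {
  // In the worker, process.on('message') / process.send() use the channel
  // created by fork(); no extra sockets or shared memory are involved.
  process.on('message', (msg) => {
    process.send({ reply: `worker ${process.pid} received ${msg.cmd}` });
  });
}
```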

Notice the subtle difference here: it is not multi-threaded, it just forks to create new, independent processes. See Threads vs Processes in Linux for a comparison. Each worker is single-threaded, just as before, so it does not break node's single-thread concept.

How the load is balanced depends on your code (since each process is independent) and on the OS, which distributes incoming work among the forked processes and the original process alike.

But if you wish to do it differently, that is also possible: you can use the master process differently from the workers, or have each worker specialize in a different task (compression, ffmpeg, and so on), as sketched below.
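For example, a sketch using a made-up ROLE environment variable; cluster.fork() accepts an env object that is merged into the worker's environment, and what each branch actually does here is hypothetical:

```js
// Sketch: workers with different roles, selected via an env variable.
const cluster = require('cluster');
const http = require('http');

if (cluster.isMaster) {
  cluster.fork({ ROLE: 'web' });   // serves HTTP traffic
  cluster.fork({ ROLE: 'jobs' });  // does background work (e.g. compression)
} else if (process.env.ROLE === 'web') {
  http.createServer((req, res) => res.end('hello\n')).listen(8000);
} else {
  setInterval(() => {
    console.log(`worker ${process.pid} processing background jobs...`);
  }, 1000);
}
```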

user568109