I am new to Node.js and am currently reading the book 'Beginning Node.js' by Basarat Ali Syed.

Here is an excerpt from it that describes the disadvantages of the thread pools used by traditional web servers:

Most web servers used this thread pool method a few years back, and many continue to use it today. However, this method is not without drawbacks. Again, there is wasting of RAM between threads. Also, the OS needs to context switch between threads (even when they are idle), and this results in wasted CPU resources.

I don't quite understand why there are context switches between the threads in a thread pool. As far as I understand, one thread is occupied for the duration of a task, and once the task is completed, the thread is free to receive the next task.

So my Q1: Why is a context switch needed? When does the context switch between threads happen?

My Q2: Why doesn't Node.js use multiple threads to handle the events in the event queue? Wouldn't that be more efficient and reduce the time events spend queuing?

Shaohua Huang

1 Answer

A context switch happens when the OS needs to run more threads than there are CPU cores. Say, for example, you have 10 threads and they are all busy (meaning none of them has finished its task), but your CPU is only a dual-core CPU (assume no hyperthreading for simplicity). So how can all 10 threads run at the same time? It's not possible!

The answer is context switching. When the OS is presented with more processes and threads than it has cores, it allocates each thread a certain amount of time to run. When that time is up, the OS switches to another thread, so that every thread gets some time on the CPU.
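To make the time slicing concrete, here is a toy round-robin simulation. It is only a sketch: `simulate`, `quantumMs` and `workMs` are names invented for illustration, every thread is assumed to need the same amount of CPU time, and a real OS scheduler is far more sophisticated (priorities, threads blocking on I/O, and so on). It simply counts how often a core has to swap one thread for a different one.

```javascript
// Toy round-robin scheduler: a simulation only, not how a real kernel works.
// All names here (simulate, quantumMs, workMs) are made up for this sketch.
function simulate({ threads, cores, quantumMs, workMs }) {
  // Each thread still needs this much CPU time.
  const remaining = new Array(threads).fill(workMs);
  // Thread ids waiting for a core, in order.
  const runQueue = remaining.map((_, id) => id);
  // Which thread ran last on each core (null = nothing yet).
  const lastOnCore = new Array(cores).fill(null);
  let contextSwitches = 0;
  let elapsedMs = 0;

  while (runQueue.length > 0) {
    // Give each core the next waiting thread for one time slice.
    const running = runQueue.splice(0, cores);
    running.forEach((id, core) => {
      // A different thread than last time on this core means a context switch.
      if (lastOnCore[core] !== null && lastOnCore[core] !== id) {
        contextSwitches++;
      }
      lastOnCore[core] = id;
      remaining[id] -= quantumMs;
      // Unfinished threads go back to the end of the queue.
      if (remaining[id] > 0) runQueue.push(id);
    });
    elapsedMs += quantumMs;
  }
  return { contextSwitches, elapsedMs };
}

// 10 busy threads on a dual-core CPU, as in the example above.
console.log(simulate({ threads: 10, cores: 2, quantumMs: 10, workMs: 30 }));
// → { contextSwitches: 28, elapsedMs: 150 }

// Only as many threads as cores: each thread keeps its core.
console.log(simulate({ threads: 2, cores: 2, quantumMs: 10, workMs: 30 }));
// → { contextSwitches: 0, elapsedMs: 30 }
```

Note that the simulation charges nothing for a switch, so it only shows how many switches occur, not how much time they cost on real hardware.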

The term "context switch" refers to the fact that when the OS gives the CPU to another thread/process, it needs to save all the register values of the thread being paused to memory and load the saved values of the thread being resumed. Otherwise the newly running thread would mess up the calculation of the paused thread when it resumes. When switching between processes, the OS also needs to re-point the virtual memory tables so that two processes cannot mess up each other's memory. How expensive this operation is depends on the CPU architecture. Some architectures, like SPARC, are optimized for context switching. Hyperthreading is a feature that keeps a second thread context in hardware so switching is faster (but then again, you only get one extra context per core with Hyperthreading as implemented on Intel/AMD64 architectures).

Not using multiple threads avoids context switching within your program entirely, and if your program is the only one running, the OS has little reason to switch away from it at all. So on a single-core CPU, a nonblocking, single-threaded program can often beat a multithreaded program.

However, it's rare to find a single-core CPU these days. The ideal number of threads to run is equal to the number of cores you have, since that would also avoid context switching. But even so, getting a complex multithreaded program to run fast is not easy. It's easier to get a nonblocking single-threaded program to run fast. And in most web applications a multithreaded program wouldn't have any advantage over a nonblocking single-threaded program, because both are I/O bound: they spend most of their time waiting rather than computing.
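As a rough illustration of the I/O-bound case, the sketch below starts three simulated queries on a single thread. `fakeQuery` and `handleAll` are hypothetical names, and `setTimeout` merely stands in for a real database or network call. The point is that all three waits overlap, so no thread sits idle and nothing needs to be context switched.

```javascript
// Simulated I/O: resolves after `ms` without occupying the thread meanwhile.
// fakeQuery is a stand-in for a real database or network call.
function fakeQuery(id, ms, log) {
  log.push(`start ${id}`);
  return new Promise((resolve) => {
    setTimeout(() => {
      log.push(`done ${id}`);
      resolve(id);
    }, ms);
  });
}

// Start every query before waiting for any of them to finish.
async function handleAll(log) {
  const started = Date.now();
  await Promise.all([
    fakeQuery(1, 30, log),
    fakeQuery(2, 20, log),
    fakeQuery(3, 10, log),
  ]);
  return Date.now() - started; // total wall-clock time in ms
}

const demoLog = [];
handleAll(demoLog).then((elapsedMs) => {
  console.log(demoLog.join(", "));
  console.log(`all three finished after about ${elapsedMs} ms`);
});
```

The total time should be close to the longest single wait rather than the sum of all three, even though only one thread is involved.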

A nonblocking single-threaded program is basically implementing thread-like behavior in userspace using events. This is sometimes called "green threads" in languages whose syntax makes event-oriented programming look like multithreaded programming.
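Here is a minimal sketch of thread-like behavior in userspace. Note that it uses JavaScript generators and a hand-written queue rather than Node's actual event loop, and that `task` and `runAll` are names invented for this example. Each task runs until it reaches `yield` and then voluntarily hands control back, so the "switch" is just an ordinary function return instead of an OS context switch.

```javascript
// Cooperative "green threads" built from generators: a sketch, not Node's event loop.
// Each task runs until it reaches `yield`, then gives up control voluntarily.
function* task(name, steps, log) {
  for (let i = 1; i <= steps; i++) {
    log.push(`${name}:${i}`);
    yield; // let the other tasks have a turn
  }
}

// Round-robin over the tasks until every one of them has finished.
function runAll(tasks) {
  const queue = [...tasks];
  while (queue.length > 0) {
    const current = queue.shift();
    const { done } = current.next(); // resume until the next yield
    if (!done) queue.push(current);
  }
}

const log = [];
runAll([task("A", 2, log), task("B", 3, log)]);
console.log(log.join(" "));
// → A:1 B:1 A:2 B:2 B:3
```

The two tasks interleave like threads would, but everything happens on one thread and a task is only ever paused at a point it chose itself.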

slebetman
  • Thanks, slebetman, for such a comprehensive answer. It really helps. Now I understand why people say Node.js is not suited to CPU-intensive tasks. It is because Node.js only has one thread, and without context switching it has to wait for a task to finish completely before it can execute the next one. – Shaohua Huang Oct 13 '15 at 08:20
  • But this is not a problem for a multithreaded system, because the OS can simply assign a certain amount of time to run the task. If other urgent tasks come in, the OS can do a context switch. Once the urgent task is completed, it can resume the previous task without any headache. – Shaohua Huang Oct 13 '15 at 08:23
  • @ShaohuaHuang: The key insight here is that context switching is not cheap. A single-threaded process that manages its tasks in an event loop can easily beat a multithreaded process that depends on the OS to manage tasks. One of the most startling examples was when tclhttpd beat Apache back in the early 2000s. Tclhttpd was written in a very slow programming language, Tcl, and Apache was written in C. Yet Apache was slower at serving static files. The key difference is that Apache was multithreaded while tclhttpd was nonblocking and single-threaded. – slebetman Oct 13 '15 at 08:29
  • Cool. I think now I have a strong reason to go on with Node.js. – Shaohua Huang Oct 13 '15 at 08:39
  • Note that if you do need threading, you can do it in Node. There are several npm packages that implement various forms of threading. It's just that you don't need threading to handle connections, only for CPU-intensive tasks. To me the biggest advantage of Node is being able to share model and validation code between the server and the browser. – slebetman Oct 13 '15 at 08:42
  • Yeah, using JavaScript on both the frontend and the backend and being able to share some code is one of the things that motivates me to learn it. And it is really good news that there are packages that implement threading, so I do not need to freak out when facing CPU-intensive tasks. – Shaohua Huang Oct 13 '15 at 09:03