2

I'm learning Node.js from the Node Beginner Book and the follow-up ebook I purchased. In the book, Manuel Kiessling explains that a line of blocking code like this one:

fs.readFileSync(blah);

will block the entire node process and ALL requests coming in. This is really bad for a multi-user website!

This is the example Kiessling uses:

exec("ls -lah", function( error, stdout, stderr ) {
    response.writeHead(200, {"Content-Type": "text/plain"});
    response.write(stdout);
    response.end();
});

This is the code that tricked me. He says the ls -lah could easily be replaced with a more time-consuming operation like find / -name "*" or a database lookup. I assumed the expensive, blocking operation would somehow run in the background simply because of the async callback.

So I got to testing my theory with this code:

var http = require("http");
var url = require("url");

// Busy-wait for ms milliseconds; this burns CPU and blocks the event loop.
var badSleep = function(ms) {
    var finishAt = Date.now() + ms;
    console.log("CPU burning sleep for " + ms + " milliseconds");
    while (Date.now() < finishAt) {
        // do nothing
    }
};

// Despite its name, this just calls the callback synchronously.
var asyncWrapper = function(callback) {
    //badSleep(3000);
    callback();
};

http.createServer(function(request, response) {
    var pathname = url.parse(request.url).pathname;
    console.log("Serve up " + pathname);
    if (pathname == '/favicon.ico') {
        response.writeHead(404);
        response.end();
    } else {
        asyncWrapper(function() {
            badSleep(3000);
            response.writeHead(200, {"Content-Type": "text/plain"});
            response.write("\nI was wrong " + new Date());
            response.end();
        });
    }
}).listen(8888);

The thing is, no matter where I put the sleep, it still blocks the node event loop. The callback does not solve the blocking problem. The good users of SO also told me this in the comments.
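A minimal sketch of why the wrapper makes no difference: passing a function as a callback does not schedule it for later, it is just an ordinary call on the same stack. The `asyncWrapper` below mirrors the one in the test code above, and the `order` array is a hypothetical helper added only to record the sequence.

```javascript
// Mirrors the asyncWrapper from the test code: nothing here is asynchronous.
function asyncWrapper(callback) {
    callback(); // plain function call, runs immediately on the same stack
}

var order = []; // hypothetical helper, only used to record the sequence

order.push("before");
asyncWrapper(function () {
    order.push("inside callback");
});
order.push("after");

console.log(order.join(" -> "));
// prints "before -> inside callback -> after"
```

The callback runs before `asyncWrapper` returns, so anything slow inside it holds up the caller just as if it had been written inline.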

So how does exec do it??? I was baffled, so I went and looked at the child_process code on GitHub. I found out that exec calls spawn!!! It makes a child process! Mystery solved. The async callback does not 'solve' the blocking issue; the spawn does.
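Here is a rough sketch of that effect, assuming the slow work can live in a separate process. The child is a second node process (started via `process.execPath`) that busy-waits for about 300 ms; the `events` array and the 50 ms ticker are hypothetical additions that show the parent's event loop staying free in the meantime.

```javascript
var childProcess = require("child_process");

var events = []; // hypothetical log of what happened, in order

// The child is a separate node process that busy-waits for roughly 300 ms.
var child = childProcess.spawn(process.execPath, [
    "-e",
    "var end = Date.now() + 300; while (Date.now() < end) {} console.log('child done');"
]);

// The parent keeps ticking while the child burns CPU in its own process.
var ticker = setInterval(function () {
    events.push("parent tick");
}, 50);

child.stdout.on("data", function (chunk) {
    events.push(chunk.toString().trim());
});

child.on("close", function () {
    clearInterval(ticker);
    console.log(events.join(", "));
});
```

The busy loop is the same kind of code as badSleep, but because it runs in another process the parent's timers keep firing while it spins.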

That leads me to my question. Does Express somehow solve the blocking problem, or do you still have to worry about it?

PS: This question is a major re-write. I want to beg the pardon of the below SO users and thank them for being patient with me. I have definitely learned something here.

Jess
  • nodejs functions are all non-blocking, so i think yes. only time i use "readFileSync" is when server is loading and i read the https files. – IdanHen Apr 08 '13 at 10:58
  • http://blog.mixu.net/2011/02/01/understanding-the-node-js-event-loop/ – bryanmac Apr 08 '13 at 11:23
  • I want to thank you all for helping me understand. These are all really good answers. It's hard for me to decide on the best one. – Jess Apr 09 '13 at 03:01

3 Answers

4

"Someone posted a comment about understanding-the-node-js-event-loop. Yes exactly. You can issue a blocking call if you wrap it in an asynchronous call, because you are not blocking the node event loop."

If you wrap a synchronous call with an asynchronous call you are still going to run into blocking. For example, if you write something like this:

fs.readFile("file1.txt", function(err, data1) {
   var data2 = fs.readFileSync("file2.txt");
});

The process won't be blocked while it reads file1.txt, since that is an async call. However, as soon as file1.txt has been read and the callback reaches the line that reads file2.txt, the process will block.

By issuing a synchronous/blocking call inside an asynchronous/non-blocking call you are only delaying the blocking.
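A small sketch of that delay, using timers instead of files so it runs anywhere. `blockFor` is a hypothetical stand-in for any synchronous call (such as fs.readFileSync on a large file), and `timings` is only there to record when things happened.

```javascript
var start = Date.now();
var timings = {}; // hypothetical record of when each callback ran

// Stand-in for any synchronous call, e.g. fs.readFileSync on a big file.
function blockFor(ms) {
    var end = Date.now() + ms;
    while (Date.now() < end) {
        // busy-wait: nothing else can run on this thread
    }
}

// Callback A is scheduled asynchronously, but it blocks once it runs.
setTimeout(function () {
    blockFor(200);
    timings.aDone = Date.now() - start;
}, 0);

// Callback B is due after 10 ms, yet it has to wait until A returns.
setTimeout(function () {
    timings.bFired = Date.now() - start;
    console.log("B was due at 10 ms, fired after " + timings.bFired + " ms");
}, 10);
```

Scheduling A asynchronously only moves the 200 ms stall to a later moment; B, which stands in for any other request, still waits for it.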

You are correct that blocking the entire web site is really bad, which is why you should issue blocking calls as rarely as possible. Since Node.js was designed around non-blocking I/O from the ground up, most of its I/O calls are asynchronous by default, and you should use those as much as possible instead of synchronous calls.

The question is, does express handle this for you automatically or do you still have to worry about it?

You still have to worry about it.

Hector Correa
  • "But if you can get the blocking out of the main node event loop, it will not block all requests" You should realize that when it finishes executing the async call (reading file1 in my example) the control goes BACK to the event loop. Hence when it reads file2 the execution is already back on the Node.js thread and will block while it reads it. – Hector Correa Apr 08 '13 at 13:26
  • "Your answer about express... can you clarify or cite a reference?" Express is just a framework that provides nice functions to handle common things that you need in a web app (like routes, parsing HTML forms, rendering views), but it does not change the way Node.js works (e.g. it does not add multi-threading or anything like that) – Hector Correa Apr 08 '13 at 13:28
2

The question is, does express handle this for you automatically or do you still have to worry about it?

You still have to worry about it. Node.js runs your JavaScript on a single thread, which means that every synchronous operation blocks it entirely, no matter where it is called. Neither Express nor any other framework can use synchronous operations without blocking the server. Even a simple

var x = 1;

blocks the entire server until it has finished creating the variable and assigning a value to it.

The whole point of asynchronous architecture is that it is more efficient than threads. And don't be fooled: asynchronous programming is more difficult than threads because there is no isolation. If one thread fails, the others still work, while in an asynchronous server one uncaught exception can bring down the entire server.

"The problem is that you could block the main node event loop!"

This sentence suggests that Node.js has something more than the main event loop. That's not true. All of your code runs inside the main loop.

Also have a look at this:

Event Loop vs Multithread blocking IO

freakish
  • _"every synchronous operation will block it entirely, no matter where it is called"_. According to the tutorial I am doing this is not true. – Jess Apr 08 '13 at 13:21
  • You were right. I was misunderstanding the tutorial. +1. One of your comments says _"[you can't] use synchronous operations without blocking the server. Simple"_. But you can if you `fork` or `spawn`. – Jess Apr 09 '13 at 02:52
1

Any call to a non-asynchronous function will "block", even if it's wrapped in another function. The only exception is if the wrapper function hands the work off to another thread or process (e.g., via the cluster API).
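As one possible illustration of handing work to another thread, here is a sketch that uses the worker_threads module rather than cluster. That is an assumption on my part: worker_threads only exists in newer Node.js releases, and the summing loop and the `workerSource` and `result` names are made up for the example.

```javascript
var workerThreads = require("worker_threads");

// Code that runs inside the worker: a CPU-bound loop, then report the result.
var workerSource =
    "var parentPort = require('worker_threads').parentPort;" +
    "var sum = 0;" +
    "for (var i = 0; i < 1e7; i++) { sum += i; }" +
    "parentPort.postMessage(sum);";

// eval: true tells Worker to treat the first argument as source code, not a file path.
var worker = new workerThreads.Worker(workerSource, { eval: true });
var result = null;

// This callback runs on the main thread once the worker has done the heavy lifting.
worker.on("message", function (sum) {
    result = sum;
    console.log("worker finished, sum = " + sum);
});
```

The loop itself is still synchronous, blocking code; it simply blocks the worker's thread instead of the one that serves requests.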

Xophmeister
  • @bryanmac I thought the whole point of node is that it does not use threads and therefore, you do not have to write complicated locking mechanisms to share mutable data among threads. Isn't the whole point of node to use asynchronous call backs instead of threading? – Jess Apr 08 '13 at 12:42
  • 2
    @Jessemon There is nothing outside of the node event loop (in one NodeJS process). :D You cannot get there, whoever told you that he lied to you. – freakish Apr 08 '13 at 13:32
  • 1
    @Jessemon That's not how it works; you are misinterpreting whatever you are reading. Node's "event loop" is just the name given to its VM's execution sequence: *All* code is blocking, whether it's completed in a few CPU cycles, or if it takes two hours to finish. ("CPU bound" is actually the correct terminology.) Wrapping a blocking call in an asynchronous call (that does not farm things out to an external process) will **not** magically get you outside the design of the VM! In fact, if anything, it'll only add to the stack and slow down your code. – Xophmeister Apr 08 '13 at 14:25
  • @Jessemon - yes, the point of node is the default behavior is async code via the event loop but that doesn't rule out the ability to put some CPU bound or IO work on another thread or process. It just means the mainline behavior isn't to use a thread/process. It means the developer makes a conscious decision to do that instead of it always taking the overhead. – bryanmac Apr 09 '13 at 01:16
  • @Xophmeister: If you edit the question I will up vote it. (My vote is locked) I was laboring under a misunderstanding. Now what you say makes sense. – Jess Apr 09 '13 at 02:56