0

Admittedly I'm a novice with node, but it seems like this should be working fine. I am using multiparty to parse a form, which returns an array. I am then using a for each to step through the array. However - the for each is not waiting for the inner code to execute. I am a little confused as to why it is not, though.

var return_GROBID =  function(req, res, next) {
  var form = new multiparty.Form();

  var response_array = [];
  form.parse(req, function(err, fields, files) {

    files.PDFs.forEach(function (element, index, array) {
      fs.readFile(element.path, function (err, data) {
        var newPath = __dirname + "/../public/PDFs/" + element.originalFilename;
        fs.writeFile(newPath, data, function (err) {
          if(err) {
            res.send(err);
          }
          GROBIDrequest.GROBID2js(newPath, function(response) {
            response_array.push(response);
            if (response_array.length == array.length) {
                res.locals.body = response_array;
                next();
            }
          });
        });
      });
    });
 });
}

If someone can give me some insight on the proper way to do this that would be great.

EDIT: The mystery continues. I ran this code on another machine and IT WORKED. What is going on? Why would one machine be inconsistent with another?

  • Why does this code not use `fields` or `files` from the `form.parse()`? It appears that the `form.parse()` is not used at all. – jfriend00 Jan 07 '18 at 20:02
  • Also, where does `PDFs` come from? – jfriend00 Jan 07 '18 at 20:10
  • Sorry, I have a typo there. I will update. It should be files.PDFs.forEach, I mis-copied it – John M. Kovachi Jan 07 '18 at 21:10
  • Are you sure that assignment is what you want to there: if (response_array.length = array.length) {... – AGoranov Jan 07 '18 at 21:36
  • Yep, I fixed that. The actual problem, it seemed, was that my package.json had node 8.9.1 specified as the version, but I had 9.3.0 on my machine... I downgraded and it works again – John M. Kovachi Jan 07 '18 at 21:42
  • Can you be more specific about exactly what incorrect behavior is occuring? Is the response array out of order? Is the response incomplete? Since you have asynchronous operations in your loop (`readFile`, `writeFile`, and `GROBIDjs`), there is **no guarantee** about what order your response array will resolve in. The code working as you intend could just be a result of timing happening, by chance, how you intend it to work. – Greg Rozmarynowycz Jan 07 '18 at 21:58

1 Answers1

-2

I'd guess the PDFs.forEach is you just calling the built-in forEach function, correct?

In Javascript many things are asynchronous - meaning that given:

linea();
lineb();

lineb may be executed before linea has finished whatever operation it started (because in asynchronous programming, we don't wait around until a network request comes back, for example).

This is different from other programming languages: most languages will "block" until linea is complete, even if linea could take time (like making a network request). (This is called synchronous programming).

With that preamble done, back to your original question:

So forEach is a synchronous function. If you rewrote your code like the following, it would work (but not be useful):

 PDFs.forEach(function (element, index, array) {
   console.log(element.path)
 }

(console.log is a rare synchronous method in Javascript).

But in your forEach loop you have fs.readFile. Notice that last parameter, a function? Node will call that function back when the operation is complete (a callback).

Your code will currently, and as observed, hit that fs.readFile, say, "ok, next thing", and move on to the next item in the loop.

One way to fix this, with the least changing the code, is to use the async library.

async.forEachOf(PDFs, function(value, key, everythingAllDoneCallback) {

 GROBIDrequest.GROBID2js(newPath, function(response) {
        response_array.push(response);
        if (response_array.length = array.length) {
          ...
        }

        everythingAllDoneCallback(null)

} );

With this code you are going through all your asynchronous work, then triggering the callback when it's safe to move on to the next item in the list.

Node and callbacks like this are a very common Node pattern, it should be well covered by beginner material on Node. But it is one of the most... unexpected concepts in Node development.

One resource I found on this was (one from a set of lessons) about NodeJS For Beginners: Callbacks. This, and playing around with blocking (synchronous) and non-blocking (asynchronous) functions, and hopefully this SO answer, may provide some enlightenment :)

RyanWilcox
  • 13,890
  • 1
  • 36
  • 60
  • Thank you so much for your detailed answer. So, I'm trying to use the async library, and for some reason the code is still doing something wonky. For some reason, the code execution is reaching the inner block, but the files are not being read properly. Do you have any idea the cause of this? – John M. Kovachi Jan 07 '18 at 19:30
  • Well, assuming response_array.length = array.length might be the bug (assuming it's not a typo): should be == instead. Beyond that don't have any particular insight without stepping through it line by line in a debugger – RyanWilcox Jan 07 '18 at 19:46
  • 1
    `lineb()` does not execute until `linea()` returns. Your answer is misleading in that way. Now, there may be async operations inside of `linea()` that are not yet done, but `lineb()` will not execute until `linea()` returns. – jfriend00 Jan 07 '18 at 20:04
  • So I just ran this on another machine and it worked. I have no idea what is going on. I tried to convert these to promises with no luck. – John M. Kovachi Jan 07 '18 at 21:12
  • I had 9.3.0 on this machine and 8.9.1 on my other one, I downgraded and my original code works again... – John M. Kovachi Jan 07 '18 at 21:43