
I am working on a POST method on the server side to retrieve all files inside the requested directory (not recursive), and below is my code.

I am having difficulty sending the response back (`res.json(pathContent);`) with the updated `pathContent` without using `setTimeout`.

I understand that this is due to the asynchronous behavior of the file system methods used (`readdir` and `stat`), and that I need to use some sort of callback, async, or promise technique.

I tried to use `async.waterfall` with the entire body of `readdir` as one function and `res.json(pathContent)` as the other, but it didn't send the updated array to the client side.

I know that there have been thousands of questions regarding this kind of asynchronous operation, but I could not figure out how to solve my case after reading a number of posts.

Any comments would be appreciated. Thanks.

const express = require('express');
const bodyParser = require('body-parser');
const fs = require('fs');
const path = require('path');

const app = express();
app.use(bodyParser.json());

var pathName = '';
const pathContent = [];

app.post('/api/files', (req, res) => {
    const newPath = req.body.path;
    fs.readdir(newPath, (err, files) => {
        if (err) {
            res.status(422).json({ message: `${err}` });
            return;
        }
        // set the pathName and empty pathContent
        pathName = newPath;
        pathContent.length = 0;

        // iterate each file
        const absPath = path.resolve(pathName);
        files.forEach(file => {
            // get file info and store in pathContent
            fs.stat(absPath + '/' + file, (err, stats) => {
                if (err) {
                    console.log(`${err}`);
                    return;
                }
                if (stats.isFile()) {
                    pathContent.push({
                        path: pathName,
                        name: file.substring(0, file.lastIndexOf('.')),
                        type: file.substring(file.lastIndexOf('.') + 1).concat(' File'),
                    })
                } else if (stats.isDirectory()) {
                    pathContent.push({
                        path: pathName,
                        name: file,
                        type: 'Directory',
                    });
                }
            });
        });
    });    
    setTimeout(() => { res.json(pathContent); }, 100);
});
sheIsTrue
  • Take a look at this post, looks like they used the synchronous methods https://stackoverflow.com/questions/44019316/chaining-fs-readdir-with-a-then-to-return-an-array – hawkstrider Jan 04 '19 at 20:09
  • Thank you for the quick reply! Based on the reference, I used the synchronous methods (readdirSync and statSync) and got it to work. – sheIsTrue Jan 04 '19 at 20:46

4 Answers


The easiest and cleanest way would be to use async/await; that way you can make use of promises, and the code will look almost like synchronous code.

You therefore need promisified versions of readdir and stat, which can be created with promisify from the util core module.

const { promisify } = require('util')

const readdir = promisify(require('fs').readdir)
const stat = promisify(require('fs').stat)

async function getPathContent(newPath) {
  // move pathContent otherwise can have conflicts with concurrent requests
  const pathContent = [];

  let files = await readdir(newPath)

  let pathName = newPath;
  // pathContent.length = 0;  // not needed anymore because pathContent is new for each request

  const absPath = path.resolve(pathName);

  // iterate each file

  // replace forEach with (for ... of) because this makes it easier 
  // to work with "async" 
  // otherwise you would need to use files.map and Promise.all
  for (let file of files) {
    // get file info and store in pathContent
    try {
      let stats = await stat(absPath + '/' + file)
      if (stats.isFile()) {
        pathContent.push({
          path: pathName,
          name: file.substring(0, file.lastIndexOf('.')),
          type: file.substring(file.lastIndexOf('.') + 1).concat(' File'),
        })
      } else if (stats.isDirectory()) {
        pathContent.push({
          path: pathName,
          name: file,
          type: 'Directory',
        });
      }
    } catch (err) {
      console.log(`${err}`);
    }
  }

  return pathContent;
}

app.post('/api/files', (req, res, next) => {
  const newPath = req.body.path;
  getPathContent(newPath).then((pathContent) => {
    res.json(pathContent);
  }, (err) => {
    res.status(422).json({
      message: `${err}`
    });
  })
})

And you should not concatenate paths using `+` (`absPath + '/' + file`); use `path.join(absPath, file)` or `path.resolve(absPath, file)` instead.

And you should never write the code executed for a request so that it relies on global variables like `var pathName = '';` and `const pathContent = [];`. This might work in your testing environment, but it will certainly lead to problems in production, where two requests can work on the variables at the same time.

t.niese
  • If you `map` over the files with an async function and `Promise.all`, you get a bit of a performance boost, because all the stats can run in parallel, instead of sequentially. (`pathContent = await Promise.all(files.map(async (file) =>{await stat(…);return {path: pathname …}}))`) – Garrett Motzner Jan 04 '19 at 20:26
    @GarrettMotzner That depends on the use-case if you have a bunch of parallel request hitting your system, then you might want to stay with a sequential solution per request. – t.niese Jan 04 '19 at 20:30
  • That is an interesting concern, but I think that would better be solved by rate limiting elsewhere. To me, it doesn't make sense in most cases to make the stat calls sequential, since then have no dependency between the calls. There may be some extreme cases where you end up with too many parallel system calls, but if that is the case, you probably have bigger problems. – Garrett Motzner Jan 04 '19 at 20:37
  • @GarrettMotzner I don't want to contradict to what you initially said. And yes the rate limit has then to be added elsewhere. But having it sequential then the load in relation to the number of currently active requests is more predictable, and the rate limit can be easier handled. But which solution is better depends on the use-case. If this is a rare action then a peek with map and Promise.all might be better. – t.niese Jan 04 '19 at 20:44
  • Would either of you mind explaining why readdir does not need to be included in the try-catch block but just stat? Thanks! – sheIsTrue Jan 05 '19 at 02:45
  • @sheIsTrue the error case of `readdir` is caught by the second callback passed to `getPathContent(newPath).then(...)`. – t.niese Jan 05 '19 at 14:56
  • @t.niese Thanks for the explanation. Then, why can't `stat` be handled the same way as `readdir` and have to be inside the try-catch block? Sorry I am very new to this. – sheIsTrue Jan 05 '19 at 16:20
  • @sheIsTrue depends on what you want to achieve. Your original code only skips those files for which an error occurred. If you remove the `try-catch` block then the `422` would be emitted if stats would result in an error for at least one of the files, even if stats would give a valid result for the other files. – t.niese Jan 05 '19 at 17:42

Based on the initial comment I received and the reference, I used `readdirSync` and `statSync` instead and was able to make it work. I will review the other answers as well and learn about other ways to implement this.

Thank you all for your kind inputs.

Here is my solution.

const express = require('express');
const bodyParser = require('body-parser');
const fs = require('fs');
const path = require('path');

const app = express();
app.use(bodyParser.json());

var pathName = '';
const pathContent = [];

app.post('/api/files', (req, res) => {
    const newPath = req.body.path;

    // validate path
    let files;
    try {
        files = fs.readdirSync(newPath);
    } catch (err) {
        res.status(422).json({ message: `${err}` });
        return;
    }

    // set the pathName and empty pathContent
    pathName = newPath;
    pathContent.length = 0;

    // iterate each file
    let absPath = path.resolve(pathName);
    files.forEach(file => {
        // get file info and store in pathContent
        let fileStat = fs.statSync(absPath + '/' + file);
        if (fileStat.isFile()) {
            pathContent.push({
                path: pathName,
                name: file.substring(0, file.lastIndexOf('.')),
                type: file.substring(file.lastIndexOf('.') + 1).concat(' File'),
            })
        } else if (fileStat.isDirectory()) {
            pathContent.push({
                path: pathName,
                name: file,
                type: 'Directory',
            });
        }
    });
    res.json(pathContent);
});
sheIsTrue
  • You really should move `var pathName` and `const pathContent` into your `(req, res) => { ... }` so that they are no longer global; otherwise you will get _"random"_ unexpected wrong results if you have two or more requests at the same time. – t.niese Jan 04 '19 at 20:48
  • In this case (this is for a school assignment), there will only be one request at a time. The server need to hold these as global variables for other requests (i.e. get) to get the same information as well. (unless there is a better approach) I really appreciate your valuable input btw. I just tested your code and verified that it works. Accepting your response as the answer. – sheIsTrue Jan 04 '19 at 21:23

Here are some options:

  • Use the synchronous file methods (check the docs, but they usually end with Sync). Slower, but a fairly simple code change, and very easy to understand.
  • Use promises (or util.promisify) to create a promise for each stat, and Promise.all to wait for all the stats to complete. After that, you can use async functions and await as well for easier to read code and simpler error handling. (Probably the largest code change, but it will make the async code easier to follow)
  • Keep a counter of the number of stats you have done, and if that number is the size you expect, then call res.json from inside the stat callback (smallest code change, but very error prone)
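The counter approach from the last bullet can be sketched generically; `statAll` and its parameters are illustrative names, not from the post:

```javascript
// Calls `done(err)` on the first error, or `done(null, results)` once
// every item's async callback has fired.
function statAll(files, statFn, done) {
  const results = [];
  let pending = files.length;
  let failed = false;
  if (pending === 0) return done(null, results);

  files.forEach((file, i) => {
    statFn(file, (err, stats) => {
      if (failed) return;            // a previous call already reported an error
      if (err) {
        failed = true;
        return done(err);
      }
      results[i] = { file, stats };  // write to own index to keep original order
      if (--pending === 0) done(null, results);
    });
  });
}
```

With the real API you would call `statAll(files, fs.stat, (err, results) => ...)` and send `res.json` from the final callback; as the bullet warns, getting the bookkeeping right by hand is easy to botch, which is why the promise-based options are preferred.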
Garrett Motzner

There are different ways to do it:

  1. You can first promisify the function using new Promise(), then use async/await or .then()
  2. You can use the promisifyAll() function of the Bluebird package (https://www.npmjs.com/package/bluebird)
  3. You can use the synchronous versions of the fs functions
Nerwin