
I have an array with, say, 100,000 objects. I use the map function, and on each iteration I build a string and write the content to a CSV file like so:

  entriesArray.map((entry) => {
    let str = entry.id + ',' + entry.fname + ',' + entry.lname + ',' +
    entry.address + ',' + entry.age + ',' + entry.sex + '\n'
    writeToFile(str);
  });

The writeToFile function:

const fs = require('fs');

const writeToFile = (str) => {
  fs.appendFile(outputFileName + '.csv', str, (err) => {
    if (err) throw err;
  });
};

This works as expected, but I'm concerned that having so many asynchronous write operations could lead to data inconsistencies. So my question is: is this safe, or is there a better way to do it?

By the way, the same code on macOS threw the error `Error: ENFILE: file table overflow, open 'output.csv'`. After a bit of research, I learned that this is due to macOS having a very low open-file limit. More details on this can be found here.

Again, I'm hoping an improvement to my file-write mechanism could sort this issue out as well.

Priyath Gregory
  • Any reason as to why this has been downvoted? – Priyath Gregory Feb 11 '18 at 17:04
  • You really shouldn't use `map` if you don't need to produce another array. Use `forEach`, or even better a plain loop. – Bergi Feb 11 '18 at 17:09
  • That should be `.forEach()`, not `.map()`, and opening/closing the file for each line is probably not the best idea, performance-wise. – Pointy Feb 11 '18 at 17:09
  • You can use a stream and then the stream will make sure your writes are sequenced properly. You will, of course, have to pay attention to flow control if the stream says it is full. Or you can properly sequence your async writes one after the other. Or you can build all the text together you want to write and do one write at the end, which is probably the easiest way to do this (see the stream sketch after these comments). – jfriend00 Feb 11 '18 at 17:09
  • You are correct to worry. This is not good code as there is no guarantee of write order. – jfriend00 Feb 11 '18 at 17:10
  • Have a look at https://stackoverflow.com/questions/40292837/can-multiple-fs-write-to-append-to-the-same-file-guarantee-the-order-of-executio – Bergi Feb 11 '18 at 17:14
  • possible duplicate of [What are the atomicity guarantees of `fs.appendFile()`?](https://stackoverflow.com/q/44464284/1048572) – Bergi Feb 11 '18 at 17:14
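
A minimal sketch of the stream-based approach mentioned in the comments above, assuming Node's built-in fs module and the entriesArray and outputFileName variables from the question. write() returning false signals a full internal buffer, and the 'drain' event signals that it is safe to resume:

const fs = require('fs');

const out = fs.createWriteStream(outputFileName + '.csv');

const writeAll = (entries) => {
  let i = 0;
  const writeNext = () => {
    while (i < entries.length) {
      const entry = entries[i++];
      const line = entry.id + ',' + entry.fname + ',' + entry.lname + ',' +
        entry.address + ',' + entry.age + ',' + entry.sex + '\n';
      // write() returns false when the internal buffer is full;
      // stop and resume when the stream emits 'drain'.
      if (!out.write(line)) {
        out.once('drain', writeNext);
        return;
      }
    }
    out.end(); // all lines queued; close the stream
  };
  writeNext();
};

writeAll(entriesArray);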

1 Answer


You are correct to realize that this is not a good way to code, as there are no guarantees of order with asynchronous writing (particularly if the writes are large and may take more than one actual write operation to disk). And remember that fs.appendFile() actually consists of three asynchronous operations: fs.open(), fs.write(), and fs.close(). So, as you have seen, this opens a lot of file handles all at once as it tries to do every single write in parallel. None of that is necessary.

I'd suggest you build the text you want to write as a single string and do one write at the end, as there appears to be no reason to write each entry separately. This will also be a lot more efficient:

writeToFile(entriesArray.map((entry) => {
    return entry.id + ',' + entry.fname + ',' + entry.lname + ',' +
        entry.address + ',' + entry.age + ',' + entry.sex + '\n';
}).join(""));

Let's say you had 1000 items in your entriesArray. Your scheme was doing 3000 disk operations: an open, a write, and a close for every single entry. My suggested code does 3 disk operations total. This should be significantly faster and have a guaranteed write order.


Also, you really need to think about proper error handling. Using something like:

if (err) throw err;

inside an async callback is NOT proper error handling. That throws into an async event, which you have no ability to ever handle. Here's one scheme:

const writeToFile = (str, fn) => {
  fs.appendFile(outputFileName + '.csv', str, (err) => {
    // report success or failure back to the caller's callback
    fn(err);
  });
};

writeToFile(entriesArray.map((entry) => {
    return entry.id + ',' + entry.fname + ',' + entry.lname + ',' +
        entry.address + ',' + entry.age + ',' + entry.sex + '\n';
}).join(""), function(err) {
    if (err) {
       // error here
    } else {
       // success here
    }
});
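
If you'd rather work with promises than callbacks, the same one-write idea can be expressed with the promise-based fs API (a sketch, assuming a Node version that ships fs.promises):

const fs = require('fs');

const writeToFile = (str) => {
  // fs.promises.appendFile resolves on success and rejects on error
  return fs.promises.appendFile(outputFileName + '.csv', str);
};

writeToFile(entriesArray.map((entry) => {
  return entry.id + ',' + entry.fname + ',' + entry.lname + ',' +
    entry.address + ',' + entry.age + ',' + entry.sex + '\n';
}).join('')).then(() => {
  // success here
}).catch((err) => {
  // error here
});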
jfriend00
  • Thank you for the response. I do have a lot of objects in the array; could appending them all together and doing a single write cause any performance issues? – Priyath Gregory Feb 11 '18 at 17:15
  • @fsociety - You're just creating an array of strings. Unless it's so large that you have memory issues, it should perform lots better to do only one disk write than to do thousands of disk writes. And, if it was that large, then just run this several times on segments of the `entriesArray` - waiting for the prior one to finish before doing the next one (a sketch of this chunked approach follows). You just don't want to be writing every single item independently. – jfriend00 Feb 11 '18 at 17:17
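
A sketch of the chunked approach described in that last comment, using async/await so each segment's write finishes before the next one starts. The chunk size is an arbitrary illustration, and fs.promises is assumed to be available:

const fs = require('fs');

const CHUNK_SIZE = 10000; // arbitrary; tune to your memory budget

async function writeInChunks(entries) {
  for (let i = 0; i < entries.length; i += CHUNK_SIZE) {
    // Build the CSV text for this segment only.
    const chunk = entries.slice(i, i + CHUNK_SIZE).map((entry) => {
      return entry.id + ',' + entry.fname + ',' + entry.lname + ',' +
        entry.address + ',' + entry.age + ',' + entry.sex + '\n';
    }).join('');
    // Wait for this segment to be appended before building the next one.
    await fs.promises.appendFile(outputFileName + '.csv', chunk);
  }
}

writeInChunks(entriesArray).catch((err) => {
  // handle error here
});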