I'm trying to read a CSV file with node.js using the csv-parser library.

Since it's a big file, I need to check the header and the first 100 rows, then stop the method and return true if everything is OK, or false if the data doesn't satisfy the condition.

How can I achieve this?

This is what I have so far:

const csv = require('csv-parser');
const fs = require('fs');    
exports.checkFileFormat = (file) => {
  let stream = fs.createReadStream(file.tempFilePath)
    .pipe(csv())
    .on('headers', (headers) => {
      /*...some logic...*/
    })
    .on('data', (row) => {
      if (!typeof (row["USAGE"]) == 'number'
          || !moment(row["START_DATE"], 'YYYYMMDD', true).isValid()
          || !moment(row["END_DATE"], 'YYYYMMDD', true).isValid()) {
        stream.unpipe(csv());
        return false;
      }       
    })
    .on('end', () => {
       console.log('CSV file successfully processed');
    });
    return true;
}

In a previous version I had also declared `var num = 100` and tested it inside `.on('data', (row) => {...})`, but it didn't work.

F. Müller
llandino
  • Make the function `checkFileFormat` return a promise. Inside the promise, use `resolve(false)` instead of `return false`, and `resolve(true)` in the `.on('end')` callback. I'm not completely sure this will work, but that's how I would approach it – TKoL Dec 07 '20 at 15:35
  • I've personally tested my answer and it works as expected. The only problem was in the `stream.close()` function, which apparently doesn't exist – TKoL Dec 07 '20 at 15:44

2 Answers

Following up from my comment:

Make the function checkFileFormat return a promise. Inside the promise, use resolve(false) instead of return false, and resolve(true) in the .on('end') callback. I'm not completely sure this will work, but that's how I would approach it.

const csv = require('csv-parser');
const fs = require('fs');
const moment = require('moment'); // needed for the date checks below

exports.checkFileFormat = (file) => {
    return new Promise((resolve, reject) => {
        let stream = fs.createReadStream(file.tempFilePath)
            .pipe(csv())
            .on('error', reject)
            .on('headers', (headers) => {
                /*...some logic...*/
            })
            .on('data', (row) => {
                // csv-parser yields every value as a string, so test whether
                // USAGE parses as a number instead of using typeof.
                if (Number.isNaN(parseFloat(row["USAGE"]))
                    || !moment(row["START_DATE"], 'YYYYMMDD', true).isValid()
                    || !moment(row["END_DATE"], 'YYYYMMDD', true).isValid()) {
                    stream.end(); // stream.unpipe(csv());
                    resolve(false);
                }
            })
            .on('end', () => {
                console.log('CSV file successfully processed');
                // No effect if resolve(false) has already been called.
                resolve(true);
            });
    });
}
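
A side note on the row check itself: csv-parser delivers field values as strings by default, and in the question's condition `!typeof x == 'number'` the `!typeof x` part is evaluated first and yields `false`, so that comparison can never be true. Below is a minimal sketch of a standalone row check that does not depend on moment. The helper names `isNumeric`, `isYyyymmdd` and `isValidRow` are hypothetical, and the column names are taken from the question.

```javascript
// Hypothetical helpers, not part of csv-parser or moment.

// True if the value is a non-empty string that converts to a finite number.
function isNumeric(value) {
  const s = String(value ?? '').trim();
  return s !== '' && Number.isFinite(Number(s));
}

// True if the value is exactly 8 digits forming a real calendar date (YYYYMMDD).
// Assumption: years below 0100 are rejected, which is fine for usage data.
function isYyyymmdd(value) {
  const s = String(value ?? '');
  if (!/^\d{8}$/.test(s)) return false;
  const year = Number(s.slice(0, 4));
  const month = Number(s.slice(4, 6));
  const day = Number(s.slice(6, 8));
  // Build the date in UTC and check that nothing rolled over (e.g. Feb 30).
  const date = new Date(Date.UTC(year, month - 1, day));
  return date.getUTCFullYear() === year
    && date.getUTCMonth() === month - 1
    && date.getUTCDate() === day;
}

// Row-level check using the column names from the question.
function isValidRow(row) {
  return isNumeric(row.USAGE)
    && isYyyymmdd(row.START_DATE)
    && isYyyymmdd(row.END_DATE);
}
```

Keeping the check in a pure function like this makes it easy to test without touching any stream.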
TKoL
  • Out of curiosity, does this really work? You're unpiping a new `csv` stream from `stream` when the condition is reached. – eol Dec 07 '20 at 15:57
  • Well, I'm not sure about the `unpipe` part; that was included in the OP's original code, so I assumed he had figured that part out himself. Everything else about this works, namely the `Promise` mechanisms. I've used this code in a real example and it spits out the correct results. But no, I don't know if `unpipe(csv())` is the right thing to do there to close the stream. – TKoL Dec 07 '20 at 16:00
  • Thank you for your hint. I'm trying to combine the Promise with @eol's code. Indeed, `unpipe(csv())` doesn't work as expected: when I upload a second file right after the first one, I get `[ERR_STREAM_WRITE_AFTER_END]: write after end`. – llandino Dec 07 '20 at 16:55
  • @llandino For the record, rather than `stream.unpipe(csv())`, I think @eol is right that `stream.end()` is preferable. Other than that, this answer is correct. I'll edit it to include `stream.end()` – TKoL Dec 07 '20 at 17:01

If you want to read a certain number of lines and then stop, you can try the following:

const csv = require('csv-parser');
const fs = require('fs');
let count = 0;
let maxLines = 3;
let fsStream = fs.createReadStream('./data.csv');
let csvStream = csv();

fsStream.pipe(csvStream)
    .on('headers', (headers) => {
        console.log(headers)
    })
    .on('data', (data) => {
        if (count >= maxLines) {
            fsStream.unpipe(csvStream);
            csvStream.end();
            fsStream.destroy();             
        } else {
            console.log(data);
            count++;
        }
    });

Basically, you count each row as it is read, and once the maximum is reached you unpipe the csv stream from the fs stream, then end the csv stream, and finally destroy the fs stream.
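
To get a result back out of the function, the counting approach can be combined with a promise. The following is only a sketch under a few assumptions: `checkFirstRows` is a hypothetical name, `source` is any readable stream (for example the one from `fs.createReadStream`), `parser` is a stream that emits one row object per `'data'` event (for example `csv()`), and `isValidRow` is a caller-supplied predicate.

```javascript
// Hypothetical helper: resolves false on the first invalid row, true once
// maxRows rows have passed (or the input ends earlier), then stops reading.
function checkFirstRows(source, parser, maxRows, isValidRow) {
  return new Promise((resolve, reject) => {
    let count = 0;
    let settled = false;

    // Tear both streams down once and report the result.
    const finish = (result) => {
      if (settled) return;
      settled = true;
      source.unpipe(parser);
      source.destroy(); // stop reading the input
      parser.destroy(); // discard anything still buffered in the parser
      resolve(result);
    };

    // Rejections after the promise has settled are ignored.
    source.on('error', reject);
    parser.on('error', reject);

    source.pipe(parser)
      .on('data', (row) => {
        if (settled) return;
        if (!isValidRow(row)) {
          finish(false);
          return;
        }
        count += 1;
        if (count >= maxRows) finish(true);
      })
      .on('end', () => finish(true)); // input had fewer than maxRows rows
  });
}
```

With csv-parser, a call might look like `checkFirstRows(fs.createReadStream(file.tempFilePath), csv(), 100, isValidRow)`, awaited inside `checkFileFormat`. Because a fresh `csv()` instance is created per call, a second upload does not reuse a stream that has already been ended.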

eol
  • but then he needs to get the result out of his `exports.checkFileFormat` function – TKoL Dec 07 '20 at 16:06
  • Thank you @eol! It didn't solve my issue, but it's more correct than my first attempt with a check inside `.on('data', (data) => {...})` – llandino Dec 07 '20 at 17:01