
I am using the CSV package for Node to parse some CSV files in a project. I need to be able to handle cases with and without a header. So either of:

const withHeader = `h1,h2,h3
d1a,d2a,d3a
d1b,d2b,d3b`;

const withoutHeader = `d1a,d2a,d3a
d1b,d2b,d3b`;

The number of columns and their names are unknown to my application. Either they will be read from the header, or they should be numerically generated, e.g. col0,col1,col2.
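To make the naming scheme concrete, here is a minimal sketch of how the generated names could be built once the column count is known. `generateColumnNames` and its `prefix` parameter are hypothetical names of my own, not something the CSV package provides.

```javascript
// Hypothetical helper (not part of the CSV package): build default
// column names such as col0, col1, col2 for a given column count.
function generateColumnNames(count, prefix = 'col') {
  return Array.from({ length: count }, (_, index) => `${prefix}${index}`);
}

console.log(generateColumnNames(3)); // [ 'col0', 'col1', 'col2' ]
```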

This is where I run into a problem. I always want the output of csvParse to be in object literal form. This is easy when the end-user has indicated that the CSV has a header:

> csvParse(withHeader, {columns: true})
[
  { h1: 'd1a', h2: 'd2a', h3: 'd3a' },
  { h1: 'd1b', h2: 'd2b', h3: 'd3b' }
]

But when the user indicates that there is no header row, it doesn't seem to be possible to end up with the data in object literal form with programmatically generated column headers.

The 3 options for columns are boolean | array | function.

  • By supplying false, the data returned is an array of arrays, which I would then need to transform into object literal form. Not ideal!
  • To supply an array of column names, I would already need to know how many columns there are... before it is parsed, which doesn't make sense. I could parse the first row to get the count, then start again supplying the array, but this seems clumsy.
  • I can supply a function which programmatically generates the column keys, e.g. column => column. This doesn't help because a) no index is supplied, and b) the first line is still consumed, as it is assumed to hold the column headers that the function transforms into the desired column keys.
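For reference, the transformation mentioned in the first bullet can cover both the header and no-header cases from a single array-of-arrays parse. This is only a sketch under assumptions: `rowsToRecords` is a hypothetical helper, the sample rows are typed in by hand rather than produced by csvParse, and every row is assumed to have the same number of cells as the first one.

```javascript
// Hypothetical helper; `rows` is an array of arrays, i.e. the shape
// returned when the columns option is not set.
// Object.fromEntries requires Node 12 or later.
function rowsToRecords(rows, hasHeader) {
  if (rows.length === 0) return [];
  // Take names from the first row, or generate col0, col1, ... from its width.
  const names = hasHeader
    ? rows[0]
    : rows[0].map((_, index) => `col${index}`);
  const dataRows = hasHeader ? rows.slice(1) : rows;
  return dataRows.map((row) =>
    Object.fromEntries(names.map((name, index) => [name, row[index]]))
  );
}

// Hand-written stand-in for parsed output, not real csvParse results.
const parsedWithHeader = [
  ['h1', 'h2', 'h3'],
  ['d1a', 'd2a', 'd3a'],
  ['d1b', 'd2b', 'd3b'],
];
console.log(rowsToRecords(parsedWithHeader, true));
console.log(rowsToRecords(parsedWithHeader.slice(1), false));
```

With ragged input, cells beyond the first row's width would be dropped and missing cells would come out as undefined, so real use would need extra handling.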

Is there a trick to doing this that I've missed? Here are the two ways I've found, both of which seem clumsier and less efficient than necessary.

Parse 1 row, then parse all

// Obviously in actual use I'd handle edge cases
const colCount = csvParse(withoutHeader, {to_line: 1})[0].length;
// 3
const data = csvParse(withoutHeader, {columns: [...Array(colCount).keys()].map(i => `col${i}`)});
/*
[
  { col0: 'd1a', col1: 'd2a', col2: 'd3a' },
  { col0: 'd1b', col1: 'd2b', col2: 'd3b' }
]
*/

Parse into array of arrays, then convert

csvParse(withoutHeader).map(
  row => row.reduce(
    (obj, item, index) => {
      obj[`col${index}`] = item;
      return obj;
    },
    {}
  )
)
/*
[
  { col0: 'd1a', col1: 'd2a', col2: 'd3a' },
  { col0: 'd1b', col1: 'd2b', col2: 'd3b' }
]
*/

To me, it would be ideal to be able to specify columns as a function that receives the column index as an argument instead of a header row.
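A minimal sketch of the behaviour I have in mind, written as a wrapper outside the library. `rowsToObjects` and `nameColumn` are hypothetical names, not an existing option of the CSV package, and the sample rows stand in for array-of-arrays parser output.

```javascript
// Hypothetical wrapper: `nameColumn` is called with each column index
// and returns the key to use for that column.
function rowsToObjects(rows, nameColumn = (index) => `col${index}`) {
  return rows.map((row) =>
    Object.fromEntries(row.map((value, index) => [nameColumn(index), value]))
  );
}

// Hand-written stand-in for parsed output, not real csvParse results.
const sampleRows = [
  ['d1a', 'd2a', 'd3a'],
  ['d1b', 'd2b', 'd3b'],
];
console.log(rowsToObjects(sampleRows));
console.log(rowsToObjects(sampleRows, (index) => `field_${index + 1}`));
```

Because the naming function is applied per row, rows of different widths would each get keys for exactly the cells they contain.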

dpwr
  • csvParse is more elegant in providing a nested array. It's better and more optimised for further formatting than an array of JSON. So I'd advise extracting what you want from the nested array straight into your table. Also, the first line of a CSV is always a header. For example, column 3 name = result[0][2] – Steve Tomlin Dec 22 '20 at 22:21
  • @SteveTomlin I see what you're saying, but for various reasons, I do need an array of objects. It's not just for table display purposes. What do you mean by "the first line of csv is always a header"? – dpwr Dec 22 '20 at 22:38
  • I think the important question is what you mean by a table that has no headers. Even if it is a continuation of another table, it still needs context, hence in every CSV file, and any other table file format, the first row is always the header. – Steve Tomlin Dec 23 '20 at 07:42
  • @SteveTomlin What I mean is, plenty of CSV files have no header in them, perhaps because whoever wrote them didn't consider it necessary. Possibly unwise, but it is not required in the format, which is why all CSV libraries support header parsing or treating the zeroth row as data. – dpwr Dec 23 '20 at 08:58
  • https://en.wikipedia.org/wiki/Comma-separated_values "An optional header record (there is no sure way to detect whether it is present, so care is required when importing)." This is why it is bad practice to have exported data where the first row is not the header. Otherwise it has no context. – Steve Tomlin Dec 23 '20 at 09:42
  • That is one interpretation, but even if it is bad practice, I am not in control of writing all CSV files, thus my app must be able to cope with CSVs without headers... – dpwr Dec 23 '20 at 09:46

0 Answers