I would like to process a CSV file with the following structure:

header1,header2
val1.1, val1.2
val2.1, val2.2

But only if the first line contains both header names; otherwise an exception should be thrown.

My current implementation using Apache Commons CSV is:

Reader reader = new InputStreamReader(new ByteArrayInputStream(file.getContent()));

CSVParser csvParser = new CSVParser(reader, CSVFormat.DEFAULT
            .withHeader("header1", "header2")
            .withSkipHeaderRecord());

for (CSVRecord csvRecord : csvParser) { /* records processing */ }

The problem is that the file is still processed even when the first line contains values other than the expected header names.

samabcde
mar3g
  • Refer: https://commons.apache.org/proper/commons-csv/apidocs/org/apache/commons/csv/CSVFormat.html#withSkipHeaderRecord-- , you are skipping the header records anyway. – Ironluca Jan 15 '21 at 08:22
  • The thing is I don't want to process the headers as record, but I want to make sure they have the values I want. – mar3g Jan 15 '21 at 08:40

1 Answer

Referring to the Javadoc of CSVFormat:

Referencing columns safely

If your source contains a header record, you can simplify your code and safely reference columns, by using withHeader(String...) with no arguments:

 CSVFormat.EXCEL.withHeader();

This causes the parser to read the first record and use its values as column names. Then, call one of the CSVRecord get method that takes a String column name argument:

 String value = record.get("Col1");

This makes your code impervious to changes in column order in the CSV file.


So you can follow this approach: let the parser read the first line as the header, then validate the parsed names with CSVParser#getHeaderNames.

The following is a simple demonstration:

import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

public class UseFirstRowAsHeader {
    public static void main(String[] args) throws IOException {
        String validHeaderCsv = "header1,header2\r\n"
                + "val1.1,val1.2\r\n"
                + "val2.1,val2.2";
        parseWithHeaderValidation(validHeaderCsv);
        String invalidHeaderCsv = "header1,header2,header3\r\n"
                + "val1.1,val1.2\r\n"
                + "val2.1,val2.2";
        // The extra "header3" column makes this call throw IllegalStateException.
        parseWithHeaderValidation(invalidHeaderCsv);
    }

    private static void parseWithHeaderValidation(String csv) throws IOException {
        Reader reader = new StringReader(csv);
        List<String> expectedHeaders = new ArrayList<String>();
        expectedHeaders.add("header1");
        expectedHeaders.add("header2");
        // withHeader() with no arguments takes the column names from the first record.
        try (CSVParser csvParser = new CSVParser(reader, CSVFormat.DEFAULT
                .withHeader().withAllowMissingColumnNames(false)
                .withSkipHeaderRecord())) {
            // Fail fast if the parsed header differs from the expected names (order-sensitive).
            if (!csvParser.getHeaderNames().equals(expectedHeaders)) {
                throw new IllegalStateException("Unexpected headers: " + csvParser.getHeaderNames());
            }

            for (CSVRecord csvRecord : csvParser) {
                System.out.println(csvRecord.get("header1") + "," + csvRecord.get("header2"));
            }
        }
    }
}
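
The equality check above is order-sensitive and rejects extra columns. If the requirement is only that the first line contains both header names, the check could be relaxed to a containment test. The following is a minimal sketch using only the standard library; HeaderCheck and requireHeaders are hypothetical names, not part of Commons CSV, and the sketch assumes you pass in the list returned by csvParser.getHeaderNames().

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class HeaderCheck {
    // Hypothetical helper: throws if any expected column name is absent from the actual header.
    // Column order and additional columns are ignored.
    static void requireHeaders(List<String> actual, String... expected) {
        Set<String> missing = new LinkedHashSet<>(Arrays.asList(expected));
        missing.removeAll(actual);
        if (!missing.isEmpty()) {
            throw new IllegalStateException("Missing headers " + missing + " in " + actual);
        }
    }

    public static void main(String[] args) {
        // In the parser code, the first argument would be csvParser.getHeaderNames().
        requireHeaders(Arrays.asList("header2", "header1"), "header1", "header2"); // passes

        try {
            // A first line holding data values instead of header names is rejected.
            requireHeaders(Arrays.asList("val1.1", "val1.2"), "header1", "header2");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Unlike the equals comparison, this variant accepts a file whose header is "header1,header2,header3", so choose whichever check matches how strict the validation needs to be.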
samabcde