
I'm using Jackson CSV to parse a CSV file into POJOs. My issue is that if a row in the CSV has too few columns, the parser doesn't complain and just sets the rest of the fields to null.

Parsing code:

    CsvMapper csvMapper = new CsvMapper();
    csvMapper.addMixInAnnotations(Person.class, PersonCsvMixin.class);
    CsvSchema schema = csvMapper.schemaFor(Person.class).withHeader();
    MappingIterator<Person> it = csvMapper.reader(dataClass).with(schema).readValues(csv);
    LinkedList<Person> output = new LinkedList<>();

    while(it.hasNext()) {
        output.push(it.next());
    }

Mixin:

    import com.fasterxml.jackson.annotation.*;

    @JsonPropertyOrder(value = { "FirstName", "LastName", "Title" })
    public abstract class PersonCsvMixin {
        @JsonProperty("LastName")
        public abstract String getLastName();
        @JsonProperty("FirstName")
        public abstract String getFirstName();
        @JsonProperty("Title")
        public abstract String getTitle();
    }

Data class:

    public class OfficespaceInputEmployee implements Serializable {
        protected String firstName;
        protected String lastName;
        protected String title;
        // .. getters and setters
    }

If I parse a file like the following, no errors occur even though the middle record is missing two fields; instead, LastName and Title simply become null:

    "FirstName", "LastName", "Title"
    "John", "Smith", "Mr"
    "Mary"
    "Peter", "Jones", "Dr"

Is there a feature to enable that will cause this to error instead?

rewolf

3 Answers


I know this is an old thread, but since I ran into the same issue myself, let me share the solution:

    csvMapper.configure(CsvParser.Feature.FAIL_ON_MISSING_COLUMNS, true);

will do the trick.
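For completeness, a minimal runnable sketch of this feature (assuming a recent jackson-dataformat-csv version that includes `CsvParser.Feature.FAIL_ON_MISSING_COLUMNS`, and a simple `Person` POJO standing in for the question's classes):

```java
import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvParser;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;

public class FailOnMissingColumnsDemo {
    // Minimal stand-in for the question's POJO.
    public static class Person {
        public String firstName;
        public String lastName;
        public String title;
    }

    public static void main(String[] args) throws Exception {
        String csv = "firstName,lastName,title\n"
                   + "John,Smith,Mr\n"
                   + "Mary\n";                // short row: two columns missing

        CsvMapper csvMapper = new CsvMapper();
        csvMapper.enable(CsvParser.Feature.FAIL_ON_MISSING_COLUMNS);
        CsvSchema schema = csvMapper.schemaFor(Person.class).withHeader();
        MappingIterator<Person> it =
                csvMapper.readerFor(Person.class).with(schema).readValues(csv);

        it.nextValue();                       // complete row parses fine
        try {
            it.nextValue();                   // short row now throws
            System.out.println("no error");
        } catch (Exception e) {               // Jackson reports the missing columns
            System.out.println("short row rejected: " + e.getMessage());
        }
    }
}
```

Without the feature enabled, the second `nextValue()` would succeed and leave `lastName` and `title` as null, which is exactly the silent behaviour the question describes.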

Kari Sarsila
  • I've got this nice setup: myObjectReader = csvMapper .readerFor(Map.class) .with(csvSchema) .without(IGNORE_TRAILING_UNMAPPABLE) .without(ALLOW_TRAILING_COMMA) .with(FAIL_ON_MISSING_COLUMNS); – RichColours Jun 29 '20 at 16:01

You can throw an exception yourself when building the output LinkedList inside the while loop:

    while (it.hasNext()) {
        Person line = it.next();
        // call a method which checks that all values have been set
        if (thatMethodReturnsTrue) {
            output.push(line);
        } else {
            throw new SomeException();
        }
    }
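A minimal sketch of such a check in plain Java (the `isComplete` helper and `IllegalStateException` are my own illustrative choices, not part of the question's code; field names are taken from the question's data class):

```java
import java.util.ArrayList;
import java.util.List;

public class CompletenessCheckDemo {
    // Stand-in for the question's POJO.
    public static class Person {
        public String firstName;
        public String lastName;
        public String title;

        public Person(String firstName, String lastName, String title) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.title = title;
        }
    }

    // Hypothetical helper: true only if every expected field was populated.
    static boolean isComplete(Person p) {
        return p.firstName != null && p.lastName != null && p.title != null;
    }

    // Collects rows, failing fast on any row with unset fields.
    static List<Person> collect(List<Person> parsed) {
        List<Person> output = new ArrayList<>();
        for (Person line : parsed) {
            if (isComplete(line)) {
                output.add(line);
            } else {
                throw new IllegalStateException("Row is missing one or more columns");
            }
        }
        return output;
    }
}
```

The downside, as noted in the comments below, is that this runs after parsing rather than at parse time, so malformed rows are only caught once the POJO has already been built.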
Laurentiu L.
    Yes, thanks. I know I could do that, but I was hoping that the library would actually check this at parse-time, as it does when there are too many fields. – rewolf Jun 11 '15 at 08:41
  • I haven't found any option you can pass to make the parser strict. I think there may not be one – Laurentiu L. Jun 11 '15 at 08:41
    @rewolf would've been useful to have a CsvParser.Feature that you can enable on the CsvMapper. You can add a suggestion on github – Laurentiu L. Jun 11 '15 at 08:45
    The CsvMapper's base class ObjectMapper has a configure method which has some useful configurations but i don't think it applies to the csv parser. It would be useful to have for csv parsing a feature similar to this one for deserializaiton: csvMapper.configure(DeserializationFeature.ACCEPT_EMPTY_STRING_AS_NULL_OBJECT, false) – Laurentiu L. Jun 11 '15 at 08:52
    There isn't currently a way to do this in the library, so I will add the check when iterating, and mark this as the answer. Thanks :) – rewolf Jun 12 '15 at 04:21

I would suggest filing an RFE on the issue tracker for something like CsvParser.Feature.REQUIRE_ALL_COLUMNS: if enabled, the parser would throw an exception to indicate that one or more of the expected columns are missing. This sounds like a useful addition to me.

StaxMan