I am validating a CSV file with content like:
TEST;F;12345;0X4321 - 1234 DUMMYTEXT;0X4321 - 1234 TESTTEXT
Until now, the values were separated by ';' and the method worked like a charm:
    private static final String COLUMN_SEPARATOR = ";";

    // readLine() throws the checked IOException, so it has to be declared
    public void validateFile(BufferedReader reader) throws IOException {
        String line = reader.readLine();
        while (line != null && result == ValidationResult.VALID) {
            // this is broken with tab-stop as COLUMN_SEPARATOR
            int matches = StringUtils.countMatches(line, COLUMN_SEPARATOR);
            if (matches != getCSVColumnCount() - 1
                    && StringUtils.isNotBlank(line)) {
                if (matches == 0) {
                    // MISSING_CSV_COLUMN_SEPERATOR;
                } else {
                    // UNEXPECTED_CSV_COLUMN_COUNT;
                }
            }
            line = reader.readLine();
        }
    }
The requirement has changed: I now have to handle tabs as the column separator, while the text itself can contain spaces:
TEST F 12345 0x4321 - 1234 DUMMYTEXT 0x4321 - 1234 TESTTEXT
I changed the following line:
private static final String COLUMN_SEPARATOR = "\\t";
Problem: StringUtils.countMatches(line, "\\t") cannot find any occurrences (it returns 0). I don't want to do:
int matches = line.split("\\t").length;
because I suspect it would be a significant performance hit (the CSV files aren't small). Is there a better way to do this?
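
For illustration, here is a minimal sketch of counting tabs without a regex split. It rests on two points: in a Java string literal, "\\t" is a backslash followed by the letter t (two characters), whereas "\t" is the single tab character; and, as far as I know, StringUtils.countMatches looks for a literal substring rather than a regular expression, which would explain the result of 0. The class and method names (SeparatorCounter, countChar) are hypothetical and only stand in for whatever the validator would actually use.

```java
// Hypothetical helper for illustration; not part of the original validator.
final class SeparatorCounter {

    private SeparatorCounter() {
        // utility class, no instances
    }

    // Counts how often the single character `separator` occurs in `line`.
    // One pass over the string, no regex and no intermediate array.
    static int countChar(String line, char separator) {
        int count = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == separator) {
                count++;
            }
        }
        return count;
    }
}
```

With this, the check in the loop could read SeparatorCounter.countChar(line, '\t'), using the char literal '\t' (a real tab). Spaces inside a column are not counted, since only the tab character is compared.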