When I read a CSV file using opencsv, it doesn't parse properly when a string ends with a '\'. The parser treats the '\' as escaping the closing ", so the " becomes part of the string, instead of keeping the '\' as a literal character as I want. I guess there must be some way to add another '\' so that it escapes the '\' character instead, without having to manually edit the CSV file? I have searched but not found anything.

To clarify my problem, it looks like this:

csv-file

"A",       "B",        "C",       "D"
"value 1", "value 2",  "value 3", "value 4"
"value 5", "value 6\", "value 7", "value 8"

My code looks like this (not really, but it shows my problem):

String inFile = "in.csv";
CSVReader reader = new CSVReader(new FileReader(inFile));
String[] line;

while ((line = reader.readNext()) != null) {
    for (int i = 0; i < line.length; i++) {
        System.out.println((i + 1) + " " + line[i]);
    }
}

I want each row to parse into a String[] with 4 elements, but the last row parses into only two elements, as shown in the output below.

1 A
2 B
3 C
4 D
1 value 1
2 value 2
3 value 3
4 value 4
1 value 5
2 value 6",value 7,value 8

I have tried to change the reader to:

CSVReader reader = new CSVReader(new InputStreamReader(new FileInputStream(inFile), "UTF-8"));

but without any luck.

Christoffer Karlsson

3 Answers

Maybe change the escape character in the constructor of the Reader?

CSVReader reader = new CSVReader(new InputStreamReader(new FileInputStream(inFile)), ',', '"', '|');

This assumes that | is not used anywhere in your CSV file, since it becomes the escape character in place of \.

user000001

A cleaner, recommended solution is to use RFC4180Parser instead of the default CSVParser. RFC 4180 escapes a quote by doubling it, so a backslash is treated as an ordinary character:

String csv = "some,csv,string";
RFC4180Parser rfc4180Parser = new RFC4180ParserBuilder().build();
CSVReader csvReader = new CSVReaderBuilder(new StringReader(csv)).withCSVParser(rfc4180Parser).build(); 

Reference: https://sourceforge.net/p/opencsv/support-requests/50/
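
To see why an RFC 4180 parser copes with the row from the question, here is a minimal plain-Java sketch of the quoting rule. It is not opencsv's implementation: the class name Rfc4180Sketch and the method splitLine are hypothetical, and the sketch assumes there are no spaces after the separators and no line breaks inside fields.

```java
import java.util.ArrayList;
import java.util.List;

class Rfc4180Sketch {
    // Splits one CSV line using RFC 4180-style quoting: inside a quoted
    // field a quote is written as "", and '\' has no special meaning.
    // (Hypothetical helper for illustration, not part of opencsv.)
    static List<String> splitLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        field.append('"'); // doubled quote is a literal quote
                        i++;               // skip the second quote
                    } else {
                        inQuotes = false;  // closing quote
                    }
                } else {
                    field.append(c);       // any other char, including '\'
                }
            } else if (c == '"') {
                inQuotes = true;           // opening quote
            } else if (c == ',') {
                fields.add(field.toString());
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        fields.add(field.toString());      // last field on the line
        return fields;
    }

    public static void main(String[] args) {
        // The problematic row from the question, without spaces after commas.
        String row = "\"value 5\",\"value 6\\\",\"value 7\",\"value 8\"";
        List<String> fields = splitLine(row);
        for (int i = 0; i < fields.size(); i++) {
            System.out.println((i + 1) + " " + fields.get(i));
        }
    }
}
```

Because the backslash is copied through like any other character, the quote after it still closes the field, and the row splits into four values with the second one ending in \.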

vatsal mevada

The backslash is there to escape the " character: some values may contain a ", and without an escape character you could not include it in a quoted value.

So if you want a literal \ in a value, you need to escape it with another \, just as you would in a regular Java String.

"A",       "B",         "C",       "D"
"value 1", "value 2",   "value 3", "value 4"
"value 5", "value 6\\", "value 7", "value 8"

Either modify your CSV file, or use another CSVReader constructor that lets you choose the escape character.
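
If the file itself cannot be edited, the same doubling could in principle be applied in memory before the text reaches the parser. The following is only a rough sketch: BackslashDoubler, doubleBackslashes and escaped are hypothetical helper names, and it assumes that every backslash in the input is meant literally (no intentional \" escapes) and that the file fits in memory.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.stream.Collectors;

class BackslashDoubler {
    // Doubles every backslash so that a parser using '\' as its escape
    // character reads each one as a literal backslash.
    // Assumption: the input contains no intentional escapes such as \".
    static String doubleBackslashes(String text) {
        return text.replace("\\", "\\\\");
    }

    // Reads all of 'in' into memory, escapes it line by line, and returns
    // a Reader over the escaped text. The source Reader is closed.
    static Reader escaped(Reader in) throws IOException {
        try (BufferedReader br = new BufferedReader(in)) {
            String text = br.lines()
                    .map(BackslashDoubler::doubleBackslashes)
                    .collect(Collectors.joining("\n"));
            return new StringReader(text);
        }
    }

    public static void main(String[] args) {
        String row = "\"value 5\", \"value 6\\\", \"value 7\", \"value 8\"";
        System.out.println(doubleBackslashes(row));
        // prints: "value 5", "value 6\\", "value 7", "value 8"
    }
}
```

The Reader returned by escaped(new FileReader(inFile)) could then be passed to the CSVReader constructor from the question in place of the FileReader, leaving the file on disk untouched.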

Alex
  • User explicitly states they cannot modify the csv. Imagine this is dirty data coming in from an outside source. –  Jul 26 '16 at 23:16