
I am trying to parse a CSV file that uses a single quote as the text qualifier. The problem is that some quoted values themselves contain a single quote, e.g.:

'Fri, 24 Feb 2017 17:44:57 +0700','th01ham000tthxs','/','','Writer's Tools Data','7.1.0.0',

I am struggling to parse the file because, after this row, all of the remaining rows are displaced.

I tried OpenCSV and univocity-parsers but had no luck. If I open the above row in Excel and set the text qualifier to a single quote, it gives the correct result without displacing any rows.

2 Answers


If you are using Java, the JRecord library should handle the file.

How it works: if a field starts with a quote (e.g. ,'), the parser specifically looks for an odd number of quotes (', ''', ''''' and so on) followed by either a comma or an end-of-line marker, and treats that as the end of the field. This approach breaks down if:

  • The embedded quote is the last character in a field, e.g. 'Field with quote '',
  • There is white space between the quote and the comma, e.g. 'Field' , or , '
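The rule described above can be sketched in plain Java. This is only an illustration of the odd-number-of-quotes heuristic, not JRecord's actual code; the class name LenientQuoteSplitter and its methods are hypothetical, and it assumes one record per line with a single-character delimiter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the heuristic; not part of JRecord.
final class LenientQuoteSplitter {

    /** Splits one line, tolerating unescaped quotes inside quoted fields. */
    static List<String> split(String line, char quote, char delimiter) {
        List<String> fields = new ArrayList<>();
        int n = line.length();
        int pos = 0;
        while (pos <= n) {
            if (pos < n && line.charAt(pos) == quote) {
                int close = findClosingQuote(line, pos + 1, quote, delimiter);
                if (close < 0) {
                    // No closing quote on this line: keep the remainder as one field.
                    fields.add(unescape(line.substring(pos + 1), quote));
                    break;
                }
                fields.add(unescape(line.substring(pos + 1, close), quote));
                pos = close + 2; // skip the closing quote and the delimiter after it
            } else {
                // Unquoted field: read up to the next delimiter or end of line.
                int next = line.indexOf(delimiter, pos);
                if (next < 0) {
                    next = n;
                }
                fields.add(line.substring(pos, next));
                pos = next + 1;
            }
        }
        return fields;
    }

    /**
     * Returns the index of the closing quote: the last quote of an odd-length
     * run of quotes that is followed by the delimiter or the end of the line.
     * Returns -1 if there is no such run.
     */
    private static int findClosingQuote(String line, int from, char quote, char delimiter) {
        int n = line.length();
        int i = from;
        while (i < n) {
            if (line.charAt(i) != quote) {
                i++;
                continue;
            }
            int j = i;
            while (j < n && line.charAt(j) == quote) {
                j++;
            }
            boolean oddRun = ((j - i) % 2) == 1;
            boolean atFieldEnd = (j == n) || line.charAt(j) == delimiter;
            if (oddRun && atFieldEnd) {
                return j - 1;
            }
            i = j; // embedded or doubled quotes: keep scanning
        }
        return -1;
    }

    /** Collapses doubled quotes into a single quote. */
    private static String unescape(String value, char quote) {
        String q = String.valueOf(quote);
        return value.replace(q + q, q);
    }

    public static void main(String[] args) {
        String row = "'Fri, 24 Feb 2017 17:44:57 +0700','th01ham000tthxs','/','',"
                + "'Writer's Tools Data','7.1.0.0',";
        for (String field : split(row, '\'', ',')) {
            System.out.println("[" + field + "]");
        }
    }
}
```

In this sketch the quote in Writer's is followed by a letter rather than a comma, so it is kept as part of the field. The two cases listed above still fail: an embedded quote directly before the closing quote produces an even run of quotes, and white space between the quote and the comma hides the field boundary.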

Here is the line in ReCsvEditor:

[Screenshot: ReCsvEditor]

Also, when editing the file in ReCsvEditor, selecting Generate >>> Java Code >>> ... will generate Java/JRecord code to read the file.

[Screenshot: ReCsvEditor Generate]

Disclaimer: I am the author of JRecord / ReCsvEditor. Also, the ReCsvEditor Generate function is new and needs more work.

Bruce Martin

Try configuring univocity-parsers to handle the unescaped quote according to your scenario. 'Writer's Tools Data' has an unescaped quote. From your input, I can see you want to use STOP_AT_CLOSING_QUOTE as the strategy to work around these values.

Add this line to your code and it should work fine:

parserSettings.setUnescapedQuoteHandling(UnescapedQuoteHandling.STOP_AT_CLOSING_QUOTE);
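For context, here is a minimal sketch of where that line could sit in a complete program. It assumes univocity-parsers is on the classpath and that the quote character still has to be switched to a single quote through the format settings; the class name SingleQuoteCsvExample and the parse helper are made up for illustration.

```java
import com.univocity.parsers.csv.CsvParser;
import com.univocity.parsers.csv.CsvParserSettings;
import com.univocity.parsers.csv.UnescapedQuoteHandling;

import java.util.Arrays;

// Hypothetical wrapper class; only the univocity-parsers calls come from the library.
final class SingleQuoteCsvExample {

    /** Parses a single CSV line that uses ' as the text qualifier. */
    static String[] parse(String line) {
        CsvParserSettings parserSettings = new CsvParserSettings();

        // Assumption: the file uses a single quote instead of the default double quote.
        parserSettings.getFormat().setQuote('\'');

        // Keep reading a quoted value until a closing quote is found,
        // treating unescaped quotes in between as ordinary characters.
        parserSettings.setUnescapedQuoteHandling(UnescapedQuoteHandling.STOP_AT_CLOSING_QUOTE);

        CsvParser parser = new CsvParser(parserSettings);
        return parser.parseLine(line);
    }

    public static void main(String[] args) {
        String row = "'Fri, 24 Feb 2017 17:44:57 +0700','th01ham000tthxs','/','',"
                + "'Writer's Tools Data','7.1.0.0',";
        System.out.println(Arrays.toString(parse(row)));
    }
}
```

To read a whole file rather than one line, the same settings object can be passed to a CsvParser that is then given a File or Reader.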

Hope this helps.

Jeronimo Backes