I'm getting the following error while reading a CSV file from HDFS:
java.lang.RuntimeException: java.io.IOException: (startline 1) EOF reached before encapsulated token finished
When I looked into the CSV file, I found that a CRLF (newline) inside a column is causing this.
How can I tackle this?

I'm using commons-csv-1.4

insomniac

1 Answer

You can simply use the dos2unix command on the file, or replace the line endings in code with something like:

String withoutCRLF = withCRLF.replaceAll("\r\n", "\n");
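The replacement above only helps if the whole content is read at once rather than line by line. A minimal sketch of that idea, assuming the file is small enough to fit in memory (the class name `LineEndingNormalizer` and the sample data are hypothetical, not from the question):

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class LineEndingNormalizer {
    // Reads everything from the reader in one go (no readLine),
    // then converts every CRLF pair to a single LF.
    static String normalize(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        // String.replace does a literal replacement, so no regex is involved.
        return sb.toString().replace("\r\n", "\n");
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample: the second record has a line break inside a quoted column.
        String withCRLF = "id,comment\r\n1,\"first line\r\nsecond line\"\r\n";
        System.out.print(normalize(new StringReader(withCRLF)));
    }
}
```

Note that the line break inside the quoted column is still there afterwards, only as LF instead of CRLF, so the normalized text should be handed to the CSV parser as a whole and not split into lines first.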

Alex Baranowski
  • Yes, this could have been the solution, but my BufferedReader's `br.readLine();` breaks the content when it encounters a CRLF – insomniac Jan 06 '17 at 12:07
  • Are you using BufferedReader (FileReader) to read the content of the file? What system are you using? I'm using Scientific Linux (RHEL) 7.2 with Java 1.8, and I'm able to read the file with BufferedReader no matter whether it's CRLF (unix2dos) or normal (dos2unix). – Alex Baranowski Jan 06 '17 at 12:35
  • Here is the [code](http://pastebin.com/fAtT9j4H) that I used in both cases; it works like a charm. – Alex Baranowski Jan 06 '17 at 12:39
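
The `readLine()` behaviour described in the comments is a likely cause of the error: if each physical line is parsed on its own, a quoted column that contains a line break is cut in half, and the first half ends before its closing quote. One possible workaround, sketched here with the JDK only, is to keep joining physical lines until the double quotes are balanced. The class name `LogicalCsvLines` and the sample data are hypothetical, and the sketch assumes fields are quoted with `"` and that embedded quotes are escaped by doubling them.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class LogicalCsvLines {
    // Joins physical lines until the number of double quotes seen is even,
    // i.e. until every quoted field opened so far has been closed.
    static List<String> readRecords(BufferedReader br) throws IOException {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean insideQuotes = false;
        String line;
        while ((line = br.readLine()) != null) {
            if (current.length() > 0) {
                current.append('\n'); // restore the break that readLine() removed
            }
            current.append(line);
            for (int i = 0; i < line.length(); i++) {
                if (line.charAt(i) == '"') {
                    insideQuotes = !insideQuotes; // a doubled "" toggles twice
                }
            }
            if (!insideQuotes) {
                records.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            records.add(current.toString()); // quoted field left open at EOF
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample with a CRLF inside the quoted second column.
        String csv = "id,comment\r\n1,\"first line\r\nsecond line\"\r\n2,ok\r\n";
        for (String record : readRecords(new BufferedReader(new StringReader(csv)))) {
            System.out.println("[" + record + "]");
        }
    }
}
```

Each returned string is one complete CSV record and can then be parsed individually. Alternatively, it may be enough to pass the whole `Reader` to commons-csv (for example via `CSVFormat.DEFAULT.parse(reader)`) instead of feeding it one line at a time, since the parser is designed to handle line breaks inside quoted fields.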