I have a tab-delimited text file that I want to parse with opencsv and upload to a database. I used CSVReader() to parse the file. The problem is that some column values contain tabs. For instance, a column value ends with a tab, and that tab is followed by a second tab that separates it from the next column.

I'm having trouble parsing this file. How do I keep the parser from treating tabs that are part of a value as delimiters?

This is the file I'm trying to parse. Each line has 2 columns and there are 5 rows in total. The first row is the header. However, when I parse it using the following code, I get only 3 rows:

CSVReader reader = new CSVReader(new FileReader("input.txt"), '\t');
String[] nextLine;
int cnt = 0;
// readNext() returns null at end of input, so no second null check is needed
while ((nextLine = reader.readNext()) != null) {
    cnt++;
    System.out.println("Length of row " + cnt + " = " + nextLine.length);
    System.out.println(Arrays.toString(nextLine));
}
reader.close();

******** Update ********

Reading the file line by line with readLine(), as below, counts 5 lines:

BufferedReader br = new BufferedReader(new FileReader("input.txt"));
int lines = 0;
while(br.readLine() != null){
    lines++;
}
System.out.println(lines);
drunkenfist

1 Answer

  1. Put quotes around your data. Here is a modified unit test from CSVReaderTest that shows quoting works:

    @Test
    public void testSkippingLinesWithDifferentEscape() throws IOException
    {
    
        StringBuilder sb = new StringBuilder(CSVParser.INITIAL_READ_SIZE);
        sb.append("Skip this line?t with tab").append("\n");   // should skip this
        sb.append("And this line too").append("\n");   // and this
        sb.append("a\t'b\tb\tb'\t'c'").append("\n");  // single quoted elements
        CSVReader c = new CSVReader(new StringReader(sb.toString()), '\t', '\'', '?', 2);
    
        String[] nextLine = c.readNext();
    
        assertEquals(3, nextLine.length);
    
        assertEquals("a", nextLine[0]);
        assertEquals("b\tb\tb", nextLine[1]);
        assertEquals("c", nextLine[2]);
    }
    

If that does not work, please post some of the lines from your input.txt. When I click on the link, it takes me to a website trying to sell me a Dropbox clone.
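If you generate the file yourself, the quoting can be applied when the rows are written. Below is a minimal sketch in plain Java, not part of opencsv: `TabFieldQuoter`, `quoteField`, and `quoteRow` are hypothetical names used only for illustration. It assumes the single quote is the quote character (as in the test above) and that an embedded quote is escaped by doubling it; check that assumption against the settings of whatever reader you use.

```java
import java.util.StringJoiner;

// Hypothetical helper, shown only as an illustration of quoting at write time.
final class TabFieldQuoter {
    // Assumed quote character; it must match the one given to the reader.
    private static final char QUOTE = '\'';

    // Wraps one field in quotes, doubling any embedded quote character.
    static String quoteField(String field) {
        String q = String.valueOf(QUOTE);
        return q + field.replace(q, q + q) + q;
    }

    // Quotes every field and joins the results with tabs.
    static String quoteRow(String... fields) {
        StringJoiner row = new StringJoiner("\t");
        for (String field : fields) {
            row.add(quoteField(field));
        }
        return row.toString();
    }

    public static void main(String[] args) {
        // The middle field contains two tabs that belong to the value.
        System.out.println(quoteRow("a", "b\tb\tb", "c"));
    }
}
```

Quoting every field, rather than only the ones that contain a tab, keeps the writer simple and avoids a stray quote character in an unquoted field being read as the start of a quoted section.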

Scott Conway
  • Thx, will try your solution. I uploaded the file because I didn't want to lose the original formatting. When you click on the link, you'll have an option to "download this file". Clicking on it will download the txt file. – drunkenfist Jun 05 '15 at 08:25
  • Got the file! My apologies, but I could not see the download file button amid all the other "download app" buttons :) I will try to add a unit test to test this out. – Scott Conway Jun 05 '15 at 15:27
  • Okay - got a good look at your file. If you have a field that has a delimiter in it you have to have quotes around it (in your case you set the single quote as the quote character) like the example I posted above. Otherwise how would the code know what is part of the field and what is a delimiter? – Scott Conway Jun 09 '15 at 12:29
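To make the ambiguity concrete, here is a small sketch using only `String.split` from the standard library, not opencsv. `UnquotedTabDemo` and `splitOnTabs` are hypothetical names, and the sample line is made up for illustration: it is meant to hold two columns, the first of which ends with a tab.

```java
import java.util.Arrays;

// Hypothetical demo class: shows how an unquoted trailing tab is read.
final class UnquotedTabDemo {
    // Splits on every tab; the -1 limit keeps trailing empty strings.
    static String[] splitOnTabs(String line) {
        return line.split("\t", -1);
    }

    public static void main(String[] args) {
        // Intended as two columns: "value\t" and "next".
        String line = "value\t" + "\t" + "next";
        String[] fields = splitOnTabs(line);
        System.out.println(Arrays.toString(fields)); // prints [value, , next]
        System.out.println(fields.length);           // prints 3
    }
}
```

Without quotes, nothing in the line says whether the first tab belongs to the value or separates two columns, so a splitter sees three fields with an empty one in the middle.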