I have a CSV file that contains 5 fields, one of which has embedded newlines. I can read the file perfectly using CSVReader [OpenCSV], and I can get the individual fields in spite of the embedded newlines. Now I want to write another CSV file that contains all the fields in the same way, but with ONLY the embedded newlines removed, not the regular end-of-row newlines. Can someone tell me how I can achieve this?

I am using the code below, but I am still not able to replace "\n" with "". The output of System.out.println(tempLine[0]); still contains the embedded newline.

CSVReader reader = new CSVReader(new FileReader(INPUT_FILE), ',');
CSVWriter writer = new CSVWriter(new FileWriter(OUTPUT_FILE), ',');
String[] nextLine;
String[] tempLine = new String[1];
while ((nextLine = reader.readNext()) != null)
{
    System.out.println("Tweet: " + nextLine[3] + "\nSentiment: " + nextLine[4]);
    tempLine[0] = nextLine[3].replace("\\n", "");
    System.out.println(tempLine[0]);
    writer.writeNext(tempLine);
}

Thank you for your help!

mihirk

2 Answers

After reading in a line, examine each field and remove any newlines you find.

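String[] newFields = new String[fields.length];
int i = 0;
for (String field : fields)
{
    // replace() returns a new string; the original field is left unchanged.
    // Caution: "\\n" is a backslash followed by 'n'; a newline character is written "\n".
    newFields[i++] = field.replace("\\n","");
}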

Then write the newFields back out using OpenCSV.

Jim Garrison
  • The `replace()` method does not update the input parameter, which I kind of glossed over in my pseudo-java example. It returns a NEW string containing the result. I have updated my answer. – Jim Garrison Feb 08 '12 at 06:50
  • The answer lies in the correct representation of the newline. In this code the newline should be written as just "\n", because the compiler translates "\n" into a real newline character, so the newline character itself is passed to replace() and we get the correct replacement. "\\n", on the other hand, becomes the two characters \n (a backslash followed by n) once the compiler is done, which never matches and hence gives the wrong output. This solves the problem: tempLine[0] = nextLine[3].replace("\n", ""); – mihirk Feb 08 '12 at 07:46
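
To make the escaping difference concrete, here is a small self-contained sketch. The class name `NewlineEscapeDemo`, its method names, and the sample string are made up for illustration. It assumes plain `String.replace`, which does literal (non-regex) replacement; `String.replaceAll` is shown only for contrast, since it does take a regular expression.

```java
public class NewlineEscapeDemo {
    // Hypothetical sample: one field with a single embedded newline.
    static final String FIELD = "first line\nsecond line";

    static String withEscapedBackslash() {
        // "\\n" is two characters (a backslash and 'n').
        // FIELD does not contain that pair, so nothing is replaced.
        return FIELD.replace("\\n", "");
    }

    static String withRealNewline() {
        // "\n" is the single newline character, which FIELD does contain.
        return FIELD.replace("\n", "");
    }

    static String withRegex() {
        // replaceAll() takes a regex; the regex \n (written "\\n" in Java
        // source) matches a newline character.
        return FIELD.replaceAll("\\n", "");
    }

    public static void main(String[] args) {
        System.out.println(withEscapedBackslash().equals(FIELD)); // true: unchanged
        System.out.println(withRealNewline());
        System.out.println(withRegex());
    }
}
```

So with `replace`, only the `"\n"` form removes the embedded newline; the `"\\n"` form would only do so if the call were `replaceAll`.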
Use a utility method like the one below. It is a slightly modified version of @Jim Garrison's answer, with "\\n" changed to "\n".

    private static String[] cleanNewLine(String[] fields) {
        String[] newFields = new String[fields.length];
        int i = 0;
        for (String field : fields) {
            if(field != null)
                newFields[i] = field.replace("\n", "");
            i++;
        }
        return newFields;
    }
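
If the input might have been produced on Windows, the embedded line breaks could be "\r\n" rather than just "\n". That is an assumption about the data, not something stated in the question. A minimal variation that handles both cases might look like this; the class name `LineBreakCleaner` and the method name `stripLineBreaks` are hypothetical.

```java
import java.util.Arrays;

public class LineBreakCleaner {
    // Hypothetical helper: returns a copy of the row with all carriage-return
    // and line-feed characters removed from each field.
    static String[] stripLineBreaks(String[] fields) {
        String[] cleaned = new String[fields.length];
        for (int i = 0; i < fields.length; i++) {
            String field = fields[i];
            // Null fields are passed through as null.
            cleaned[i] = (field == null)
                    ? null
                    : field.replace("\r", "").replace("\n", "");
        }
        return cleaned;
    }

    public static void main(String[] args) {
        // Made-up sample row with a Windows-style embedded line break.
        String[] row = {"id-1", "line one\r\nline two", null};
        System.out.println(Arrays.toString(stripLineBreaks(row)));
    }
}
```

Note that removing the break joins the surrounding words ("line oneline two"). Replacing with a single space instead of an empty string may read better, depending on the data.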
Sakthivel