1

I have a file containing Chinese text that I want to copy to another file, but the Chinese characters come out garbled in the output file. Note that my code already uses "UTF8" as the encoding:

BufferedReader br = new BufferedReader(new FileReader(inputXml));
StringBuilder sb = new StringBuilder();
String line = br.readLine();
while (line != null) {
    sb.append(line);
    sb.append("\n");
    line = br.readLine();
}
String everythingUpdate = sb.toString();

Writer out = new BufferedWriter(new OutputStreamWriter(
        new FileOutputStream(outputXml), "UTF8"));

out.write("");
out.write(everythingUpdate);
out.flush();
out.close();
ianrey palo
  • 2
    Was your input file encoded in UTF-8? Does the FileReader report UTF-8 when you check getEncoding()? How did you check the output, and does your text viewer support UTF-8? – gerrytan Mar 13 '13 at 05:36
  • 2
    Read the input file using the encoding which it uses. You can check a file's encoding in many editors. – longhua Mar 13 '13 at 05:39

2 Answers

3

The answer from @hyde is valid, but I have two extra notes that I will point out in the code below.

Of course, it is up to you to reorganize the code to suit your needs.

// try-with-resources is used here to guarantee that the IO resources are properly closed.
// Your code does not do that: the reader is never closed at all, and the writer
// is not closed either if an exception is thrown before close() is reached.
try (BufferedReader reader = new BufferedReader(
         new InputStreamReader(new FileInputStream(inputXml), "UTF-8"));
     PrintWriter out = new PrintWriter(
         new OutputStreamWriter(new FileOutputStream(outputXml), "UTF-8"))) {
    String line = reader.readLine();

    while (line != null) {
        // It is highly recommended to use the line separator of your host
        // rather than a hard-coded "\n". println() already appends it
        // (the value of System.getProperty("line.separator")), so nothing
        // else needs to be written after the line.
        out.println(line);
        line = reader.readLine();
    }
} catch (IOException e) {
    e.printStackTrace();
}
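
For completeness, here is a minimal, self-contained sketch of the same copy as a runnable class. The class name `Utf8Copy`, the `copy` method and the command-line arguments are made up for illustration, and it assumes both files are meant to be UTF-8. It uses `StandardCharsets.UTF_8` (Java 7+) instead of the charset name string, which is a small substitution of mine and not something the code above requires.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of the original code.
class Utf8Copy {

    // Copies a text file line by line, decoding the input and encoding
    // the output as UTF-8 regardless of the platform default charset.
    static void copy(String inputPath, String outputPath) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                 new InputStreamReader(new FileInputStream(inputPath), StandardCharsets.UTF_8));
             BufferedWriter writer = new BufferedWriter(
                 new OutputStreamWriter(new FileOutputStream(outputPath), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line);
                writer.newLine(); // writes the platform line separator
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java Utf8Copy <input> <output>");
            return;
        }
        copy(args[0], args[1]);
    }
}
```

Note that this normalizes line endings to the host's separator and always ends the output with one, so the copy is not guaranteed to be byte-for-byte identical to the input.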
Waleed Almadanat
2

You should not use FileReader in a case like this: it decodes with the platform default encoding and (before Java 11) does not let you specify the input encoding. Construct an InputStreamReader on a FileInputStream instead.
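
To illustrate why the decoding charset matters, here is a small sketch that decodes the same UTF-8 bytes twice, once with the right charset and once with a wrong one. The class `DecodeDemo` and its `decode` method are invented for this example, and ISO-8859-1 merely stands in for whatever non-UTF-8 default a given machine might have.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for illustration; not part of the original answer.
class DecodeDemo {

    // Turns raw bytes into text using the given charset, which is
    // conceptually what a Reader does while reading a file.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // two Chinese characters
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8); // 3 bytes each

        // Right charset: the text survives the round trip.
        System.out.println(original.equals(decode(utf8Bytes, StandardCharsets.UTF_8))); // prints true

        // Wrong charset: every byte is turned into its own character.
        String garbled = decode(utf8Bytes, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.length()); // prints 6

        // The charset a reader falls back to when none is given.
        System.out.println("Platform default: " + Charset.defaultCharset());
    }
}
```

Once the text has been decoded with the wrong charset, writing it out as UTF-8 only preserves the garbled characters, which matches the symptom described in the question.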

For example, the reader can be constructed like this:

BufferedReader br = 
        new BufferedReader(
            new InputStreamReader(
                new FileInputStream(inputXml), 
                "UTF8"));
hyde