I'm using DOM4j to parse and write an XML tree, which is always in UTF-8.
My XML file contains German special characters (Ä, Ü, ß). Parsing them is not a problem, but when I write the tree back to a file, the special characters are replaced by � characters.
I can't change the encoding of the XML file; it has to stay UTF-8.
Code
SAXReader xmlReader = new SAXReader();
xmlReader.setEncoding("UTF-8");
Document doc = xmlReader.read(file);
doc.setXMLEncoding("UTF-8");
Element root = doc.getRootElement();
// manipulate doc
OutputFormat format = new OutputFormat();
format.setEncoding("UTF-8");
XMLWriter writer = new XMLWriter(new FileWriter(file), format);
writer.write(doc);
writer.close();
Expected output
...
<statementText>This is a test!Ä Ü ß</statementText>
...
Actual output
...
<statementText>This is a test!� � �</statementText>
...
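
A probable cause, offered as a sketch rather than a confirmed diagnosis: `new FileWriter(file)` encodes characters with the JVM's default charset (before Java 18 that is the platform charset, which is often not UTF-8), so `format.setEncoding("UTF-8")` only affects the XML declaration and not the bytes actually written. The stand-alone example below uses only the JDK to reproduce the effect; `EncodingDemo` and `writeThenReadAsUtf8` are hypothetical names, and ISO-8859-1 is an assumed stand-in for a non-UTF-8 default charset.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {

    // Hypothetical helper: encodes the text with the given writer charset,
    // then decodes the resulting bytes as UTF-8, the way an XML parser
    // would for a file that declares encoding="UTF-8".
    static String writeThenReadAsUtf8(String text, Charset writerCharset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, writerCharset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "This is a test!Ä Ü ß", written with escapes so the source file
        // encoding cannot interfere with the demonstration.
        String text = "This is a test!\u00C4 \u00DC \u00DF";

        // Writer charset matches the declared encoding: text survives.
        System.out.println(writeThenReadAsUtf8(text, StandardCharsets.UTF_8));

        // Writer charset differs (assumed non-UTF-8 default): the umlauts
        // come back as U+FFFD replacement characters.
        System.out.println(writeThenReadAsUtf8(text, StandardCharsets.ISO_8859_1));
    }
}
```

If this is what is happening here, handing DOM4j a byte stream instead of a `Writer` should avoid it, for example `new XMLWriter(new FileOutputStream(file), format)`, since that constructor builds its own writer from the encoding set on the `OutputFormat`. Wrapping the stream in `new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8)` would be an equivalent alternative.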