I have noticed that on Android 4.4 handsets, saving a WebView with:
webview.saveWebArchive(name);
and then reading it back with WebArchiveReader (see code below) throws a parsing exception:
11-08 15:10:31.976: W/System.err(2240): org.xml.sax.SAXParseException: Unexpected end of document
11-08 15:10:31.976: W/System.err(2240): at org.apache.harmony.xml.parsers.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:125)
The method used to read the stored XML file worked fine up to Android 4.3. Here it is (NOTE: I tried to parse the file in two different ways):
    public boolean readWebArchive(InputStream is) {
        DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = null;
        myDoc = null;
        try {
            builder = builderFactory.newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        }
        try {
            // New attempt: wrap the stream in an InputSource to force UTF-8
            InputSource input = new InputSource(is);
            input.setEncoding("UTF-8");
            myDoc = builder.parse(input);
            // This is how it worked on Android 4.3 and below without trouble:
            // myDoc = builder.parse(is);
            NodeList nl = myDoc.getElementsByTagName("url");
            for (int i = 0; i < nl.getLength(); i++) {
                Node nd = nl.item(i);
                if (nd instanceof Element) {
                    Element el = (Element) nd;
                    // siblings of el (url) are: mimeType, textEncoding, frameName, data
                    NodeList nodes = el.getChildNodes();
                    for (int j = 0; j < nodes.getLength(); j++) {
                        Node node = nodes.item(j);
                        if (node instanceof Text) {
                            // The url text is Base64-encoded in the archive
                            String dt = ((Text) node).getData();
                            byte[] b = Base64.decode(dt, Base64.DEFAULT);
                            dt = new String(b);
                            urlList.add(dt);
                            urlNodes.add((Element) el.getParentNode());
                        }
                    }
                }
            }
        } catch (SAXParseException se) {
            // Problems parsing the saved XML file
            se.printStackTrace();
            myDoc = null;
        } catch (Exception e) {
            // Also reached if builder is null because newDocumentBuilder() failed
            e.printStackTrace();
            myDoc = null;
        }
        return myDoc != null;
    }
I've played a bit with the way the builder is invoked. Instead of giving it a FileInputStream directly, I first create an InputSource, as you can see, to force a given encoding. However, I had no success. Without the InputSource, the exception was instead:
org.xml.sax.SAXParseException: Unexpected token
I've read in previous posts that this may be an encoding issue (e.g. android-utf-8-file-parsing) but none of the proposed solutions worked for me.
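Before chasing the encoding further, one thing worth ruling out is whether the file written on 4.4 is still XML at all. The following is only a minimal diagnostic sketch: ArchiveSniffer, readHead and looksLikeXml are hypothetical names, not part of WebArchiveReader, and the check is a rough heuristic (first non-whitespace character is '<'), not a validation.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical diagnostic helper; not part of WebArchiveReader. */
public class ArchiveSniffer {

    /**
     * Reads at most maxBytes from the stream and returns them decoded as
     * ISO-8859-1, so that every byte maps to exactly one char.
     */
    public static String readHead(InputStream is, int maxBytes) throws IOException {
        byte[] buf = new byte[maxBytes];
        int total = 0;
        while (total < maxBytes) {
            int n = is.read(buf, total, maxBytes - total);
            if (n < 0) {
                break; // end of stream reached before maxBytes
            }
            total += n;
        }
        return new String(buf, 0, total, "ISO-8859-1");
    }

    /**
     * Rough heuristic: after an optional UTF-8 BOM and leading whitespace,
     * an XML document is expected to start with '<'.
     */
    public static boolean looksLikeXml(String head) {
        String s = head;
        // UTF-8 BOM bytes EF BB BF as they appear after ISO-8859-1 decoding
        if (s.startsWith("\u00EF\u00BB\u00BF")) {
            s = s.substring(3);
        }
        return s.trim().startsWith("<");
    }
}
```

Logging readHead(is, 200) for an archive saved on 4.3 and one saved on 4.4 should show whether the two files even start the same way. Note that readHead consumes bytes from the stream, so the file has to be reopened before handing it to readWebArchive.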
Is anyone else seeing the same issue, or does anyone know what has changed in KitKat and, if so, how this can be avoided?
Many thanks in advance