
I use a non-validating reader to display or process untrusted XML documents. I do not need support for internal entities, but I do want to be able to process the documents even if a DOCTYPE is present.

With the disallow-doctype-decl feature of the SAX parser I can make sure that parsing an XML document carries no risk of external entity resolution or billion laughs DoS expansion. This is also recommended by the OWASP XXE prevention cheat sheet.

XMLReader reader = XMLReaderFactory.createXMLReader();
reader.setFeature("http://apache.org/xml/features/continue-after-fatal-error", true);

reader.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);

// or, to keep the DOCTYPE but block external entities:
reader.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
reader.setFeature("http://xml.org/sax/features/external-general-entities", false);
reader.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);

Unfortunately, this aborts parsing when a DOCTYPE is present:

org.xml.sax.SAXParseException; systemId: file:... ; lineNumber: 2; columnNumber: 10;
    DOCTYPE is disallowed when the
    feature "http://apache.org/xml/features/disallow-doctype-decl" set to true.
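A minimal sketch that reproduces the abort, assuming the JDK's built-in parser is the one on the classpath. The class name `DisallowDoctypeDemo` and the helper `isRejected` are made up for illustration, and the reader is obtained through `SAXParserFactory` rather than `XMLReaderFactory`:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class DisallowDoctypeDemo {
    static final String DISALLOW_DOCTYPE =
            "http://apache.org/xml/features/disallow-doctype-decl";

    /** Hypothetical helper: true if the parser aborts with a SAXParseException. */
    public static boolean isRejected(String xml) {
        try {
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setFeature(DISALLOW_DOCTYPE, true);
            XMLReader reader = spf.newSAXParser().getXMLReader();
            DefaultHandler handler = new DefaultHandler();
            reader.setContentHandler(handler);
            // DefaultHandler.fatalError rethrows the exception it is given
            reader.setErrorHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXParseException e) {
            return true; // the fatal error reported by the parser
        } catch (Exception e) {
            throw new IllegalStateException(e); // setup or I/O problem
        }
    }

    public static void main(String[] args) {
        String withDoctype = "<!DOCTYPE r [<!ENTITY e \"hello\">]><r>&e;</r>";
        System.out.println(isRejected(withDoctype));
        System.out.println(isRejected("<r>hello</r>"));
    }
}
```

With the feature set, I would expect the document carrying a DOCTYPE to be rejected even though it only declares a harmless internal entity, while the plain document parses.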

And if I ignore this fatal error, then it will happily resolve internal entities, as you can see here: https://gist.github.com/ecki/f84d53a58c48b13425a270439d4ed84a

Is there a combination of features that lets me read over the DOCTYPE declaration without evaluating it (especially avoiding recursive expansion)?

I would like to avoid defining my own Apache-specific security-manager property or a special resolver.

eckes

1 Answer


According to core-libs-dev, the XMLReaderFactory will be deprecated in Java 9, and the way to obtain an XMLReader will be to go through a SAXParser.

In that case FEATURE_SECURE_PROCESSING (FSP) can be used (which establishes some resource limits and also removes remote schema handlers for ACCESS_EXTERNAL_DTD and ACCESS_EXTERNAL_SCHEMA):

import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

SAXParserFactory spf = SAXParserFactory.newInstance();
spf.setXIncludeAware(false);
// when FSP is activated explicitly it will also restrict external entities
spf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
XMLReader reader = spf.newSAXParser().getXMLReader();
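To see what FSP does and does not do, here is a small sketch, assuming the JDK's built-in parser with its default limits. The class `SecureProcessingDemo` and the helpers `textOrNull` and `nestedEntities` are hypothetical names introduced for this illustration only:

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SecureProcessingDemo {

    /** Parses with FSP enabled; returns the character data, or null if the parser aborts. */
    public static String textOrNull(String xml) {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        try {
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setXIncludeAware(false);
            spf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            XMLReader reader = spf.newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            reader.setErrorHandler(handler); // fatalError rethrows
            reader.parse(new InputSource(new StringReader(xml)));
            return text.toString();
        } catch (SAXParseException e) {
            return null; // parser aborted, e.g. an expansion limit was hit
        } catch (Exception e) {
            throw new IllegalStateException(e); // setup or I/O problem
        }
    }

    /** Builds a document whose top entity expands to 10^levels copies of "lol". */
    public static String nestedEntities(int levels) {
        StringBuilder sb = new StringBuilder("<!DOCTYPE r [<!ENTITY e0 \"lol\">");
        for (int i = 1; i <= levels; i++) {
            sb.append("<!ENTITY e").append(i).append(" \"");
            for (int j = 0; j < 10; j++) {
                sb.append("&e").append(i - 1).append(';');
            }
            sb.append("\">");
        }
        return sb.append("]><r>&e").append(levels).append(";</r>").toString();
    }

    public static void main(String[] args) {
        System.out.println(textOrNull("<!DOCTYPE r [<!ENTITY e \"hello\">]><r>&e;</r>"));
        System.out.println(textOrNull(nestedEntities(7)));
    }
}
```

Note that a single internal entity is still expanded under FSP, so this limits the damage of recursive expansion rather than skipping the DOCTYPE. The deeply nested document should be aborted once the JDK's entity expansion limit is exceeded.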
eckes
  • If you downloaded the Xerces library (as opposed to relying on the JDK's internal Xerces library), **this code won't prevent an XXE attack**. The JDK version is actively maintained to include new security features. If you must use the downloaded version, and you need to process XML with DOCTYPE declarations, you'll have to explicitly deny access to external entities as well as use a [SecurityManager](https://xerces.apache.org/xerces2-j/javadocs/xerces2/org/apache/xerces/util/SecurityManager.html) to prevent entity expansion attacks. – blurredd Oct 30 '16 at 00:59
  • @blurredd thanks for the note. I am not sure myself what would be better; in a controlled environment some asserts would probably be enough. To support multiple parsers, setting the external access schema to empty might not be enough (and the security manager is XNI-specific). I guess I will go with hardcoding the implementation. – eckes Oct 31 '16 at 16:51