Actually, it appears to receive the internal and external entity flags swapped. Method signature:
virtual void doctypeDecl (const xercesc::DTDElementDecl& root,
                          const XMLCh* const public_id,
                          const XMLCh* const system_id,
                          const bool has_internal,
                          const bool has_external)
For the document with an external entity it gets has_internal = true and has_external = false, and for the document with an internal entity it gets has_internal = false and has_external = true.
How it was detected
In our project we use libxsd-3.3.0, which uses xercesc-3.1.4.
To patch the XXE vulnerability in libxsd, we applied the following solution: https://www.codesynthesis.com/pipermail/xsd-users/2015-September/004689.html. In short, the solution suggests subclassing xercesc::DOMLSParserImpl and re-implementing the doctypeDecl method. Inside it we are supposed to be notified via the bool parameters whether the document has an external (or internal) entity, and can react as desired, for example:
void SecureDOMParser::doctypeDecl (const DTDElementDecl& e,
                                   const XMLCh* const pub_id,
                                   const XMLCh* const sys_id,
                                   const bool hasi,
                                   const bool hase)
{
  if (hasi || hase)
    ThrowXMLwithMemMgr(RuntimeException, XMLExcepts::Gen_NoDTDValidator, fMemoryManager);
  DOMLSParserImpl::doctypeDecl (e, pub_id, sys_id, hasi, hase);
}
The solution at the link above contains examples of internal and external entities, which we reused in our test XMLs.
The original example worked as expected, but since we only need to reject external entities, we adjusted the solution as follows:
void SecureDOMParser::doctypeDecl (const DTDElementDecl& e,
                                   const XMLCh* const pub_id,
                                   const XMLCh* const sys_id,
                                   const bool hasi,
                                   const bool hase)
{
  if (hase) // modified here
    ThrowXMLwithMemMgr(RuntimeException, XMLExcepts::Gen_NoDTDValidator, fMemoryManager);
  DOMLSParserImpl::doctypeDecl (e, pub_id, sys_id, hasi, hase);
}
It turned out that in this form it does not throw for the document with an external entity, but it does throw if we use if (hasi) instead.
The same was confirmed after adding logging inside the if clause:
void SecureDOMParser::doctypeDecl (const DTDElementDecl& e,
                                   const XMLCh* const pub_id,
                                   const XMLCh* const sys_id,
                                   const bool hasi,
                                   const bool hase)
{
  if (hasi || hase)
  {
    // added logging
    std::cout << "has external: " << hase << " has internal: " << hasi;
    ThrowXMLwithMemMgr(RuntimeException, XMLExcepts::Gen_NoDTDValidator, fMemoryManager);
  }
  DOMLSParserImpl::doctypeDecl (e, pub_id, sys_id, hasi, hase);
}
The output of this logging is as stated at the beginning: the reverse of what you would expect.
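To make the mismatch concrete, the flag values reported above can be replayed against the three guard conditions we tried. This is only a standalone sketch that does not use Xerces at all; Observation, the guard_* functions and report are hypothetical names introduced for illustration, and the flag values are simply the ones we observed in our tests.

```cpp
#include <iostream>

// Hypothetical record of what doctypeDecl received for one test document.
struct Observation {
    const char* document;  // which test XML was parsed
    bool hasi;             // value received as has_internal
    bool hase;             // value received as has_external
};

// Values as observed in our tests (not what we expected to see).
constexpr Observation kExternalEntityDoc{"external entity", true, false};
constexpr Observation kInternalEntityDoc{"internal entity", false, true};

// The three guard conditions tried inside doctypeDecl.
constexpr bool guard_either(const Observation& o) { return o.hasi || o.hase; }
constexpr bool guard_hase(const Observation& o) { return o.hase; }
constexpr bool guard_hasi(const Observation& o) { return o.hasi; }

// Print, for one observation, which guard would have thrown.
inline void report(const Observation& o) {
    std::cout << std::boolalpha << o.document
              << ": if (hasi || hase) throws=" << guard_either(o)
              << ", if (hase) throws=" << guard_hase(o)
              << ", if (hasi) throws=" << guard_hasi(o) << '\n';
}
```

Calling report(kExternalEntityDoc) and report(kInternalEntityDoc) from a main function shows why the original check hid the problem: with hasi || hase both documents are rejected, so the order of the two flags only matters once a single flag is tested.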
Has anyone else run into this? Is this a known bug?