
I am trying to load an XHTML 1.1 document using XDocument (LINQ-to-XML) in a portable class library (PCL), but I get an exception because the parser can't resolve the document's named entities:

System.Xml.XmlException: Reference to undeclared entity 'nbsp'. Line 10, position 23.
   at System.Xml.XmlTextReaderImpl.Throw(Exception e)
   at System.Xml.XmlTextReaderImpl.HandleGeneralEntityReference(String name, Boolean isInAttributeValue, Boolean pushFakeEntityIfNullResolver, Int32 entityStartLinePos)
   at System.Xml.XmlTextReaderImpl.HandleEntityReference(Boolean isInAttributeValue, EntityExpandType expandType, Int32& charRefEndPos)
   at System.Xml.XmlTextReaderImpl.ParseText(Int32& startPos, Int32& endPos, Int32& outOrChars)
   at System.Xml.XmlTextReaderImpl.ParseText()
   at System.Xml.XmlTextReaderImpl.ParseElementContent()
   at System.Xml.XmlTextReaderImpl.Read()
   at System.Xml.Linq.XContainer.ReadContentFrom(XmlReader r)
   at System.Xml.Linq.XContainer.ReadContentFrom(XmlReader r, LoadOptions o)
   at System.Xml.Linq.XDocument.Load(XmlReader reader, LoadOptions options)
   at System.Xml.Linq.XDocument.Load(Stream stream, LoadOptions options)
   at ShatteredTemple.EpubTweaker.EpubTweaker.LoadXDocumentFromZipFile(ZipFile epub, ZipEntry zipEntry) in [path omitted]\EpubTweaker.cs:line 159

(SO wouldn't let me post that in <pre> tags - apologies for the faux code formatting.)

Normally I would try embedding the DTD per this question. However, that requires creating a custom XmlResolver, which does not seem to be available to a PCL.

Using XmlReaderSettings to turn off CheckCharacters and DtdProcessing has no effect on the error.

Any suggestions for how I can handle "unknown" entities when parsing XML inside a PCL? I'd prefer to keep using LINQ-to-XML/XDocument, but am potentially open to other XML parsers, as long as they can add an XML declaration (<?xml version="1.0" encoding="utf-8"?>) to the document if one is missing.
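
One possible workaround that stays within LINQ-to-XML, sketched here as an assumption rather than a confirmed fix, is to rewrite named entity references as numeric character references before the text reaches XDocument, so the parser never needs the DTD. The class name `XhtmlEntityWorkaround`, the method `ParseXhtml`, and the short entity table are hypothetical; a real table would need the full XHTML entity set. The sketch also rewrites matches inside comments and CDATA sections, which may or may not be acceptable.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Xml.Linq;

// Hypothetical helper, not part of any library.
static class XhtmlEntityWorkaround
{
    // Deliberately incomplete illustration; extend with the remaining
    // XHTML named entities as needed.
    static readonly Dictionary<string, int> NamedEntities = new Dictionary<string, int>
    {
        { "nbsp",   0x00A0 },
        { "copy",   0x00A9 },
        { "mdash",  0x2014 },
        { "hellip", 0x2026 },
    };

    // Matches references such as &nbsp; but not numeric ones such as &#160;.
    static readonly Regex EntityPattern = new Regex(@"&([A-Za-z][A-Za-z0-9]*);");

    public static XDocument ParseXhtml(string xhtml)
    {
        string patched = EntityPattern.Replace(xhtml, m =>
        {
            int codePoint;
            // Names missing from the table (including the predefined
            // amp, lt, gt, quot, apos) are left untouched.
            return NamedEntities.TryGetValue(m.Groups[1].Value, out codePoint)
                ? "&#" + codePoint + ";"
                : m.Value;
        });
        return XDocument.Parse(patched);
    }
}
```

If the loaded document has no XML declaration, one can be assigned afterwards with `doc.Declaration = new XDeclaration("1.0", "utf-8", null);`, and it is written out when the document is saved.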

Andrew Timson
  • The error says it is on line 10 (not line 1) : 'nbsp'. Line 10, position 23 – jdweng May 08 '16 at 17:50
  • I'm not sure what that has to do with my question. Yes, the offending entity is on line 10 of the XHTML file. But it's not something that needs to be "fixed" in the XHTML - it's a perfectly valid XHTML 1.1 input, but XDocument is failing to parse it because it can't load the DTD. – Andrew Timson May 08 '16 at 17:54
  • That is what I thought, but I didn't want to jump to any conclusion. The .NET library doesn't work with XML 1.1 (only 1.0). Posting the exception in this case leads to the conclusion that there is an error in the XML. I would still double-check with an online XML checker to make sure the XML is valid. – jdweng May 08 '16 at 21:26
  • Out of the box, no, it doesn't. If you're using the full framework, you can use a custom XmlResolver to support it, per the question I linked to. But since that doesn't seem to be available in a PCL... I thought I'd ask this question to see if I'm missing any available PCL-compatible alternatives. – Andrew Timson May 08 '16 at 22:21
  • @jdweng this isn't *XML* 1.1, it's *XHTML* 1.1. XHTML is still XML 1.0. The error here is pretty common as the XHTML DTDs define various entities like `nbsp`. – Charles Mager May 09 '16 at 08:14

0 Answers