
I'm parsing the following...

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE tox:message SYSTEM "http://tox.sf.net/tox/dtd/tox.dtd">
<tox:message xmlns:tox="http://tox.sourceforge.net/">
<tox:model owner="scott" package="queue" function="appendFact">
<tox:parameter value="  By John Smith   &ndash; Thu Feb 25, 4:54 pm ET&lt;br&gt;&lt;br&gt;NEW YORK (Reuters) &ndash; Nothing happened today."/>
<tox:parameter value="10245"/>
</tox:model>
</tox:message>

... using saxon9.jar, but got...

org.xml.sax.SAXParseException: The entity "ndash" was referenced, but not declared.

How do I "declare" an entity for a parse? How would I be able to anticipate all the potential entities?

dacracot

2 Answers


You declare an entity in a DTD. Because your document references an external DTD, that DTD has to declare ndash for you. Does tox.dtd contain a declaration for ndash?

If it does not, you need to pull in the missing declarations yourself through an internal DTD subset, along these lines:

<!DOCTYPE foo [
    <!ENTITY % MathML SYSTEM "http://www.example.com/MathML.dtd">
    %MathML;
    <!ENTITY % SpeechML SYSTEM "http://www.example.com/SpeechML.dtd">
    %SpeechML;
]>

For example, you could reference one of the standard XHTML DTDs, whose entity sets define ndash.
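As a minimal sketch of the simplest variant, the entity can also be declared directly in the internal subset instead of importing a whole DTD. The class name `InternalSubsetDemo` and the cut-down message (no namespace prefix, no external DTD) are assumptions made for illustration, and the JDK's built-in JAXP parser stands in for whatever parser Saxon delegates to in your setup.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class InternalSubsetDemo {

    // Hypothetical, cut-down message: the internal subset declares ndash itself,
    // mapping it to the character reference for U+2013 (en dash).
    static final String XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE message [\n"
        + "  <!ENTITY ndash \"&#8211;\">\n"
        + "]>\n"
        + "<message>\n"
        + "  <parameter value=\"NEW YORK (Reuters) &ndash; Nothing happened today.\"/>\n"
        + "</message>";

    /** Parses XML and returns the expanded value attribute of the first parameter element. */
    static String parameterValue() {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(XML)));
            Element parameter = (Element) doc.getElementsByTagName("parameter").item(0);
            return parameter.getAttribute("value");
        } catch (Exception e) {
            // Wrapped so callers of this sketch do not need checked-exception handling.
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parameterValue());
    }
}
```

This only covers the entities you list by hand. If the input can contain arbitrary HTML entity names, importing a complete entity set, as in the parameter-entity pattern above, is likely the more practical route.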

If tox.dtd does declare it, then the parser needs a resolver to find the DTD.

bmargulies

I think you should use EntityResolver.
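A rough sketch of what that could look like with the JDK's JAXP parser: an `org.xml.sax.EntityResolver` intercepts the request for the DTD's system identifier and hands back a local copy. The class name `LocalDtdResolverDemo` is made up, and the one-line `LOCAL_DTD` string is only a hypothetical stand-in for a local copy of tox.dtd; whether the real tox.dtd declares ndash is not known from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

final class LocalDtdResolverDemo {

    // System identifier taken from the DOCTYPE in the question.
    static final String TOX_DTD_URL = "http://tox.sf.net/tox/dtd/tox.dtd";

    // Hypothetical stand-in for a local copy of the DTD; assumed to declare ndash.
    static final String LOCAL_DTD = "<!ENTITY ndash \"&#8211;\">";

    static final String XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE tox:message SYSTEM \"" + TOX_DTD_URL + "\">\n"
        + "<tox:message xmlns:tox=\"http://tox.sourceforge.net/\">\n"
        + "  <tox:parameter value=\"NEW YORK (Reuters) &ndash; Nothing happened today.\"/>\n"
        + "</tox:message>";

    // Serve the local copy for the known system identifier; returning null
    // tells the parser to fall back to its default lookup for anything else.
    static final EntityResolver RESOLVER = (publicId, systemId) ->
        TOX_DTD_URL.equals(systemId)
            ? new InputSource(new StringReader(LOCAL_DTD))
            : null;

    /** Parses XML with the resolver installed and returns the expanded attribute value. */
    static String parameterValue() {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setEntityResolver(RESOLVER);
            Document doc = builder.parse(new InputSource(new StringReader(XML)));
            Element parameter = (Element) doc.getElementsByTagName("tox:parameter").item(0);
            return parameter.getAttribute("value");
        } catch (Exception e) {
            // Wrapped so callers of this sketch do not need checked-exception handling.
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parameterValue());
    }
}
```

In a real setup the resolver would more likely open a DTD file from disk or the classpath. The same resolver can be set on a SAX `XMLReader` through `setEntityResolver` if you are not building a DOM.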

fospathi
Santhosh Kumar Tekuri