I'm fairly new to XML, and it's not a regular part of my day job. However, I've been trying to export a large database and import it into Microsoft Excel for data processing.
Where I've gotten stuck is that Excel doesn't recognise the special-character entities in the file. My XML export contains data such as:
&ndash; &amp; &ucirc; &AElig;
Among others. The error I get is "Reference to undefined entity ndash", etc.
The export also created a DTD file with these definitions, but while searching Google I found a mention that Excel doesn't support DTDs (I was getting an error, so I presumed that was the case). So I've tried writing an XSD that defines these items, which looks something like this:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="https://www.w3schools.com"
xmlns="https://www.w3schools.com"
elementFormDefault="qualified">
<xs:element name="û">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="u"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
<xs:element name="–">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="n"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
</xs:schema>
But with no luck on import. Does anyone have suggestions to help a newb?
EDIT: I was able to get past this issue by cheating and simply replacing the HTML entity codes with their Unicode equivalents.
So:
&ndash;
Became:
&#8211;
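The replacement itself can be scripted rather than done by hand. Below is a rough Python sketch, assuming the export only uses HTML 4 entity names (the ones covered by the standard library's `html.entities.name2codepoint` table). The function name `named_to_numeric` and the sample string are made up for illustration. The five entities that XML already predefines are left alone, and any name not in the table is left untouched rather than guessed at.

```python
import re
from html.entities import name2codepoint

# Entities every XML parser understands without a DTD; leave these as they are.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches named references such as &ndash; but not numeric ones such as &#8211;.
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(text):
    """Replace named HTML entities with decimal numeric character references."""
    def swap(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # predefined or unknown: keep unchanged
        return f"&#{name2codepoint[name]};"
    return ENTITY_RE.sub(swap, text)


# Hypothetical sample line from an export.
sample = "1999&ndash;2004 &amp; S&ucirc;r &AElig;"
converted = named_to_numeric(sample)
print(converted)  # 1999&#8211;2004 &amp; S&#251;r &#198;
```

For a large export, the same function could be applied line by line while copying the file, so the whole thing never has to sit in memory at once.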
I'd still be interested in figuring out how I could write this up in an XSD schema so that all the HTML entities are automatically replaced by their Unicode equivalents. I thought something as simple as:
<xsd:attribute name="ndash" fixed="–"/>
Would work but nope!