I'm trying to clean some htmls. I have converted them to xhtml with tidy
$ tidy -asxml -i -w 150 -o o.xml index.html
The resulting xhtml ends up having named entities. When trying xsltproc on those xhtmls, I keep getting errors.
$ xsltproc --novalid -o out.htm t.xsl o.xml
o.xml:873: parser error : Entity 'mdash' not defined
resources to storing data and using permissions — as needed.</
^
o.xml:914: parser error : Entity 'uarr' not defined
</div><a href="index.html#top" style="float:right">↑ Go to top</a>
^
o.xml:924: parser error : Entity 'nbsp' not defined
Android 3.2 r1 - 27 Jul 2011 12:18
If I add --html to the xsltproc it complains on a tag that has name and id attributes with same name (which is valid)
$ xsltproc --novalid --html -o out.htm t.xsl o.xml o.xml:845: element a: validity error : ID top already defined
<a name="top" id="top"></a>
^
The xslt is simple:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" indent="yes" omit-xml-declaration="yes"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="//*[@id=side-nav]"/>
</xsl:stylesheet>
Why doesn't --html work? Why is it complaining? Or should I forget it and fix the entities?