I am interested in TEI, a semantically rich, XML-based language. I believe that if it could be encoded in HTML in a round-trippable manner, it could then be created in web-based HTML editors, stored on HTML-based wikis (at least those that support the necessary semantic mechanisms), and so on.
I would like to know whether RDFa would work as a mechanism for fully representing an XML dialect (or several) within HTML5. The standard I have in mind is round-trippability plus awareness of the hierarchical nature of XML elements (and of other critical aspects such as attributes).
I know one might be able to overload data-* attributes, Microformats, or Microdata, but none of these options can both fully represent an XML dialect, hierarchy included, and remain free of spec warnings that the mechanism is not intended for use by software independent of the site (e.g., if one wished to build a search engine that searches such embedded XML in a hierarchically aware manner).
If RDFa won't work, I think the best option might be data-* attributes, as one can easily do something like this to represent XML:
<div data-xml-ns="http://www.tei-c.org/ns/1.0"
data-xml-ns-html="http://www.w3.org/1999/xhtml" data-xml-element="div1"
data-xml-attribute-value="xml:id=myDiv1ID">Some TEI div1 content
and <div data-xml-element="div2">some div2 content</div></div>
(Not a good example of semantic richness, I know, but it shows the nature of the encoding.)
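To make the round-trip requirement concrete, here is a rough sketch of the HTML-to-XML direction for this data-* encoding, using only the Python standard library. The class name `DataAttrToXml` and the function `html_to_xml` are my own hypothetical names, and the sketch assumes that every HTML element carries a `data-xml-element` attribute, that `data-xml-ns` declares a default namespace inherited by descendants, and that `data-xml-attribute-value` holds a single `name=value` pair. The prefixed html namespace declaration is left out for brevity.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"
TEI_NS = "http://www.tei-c.org/ns/1.0"


class DataAttrToXml(HTMLParser):
    """Rebuild an XML tree from HTML elements carrying data-xml-* attributes."""

    def __init__(self):
        super().__init__()
        self.root = None
        self.stack = []     # open XML elements, innermost last
        self.ns_stack = []  # default namespace in scope for each open element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("data-xml-element")
        if name is None:
            raise ValueError(f"<{tag}> has no data-xml-element attribute")
        # Inherit the default namespace from the parent unless redeclared.
        inherited = self.ns_stack[-1] if self.ns_stack else None
        ns = attrs.get("data-xml-ns", inherited)
        elem = ET.Element(f"{{{ns}}}{name}" if ns else name)
        # Assumed format: one "name=value" pair per element.
        pair = attrs.get("data-xml-attribute-value")
        if pair:
            key, _, value = pair.partition("=")
            if key.startswith("xml:"):
                key = f"{{{XML_NS}}}{key[4:]}"
            elem.set(key, value)
        if self.stack:
            self.stack[-1].append(elem)
        else:
            self.root = elem
        self.stack.append(elem)
        self.ns_stack.append(ns)

    def handle_endtag(self, tag):
        self.stack.pop()
        self.ns_stack.pop()

    def handle_data(self, data):
        if not self.stack:
            return
        parent = self.stack[-1]
        if len(parent):
            # Text that follows a child element is stored as that child's tail.
            last = parent[-1]
            last.tail = (last.tail or "") + data
        else:
            parent.text = (parent.text or "") + data


def html_to_xml(html):
    """Parse the data-* encoded HTML and return the root XML element."""
    parser = DataAttrToXml()
    parser.feed(html)
    parser.close()
    return parser.root


SAMPLE = (
    '<div data-xml-ns="http://www.tei-c.org/ns/1.0" data-xml-element="div1" '
    'data-xml-attribute-value="xml:id=myDiv1ID">Some TEI div1 content and '
    '<div data-xml-element="div2">some div2 content</div></div>'
)

root = html_to_xml(SAMPLE)
ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace
print(ET.tostring(root, encoding="unicode"))
```

The nesting of the HTML divs maps directly onto the nesting of div1 and div2, which is the hierarchy-preserving property I am after. Any RDFa-based encoding would need to allow an equally mechanical conversion in both directions.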
But again, I'd prefer to avoid the limitations placed on this mechanism as stated in the HTML spec:
"These attributes are not intended for use by software that is independent of the site that uses the attributes"
"these attributes are intended for use by the site's own scripts, and are not a generic extension mechanism for publicly-usable metadata."
If RDFa will work for this, I would appreciate seeing how the example above might be encoded so that the hierarchical relationships, attributes, and namespaces are preserved.