
With the new doctype and elements that are part of HTML5, how do you get xdmp:tidy() to recognize them?

If I have an HTML page that contains something like:

<!DOCTYPE html>
<html>
    <header>blah</header>
    <section>blah</section>

and then try something like: xdmp:tidy(xdmp:document-get("home.html"))

I get errors like:

<section> is not recognized! discarding unexpected <section>
<header> is not recognized! discarding unexpected <header>

Are there some options I can send to xdmp:tidy() to get it to handle it?

Matt
RyanS

2 Answers


Try using the new-blocklevel-tags option that specifies the new HTML5 tags. You can include multiple elements by separating them with a comma or space. You should get the expected output with no errors, but there will still be warnings.

Try this in cq:

xdmp:tidy(xdmp:document-get("home.html"),
  <options xmlns="xdmp:tidy">
    <new-blocklevel-tags>header section</new-blocklevel-tags>
  </options>)

Click here for a good reference about adding various kinds of tags (block-level, inline, empty) that should work as options in xdmp:tidy. The same information is here, but it's a bit harder to find because there are so many options.
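As a rough sketch of how the other tag categories could be declared alongside block-level ones: the sample markup and the element names mark, time, source, and track are illustrative assumptions rather than anything from the question, and new-inline-tags and new-empty-tags are assumed to be accepted by xdmp:tidy in the same way as new-blocklevel-tags.

```xquery
xquery version "1.0-ml";

(: Hypothetical sample input; the element lists below are illustrative,
   not an exhaustive list of HTML5 elements. :)
let $html :=
  '<html><body><section>Updated <time>today</time>, see <mark>this</mark>.</section></body></html>'
return
  xdmp:tidy($html,
    <options xmlns="xdmp:tidy">
      <!-- elements that behave like div -->
      <new-blocklevel-tags>header section</new-blocklevel-tags>
      <!-- elements that behave like span -->
      <new-inline-tags>mark time</new-inline-tags>
      <!-- elements with no content; declared for illustration, unused in the sample -->
      <new-empty-tags>source track</new-empty-tags>
    </options>)
```

Each option takes a space- or comma-separated list, so further elements can be appended to whichever category matches how they should be treated.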

Dyne
    I tried your suggestion, but now I get: "<section> is not approved by W3C" and "<header> is not approved by W3C". – RyanS Sep 06 '11 at 20:08

The rest of this discussion moved over to the MarkLogic mailing list at http://markmail.org/thread/emwua43mg63wxbno


This does produce warnings but seems to work nonetheless:

xquery version "1.0-ml";

let $htmlstring :=
'<html>
    <header>blah</header>
    <section>blah</section>
<p>hello</p>
</html>'
return
xdmp:tidy($htmlstring,
<options xmlns="xdmp:tidy">
  <new-inline-tags>header section</new-inline-tags>
  <new-blocklevel-tags>header section</new-blocklevel-tags>
</options>)
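
If the warnings are not needed, one option is to keep only the cleaned document. This is a minimal sketch that assumes xdmp:tidy returns two nodes, a status node followed by the tidied XHTML, as the MarkLogic documentation describes; the variable names and sample input are illustrative.

```xquery
xquery version "1.0-ml";

let $result :=
  xdmp:tidy('<html><section>blah</section></html>',
    <options xmlns="xdmp:tidy">
      <new-blocklevel-tags>section</new-blocklevel-tags>
    </options>)
(: $result[1] is assumed to be the status node carrying warnings and errors :)
let $status := $result[1]
(: $result[2] is assumed to be the cleaned XHTML :)
let $clean := $result[2]
return $clean
```

Returning $status instead shows whatever warnings tidy reported, which can help confirm that they are warnings only and that the elements were not discarded.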
j0k
Eric Bloch