I have been deliberately writing my web pages as invalid XHTML 1.0 Strict so that I can use custom entities and other XML extensibility features.

Are there any issues with doing this, or is it a perfectly acceptable way to write web pages (other than the pages not displaying in browsers that don't understand the XHTML MIME type)?

I am curious whether I can push this further and wrap the Bootstrap div hell of my own pages in meaningful tags using XML technology, without using JavaScript to parse custom tags.

In particular, it is very difficult to write valid XHTML because many HTML5 tags, such as canvas and nav, are not defined as valid elements, and there are many strange ways for a page to become invalid despite following modern web practices. This is even more of an issue because it makes it impossible to use AngularJS directives to create custom tags, or to use custom tags that are parsed with JavaScript (since I don't know how to extend the existing XHTML doctype so that it treats those tags as valid).

Example:

index.php:

<?php header('Content-Type: application/xhtml+xml'); ?>
<!-- Not intended to be validated, but exploit XHTML benefits anyway -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
[
    <!ENTITY page-title "Daily Bits and Bytes">
]>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head>
    <title>&page-title;</title>
  </head>
  <body>
  </body>
</html>

Consideration:

I am considering dropping the Strict doctype altogether and just using my own doctype while sending the page as application/xhtml+xml. To my understanding, the XHTML DTD is not even looked at by modern browsers, and XHTML does not offer any extra entities/definitions that HTML doesn't have by default, so it seems to add no value to the page, whereas custom entities do.

eg:

<?php header('Content-Type: application/xhtml+xml'); ?>
<!DOCTYPE my-dtd
[
    <!ENTITY page-title "Daily Bits and Bytes">
]>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>&page-title;</title>
  </head>
  <body>
  </body>
</html>
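
To check outside a browser that an entity declared in the internal DTD subset is expanded by an ordinary non-validating XML parser, here is a minimal Python sketch. It assumes the standard-library `xml.etree.ElementTree` parser; the names `PAGE` and `page_title` are made up for this illustration, and a Python parser is of course not a browser, so this only shows that the markup is well-formed XML whose entity resolves.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# The custom-doctype page from the example above, minus the PHP header line.
PAGE = """<!DOCTYPE my-dtd
[
    <!ENTITY page-title "Daily Bits and Bytes">
]>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>&page-title;</title>
  </head>
  <body>
  </body>
</html>"""


def page_title(source):
    """Parse the page and return the text of its XHTML <title> element."""
    root = ET.fromstring(source)
    # Elements live in the XHTML namespace, so the lookup must be qualified.
    title = root.find("{%s}head/{%s}title" % (XHTML_NS, XHTML_NS))
    return title.text


if __name__ == "__main__":
    print(page_title(PAGE))
```

The parser never validates against `my-dtd`; it only reads the internal subset, which is why the doctype name not matching the root element does not stop the entity from being replaced.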

Another benefit

Example, creating a custom color attribute for p tag:

index.xhtml:

<?xml-stylesheet type="application/xml" href="style.xsl"?>
<!DOCTYPE my-dtd
[
    <!ENTITY page-title "Daily Bits and Bytes">
]>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>&page-title;</title>
  </head>
  <body>
    <p color="blue">This paragraph is blue</p>
    <p>hello</p>
  </body>
</html>

style.xsl:

<xsl:stylesheet version="1.0"
                xmlns="http://www.w3.org/1999/xhtml"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <!-- Identity template: copy anything without a more specific rule. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" />
    </xsl:copy>
  </xsl:template>

  <!-- Turn the custom color attribute on p into an inline style. -->
  <xsl:template match="xhtml:p[@color]">
    <xsl:element name="xhtml:p">
      <xsl:attribute name="style">color: <xsl:value-of select="@color" />;</xsl:attribute>
      <xsl:apply-templates select="@*|node()" />
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>

One more benefit

Custom domain specific elements, eg navigation-bar,

turning

<navigation-bar>
  <link to="someplace">text</link>
</navigation-bar>

to

<nav class="navbar navbar-inverse">
  <ul class="nav navbar-nav">
    <li><a href="someplace">text</a></li>
  </ul>
</nav>

without JavaScript (this still makes the page load slower, but once development is over you can optimize away the XSL by serving only the transformation results, which is easier than translating jQuery/JavaScript-based transformations).

index.xhtml:

<?xml-stylesheet type="application/xml" href="style.xsl"?>
<!DOCTYPE my-dtd
[
    <!ENTITY page-title "Daily Bits and Bytes">
]>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>&page-title;</title>
    <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap.min.css" />
  </head>
  <body>
    <navigation-bar>
      <link to="https://stackoverflow.com">StackOverflow</link>
      <link to="https://facebook.com">Facebook</link>
      <link to="https://twitter.com">Twitter</link>
    </navigation-bar>
    <p color="blue">I am blue</p>
    <p>hello</p>
  </body>
</html>

style.xsl:

<xsl:stylesheet version="1.0"
                xmlns="http://www.w3.org/1999/xhtml"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <!-- Identity template: copy anything without a more specific rule. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" />
    </xsl:copy>
  </xsl:template>

  <!-- Turn the custom color attribute on p into an inline style. -->
  <xsl:template match="xhtml:p[@color]">
    <xsl:element name="xhtml:p">
      <xsl:attribute name="style">color: <xsl:value-of select="@color" />;</xsl:attribute>
      <xsl:apply-templates select="@*|node()" />
    </xsl:element>
  </xsl:template>

  <!-- Expand navigation-bar into the Bootstrap nav/ul/li/a structure. -->
  <xsl:template match="xhtml:navigation-bar">
    <xsl:element name="nav">
      <xsl:attribute name="class">navbar navbar-inverse</xsl:attribute>
      <xsl:element name="ul">
        <xsl:attribute name="class">nav navbar-nav</xsl:attribute>
        <xsl:for-each select="xhtml:link">
          <li>
            <xsl:element name="a">
              <xsl:attribute name="href">
                <xsl:value-of select="@to" />
              </xsl:attribute>
              <xsl:value-of select="text()" />
            </xsl:element>
          </li>
        </xsl:for-each>
      </xsl:element>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
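
For the "serve only the transformation results" step, one possible build-time stand-in is sketched below in Python. This is not XSLT: the standard library has no XSLT processor, so the sketch re-implements the two rules (the color attribute and navigation-bar) by hand with `xml.etree.ElementTree`. The names `SOURCE`, `prerender`, `expand_navigation_bar` and `q` are hypothetical, and unlike the stylesheet it drops the custom color attribute from the output.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
ET.register_namespace("", XHTML)  # serialize XHTML elements without a prefix

# Shortened version of the index.xhtml example (hypothetical sample input).
SOURCE = """<!DOCTYPE my-dtd [ <!ENTITY page-title "Daily Bits and Bytes"> ]>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>&page-title;</title></head>
  <body>
    <navigation-bar>
      <link to="https://stackoverflow.com">StackOverflow</link>
      <link to="https://twitter.com">Twitter</link>
    </navigation-bar>
    <p color="blue">I am blue</p>
  </body>
</html>"""


def q(tag):
    """Qualify a local tag name with the XHTML namespace."""
    return "{%s}%s" % (XHTML, tag)


def expand_navigation_bar(bar):
    """Build the Bootstrap nav markup for one navigation-bar element."""
    nav = ET.Element(q("nav"), {"class": "navbar navbar-inverse"})
    ul = ET.SubElement(nav, q("ul"), {"class": "nav navbar-nav"})
    for link in bar.findall(q("link")):
        li = ET.SubElement(ul, q("li"))
        a = ET.SubElement(li, q("a"), {"href": link.get("to", "")})
        a.text = link.text
    nav.tail = bar.tail  # keep the whitespace that followed the original
    return nav


def prerender(source):
    """Return plain XHTML with the custom markup already expanded."""
    root = ET.fromstring(source)
    # Snapshot the element list first so replacing children is safe.
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag == q("navigation-bar"):
                parent[index] = expand_navigation_bar(child)
            elif child.tag == q("p") and "color" in child.attrib:
                # Replace the custom attribute with an inline style.
                child.set("style", "color: %s;" % child.attrib.pop("color"))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(prerender(SOURCE))
```

Because the parser expands `&page-title;` while reading, the output contains the literal title text and no doctype, so the result can be served without the internal subset. With a real XSLT processor available, running the stylesheet itself at build time would be the more direct route.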
Dmytro
  • Any reason you can't avoid custom entities by processing the page server-side and pushing your variables in during render? Or, if you have chosen Angular, make those variables you populate during page load. – aardrian Jun 12 '16 at 18:17
  • @aardrian I could, but I'm curious about this practice in particular because I could send the page as a .xhtml file and it would work in browsers that don't allow JavaScript (for pages that make sense without JavaScript). – Dmytro Jun 12 '16 at 18:19
  • Will it really work on all browsers (regardless of JS)? Regardless, I would not do it. A well-structured HTML 5 page brings all sorts of free accessibility, SEO, usability, etc. benefits, so I wouldn't even consider going back to a deprecated spec. – aardrian Jun 12 '16 at 18:22
  • @aardrian It definitely won't work on all browsers. I know that IE6 would just try to download application/xhtml+xml as if it were any other binary file, but every modern browser I've seen so far seems to act predictably with this. And by dropping the deprecated spec in favor of my own doctype, I remove that issue as well. My question is whether I am missing some fundamental issue with this approach or whether this is indeed okay, in which case it seems silly not to use it over plain HTML for modern web pages. – Dmytro Jun 12 '16 at 18:25
  • I refer you to the first part of my second sentence. Accessibility alone means I would recommend against it versus HTML5. SEO is another. – aardrian Jun 12 '16 at 18:36
  • If you're using custom additions to the DTD anyway, can I refer you to [this answer of mine](http://stackoverflow.com/questions/37675308/canvas-validation-xhtml/37711314#37711314) that addresses the adding of custom _elements_ to XHTML 1 (such as canvas), which then passes the W3C validator. – Mr Lister Jun 13 '16 at 08:00

2 Answers

No, however...

My entire platform uses XHTML5, that is, HTML5 plus the XML parser (application/xhtml+xml). There is always a valid way to write strict code and still achieve your goals. See the link in my profile for my site. You can still use everything XHTML/XML offers along with HTML5. You're lucky I saw this; most people bash XHTML because of the poor path the W3C was taking with XHTML 2.0. You'll have to use the not-a-doctype-doctype, and you'll still be required to use the XML declaration.

Also, you're not doing content negotiation correctly. You need to serve pages as application/xhtml+xml only when the client's user agent explicitly declares support for it. For example, IE7's $_SERVER['HTTP_ACCEPT'] header is */*, which is total BS because IE7 doesn't support squat-diddly. Also, if a client's browser doesn't support application/xhtml+xml, then you shouldn't serve the XML declaration (which also triggers quirks mode in older versions of IE).

   // Serve XHTML (and the XML declaration) only when the Accept header lists it explicitly.
   if (isset($_SERVER['HTTP_ACCEPT']) && stripos($_SERVER['HTTP_ACCEPT'], 'application/xhtml+xml') !== false)
   {
    header('Content-Type: application/xhtml+xml; charset=UTF-8');
    echo '<?xml version="1.0" encoding="UTF-8"?>'."\n";
   }

Make sure that the browser's web development tools (usually "Net" for network requests) show the main page as having an application/xhtml+xml media type (MIME type).
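
The same negotiation rule can be sketched independently of PHP. The Python function below is only an illustration under stated assumptions: `negotiate_content_type` is a made-up name, it falls back to text/html, and it goes slightly beyond the substring check above by ignoring an entry that carries `q=0`. A bare `*/*`, as sent by the IE7 example, does not count as declared support.

```python
XHTML_TYPE = "application/xhtml+xml"
HTML_TYPE = "text/html"


def negotiate_content_type(accept_header):
    """Pick XHTML only if the Accept header lists it explicitly with q > 0."""
    for item in (accept_header or "").split(","):
        parts = [part.strip() for part in item.split(";")]
        if parts[0].lower() != XHTML_TYPE:
            continue  # wildcards such as */* are deliberately not accepted
        quality = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q as "not acceptable"
        if quality > 0:
            return XHTML_TYPE
    return HTML_TYPE


if __name__ == "__main__":
    print(negotiate_content_type("text/html,application/xhtml+xml,*/*;q=0.8"))
    print(negotiate_content_type("*/*"))
```

Whatever the function returns would be used for the Content-Type header, and the XML declaration would be emitted only in the XHTML case.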

John

No.

Invalid XHTML is not XHTML. It's something else altogether. It lacks a definition, and isn't really a "thing" on its own -- it's just broken.

If well-formed XML that adheres to a schema is not important to you, then you don't need XHTML. Don't pretend to be using it. Just use HTML5.

kjhughes
  • If I remove the XHTML xmlns (or change it to another UUID/address), my browser stops rendering the page as HTML. I am not convinced that it is not XHTML; it doesn't follow an existing XHTML schema (strict/transitional/frameset/etc.), but it still seems to be XHTML. Sure, it lacks a definition, but the definition is implied by being HTML, and browsers couldn't care less about the definition; validators do (and they do a really poor job of it, especially on pages with scripts). – Dmytro Jun 12 '16 at 23:46
  • You're missing the point of this answer when you say, "it still seems to be xhtml." There is no ***seems*** because there is no XHTML other than **valid** XHTML. – kjhughes Jun 13 '16 at 01:40