In the HTML DTDs up to and including HTML 4.01, comments are frequently embedded inside declarations.

For example, the HTML 2.0 Strict DTD has:

<!ENTITY % HTML.Version
    "-//IETF//DTD HTML 2.0 Strict//EN"

        -- Typical usage:

            <!DOCTYPE HTML PUBLIC
        "-//IETF//DTD HTML Strict//EN">
        <html>
        ...
        </html>
    --
    >

where the HTML.Version parameter entity declaration contains a comment delimited by a pair of double hyphens (--).

However, DTD validators seem to reject these internal comments outright and report an error.

Are the validators wrong, or are the W3C DTDs not well-formed?


Answer:

After looking into it further, it seems this is due to differences between the SGML and XML specifications.

Essentially, SGML allows comments, each beginning and ending with --, between the parameters of a markup declaration (<! ... >). XML instead requires every comment to be an independent construct that begins with <!-- and ends with -->, and does not permit comments inside other declarations.

Because HTML up to version 4.01 was based on SGML, comments within declarations were allowed and were used by the official DTDs.

However, most DTD validators seem to check only for compliance with the simpler XML specification and therefore reject intra-declaration comments with an error.
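
As an illustration only (this is not text from any official DTD), here is a sketch of how the declaration from the question could be rewritten so that an XML-only DTD parser accepts it. The comment is moved out of the entity declaration and becomes a standalone comment in front of it. XML also forbids -- inside a comment body, which is not a problem for this particular text.

```dtd
<!-- Typical usage:

    <!DOCTYPE HTML PUBLIC
        "-//IETF//DTD HTML Strict//EN">
    <html>
    ...
    </html>
-->
<!ENTITY % HTML.Version
    "-//IETF//DTD HTML 2.0 Strict//EN">
```

The entity's value is unchanged; only the position and delimiters of the comment differ. Rewriting one declaration does not make the whole HTML 2.0 DTD valid as an XML DTD, since it relies on other SGML-only features as well.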

user339676
  • This should be valid, but only in SGML syntax (not XML). Which parser are you using? (As a reminder, HTML is defined by an SGML DTD.) – potame Oct 20 '15 at 08:21
  • It seems you're right. I was using the [Validome validator](http://www.validome.org/grammar/), which is only for validating XML DTDs. (I didn't know the DTD specifications for XML and SGML were different when I wrote the question). Thanks! – user339676 Oct 20 '15 at 08:34
