The answer here addresses the issue only in a nested JavaScript context within an HTML attribute context, whereas your question asks specifically about pure HTML context escaping.
In that question, the escaping should be as per the OWASP recommendation for JavaScript:
Except for alphanumeric characters, escape all characters with the \uXXXX unicode escaping format (X = Integer).
This already covers &, because it is not alphanumeric.
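As a rough illustration of that rule, here is a minimal Python sketch. The function name escape_for_js_string is hypothetical, and treating only ASCII letters and digits as "alphanumeric" is an assumption on my part. It is not OWASP's reference implementation.

```python
def escape_for_js_string(value: str) -> str:
    """Escape everything except ASCII letters and digits as \\uXXXX."""
    out = []
    for ch in value:
        if ch.isascii() and ch.isalnum():
            out.append(ch)
            continue
        # JavaScript \uXXXX escapes are UTF-16 code units, so a character
        # outside the Basic Multilingual Plane becomes a surrogate pair.
        data = ch.encode("utf-16-be", "surrogatepass")
        for i in range(0, len(data), 2):
            unit = int.from_bytes(data[i:i + 2], "big")
            out.append(f"\\u{unit:04x}")
    return "".join(out)


if __name__ == "__main__":
    print(escape_for_js_string("a&b"))
    print(escape_for_js_string("</script>"))
```

Because & is treated like any other non-alphanumeric character, it needs no special case here.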
To answer your question: from a practical point of view, why wouldn't you escape the ampersand?
The HTML representation of & is &amp;, so it makes a lot of sense to do that. If you didn't, any time a user entered &amp;, &lt;, or &gt; into your application, your application would render &, <, or > instead of &amp;, &lt;, or &gt;.
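To make the rendering problem concrete, here is a small Python sketch. The function names are hypothetical, and html.unescape from the standard library stands in for the browser's entity decoding, which is only an approximation of what a real browser does.

```python
import html


def escape_html(text: str) -> str:
    # Replace & first; otherwise the & inside the entities produced by the
    # later replacements would itself be escaped a second time.
    return (
        text.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace('"', "&quot;")
        .replace("'", "&#x27;")
    )


def escape_html_without_ampersand(text: str) -> str:
    # Deliberately incomplete variant that leaves & untouched.
    return (
        text.replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace('"', "&quot;")
        .replace("'", "&#x27;")
    )


if __name__ == "__main__":
    user_input = "Type &lt; to get a less-than sign"
    # What the user would see after the page decodes each version:
    print(html.unescape(escape_html(user_input)))
    print(html.unescape(escape_html_without_ampersand(user_input)))
```

With the full escaper the user sees exactly what they typed. With the variant that skips &, the literal text &lt; they entered is displayed as <.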
An edge case? Definitely. A security concern? It shouldn't be.
From the HTML5 syntax Character references section:
Character references must start with a U+0026 AMPERSAND character (&).
Following this, there are three possible kinds of character
references:
- Named character references
- Decimal numeric character reference
- Hexadecimal numeric character reference
When an & is encountered:
Switch to the data state.
Attempt to consume a character reference, with no additional allowed
character.
If nothing is returned, emit a U+0026 AMPERSAND character (&) token.
Otherwise, emit the character tokens that were returned.
Therefore, anything after the & will cause either a literal & to be output or the character that the reference represents. Because the characters that follow have to be alphanumeric (or # for a numeric reference) or else they won't be consumed, there is no chance of a syntax character (e.g. ', ", >, <) being consumed and ignored, so there is little security risk of an attacker changing the parsing context. However, you never know whether a browser bug deviates from the standard, so I would always escape &. Internet Explorer had an issue where you could specify <% and it would be interpreted as <, allowing .NET Request Validation to be bypassed for XSS attack vectors. Always better to be safe than sorry.
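The parsing behaviour described above can be checked roughly with Python's html.unescape, which the standard library documents as following the HTML5 character reference rules. This is an approximation of a standards-compliant parser, not a test of any particular browser, and the helper name decode_text is hypothetical.

```python
import html


def decode_text(text: str) -> str:
    """Decode character references roughly as an HTML5 parser would in text content."""
    return html.unescape(text)


if __name__ == "__main__":
    samples = [
        "&amp;",      # named reference
        "&#60;",      # decimal numeric reference
        "&#x3c;",     # hexadecimal numeric reference
        "&'",         # & followed by a quote: nothing to consume
        '&"',
        "&>",
        "&<",
        "&amp;lt;",   # decoded once only
    ]
    for sample in samples:
        print(repr(sample), "->", repr(decode_text(sample)))
```

In each case where the & is not followed by a valid reference, the & is emitted literally and the following quote or angle bracket is left in place rather than swallowed.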