In a JSP file, the source code is:

|&#x0031;&#x0080;&#x0033;|<%="\u0031\u0080\u0033" %>|

The result on the page is:

|1€3|13|

Why is the Euro symbol rendered by the first expression but not by the second?

Andrea Ligios
Evgeny

1 Answer

HTML numeric character references in the range 0x80–0x9F don't actually correspond to the characters U+0080–U+009F. Instead, they refer to the characters that the windows-1252 encoding maps to the bytes 0x80–0x9F.

This is a weird historical artefact from the days before browsers did Unicode. HTML5 sort of standardises it: such a reference is invalid (a parse error), but parsers are nonetheless required to interpret it this way. This does not happen in XML/XHTML.

So \u0080 gives you the actual character U+0080, which you can't see because it's an invisible control character, whereas &#x0080; gives you code page 1252 byte 0x80, which is U+20AC Euro Sign.
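
To see the two code points side by side, here is a minimal Java sketch. The class name `CharRefDemo` and the helper `fromCp1252Byte` are made up for illustration, and it assumes the JRE provides the windows-1252 charset (the common JDKs do). It only mimics the substitution the browser performs for `&#x0080;`; it is not how a browser or the JSP container implements it.

```java
import java.nio.charset.Charset;

public class CharRefDemo {
    // Hypothetical helper: decode a single byte as windows-1252 text.
    static String fromCp1252Byte(int b) {
        return new String(new byte[] { (byte) b }, Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        // What the JSP expression emits: the real character U+0080.
        String literal = "\u0080";
        // What an HTML parser substitutes for the reference: cp1252 byte 0x80.
        String viaCp1252 = fromCp1252Byte(0x80);

        System.out.printf("Java literal: U+%04X%n", (int) literal.charAt(0));   // U+0080
        System.out.printf("cp1252 0x80:  U+%04X%n", (int) viaCp1252.charAt(0)); // U+20AC
        // U+0080 is a C1 control character, hence nothing visible on the page.
        System.out.println(Character.isISOControl(literal.charAt(0)));          // true
    }
}
```

If the goal is simply to print a Euro sign from the JSP expression, writing `\u20AC` in the Java string avoids the ambiguity altogether.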

bobince