I am trying to save an HTML page in Windows-1255 using Notepad++. I chose the relevant encoding (Encoding -> Character sets -> Hebrew -> Windows-1255) and declared the charset in the HTML head as:

<meta charset="windows-1255">.

But it seems as if the page is still in UTF-8.

Validating the page at https://validator.w3.org reported these errors:

  • Error: Legacy encoding Windows-1255 used. Documents must use UTF-8.

  • Error: The only allowed value for the charset attribute for the meta element is utf-8.

And the rendering suggests the same (symbols are displayed as they would be in UTF-8).

Any suggestions?

  • The validator error message is actually quite clear. It even says that the document is *not* in UTF-8, so I wonder why you came to assume that "it seems as if the page is still in UTF-8". What are you trying to achieve and why can't you use UTF-8? – Constantin Groß Oct 09 '19 at 17:38
  • Windows-1255 is a different encoding from UTF-8 – noɥʇʎԀʎzɐɹƆ Oct 09 '19 at 18:07
  • I was asked to demonstrate one page in Windows-1255 (my other pages on the site that I'm building are in UTF-8). So if it's in Windows-1255, why are there errors, and what do they mean? I also tried putting symbols in the text that are supposed to be presented differently in Windows-1255 vs UTF-8, and they are shown as in UTF-8 (in Chrome). Hope this clarifies my question, many thanks for the help! – just a rookie Oct 09 '19 at 22:45
  • The W3 validator is just being overly strict. It is perfectly valid to use a charset other than UTF-8, though there is no guarantee that all browsers will accept it. Other validators (like [this one](https://html5.validator.nu/) and [this one](https://www.freeformatter.com/html-validator.html)) flag the use of Windows-1255 as a warning rather than an error. You *should* use UTF-8 for greatest interoperability with all browsers. – Remy Lebeau Oct 09 '19 at 23:27
  • "*I tried to put in the text symbols that are supposed to be presented differently in Windows-1255 vs UTF-8 and it's shown like in UTF-8*" - that makes no sense. What symbols are you referring to? There is only 1 set of symbols, those defined by the Unicode standard. A charset is just a byte encoding for Unicode symbols. Properly encoded UTF-8 renders all of the same symbols (and more) that properly encoded Windows-1255 renders. – Remy Lebeau Oct 09 '19 at 23:31
  • 1) Which version/doctype of HTML are you using? That sets some limits and guides the validator. 2) meta declares an encoding; it doesn't make it so. 3) "one page in Windows-1255; other pages…are in UTF-8" makes for a pretty complicated web server setup if the web server sends charset in an HTTP Content-Type response header. Could that header be wrong? If present, it overrides any other determination of character encoding. – Tom Blodget Oct 10 '19 at 01:40
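
To make the points in the comments concrete, here is a minimal Python sketch (not part of the original thread). It shows that the same Hebrew text becomes different bytes in UTF-8 and in Windows-1255, and gives one rough way to check which of the two a saved file actually contains. The sample string and the helper name `looks_like_utf8` are assumptions made for illustration.

```python
# Hypothetical sample text: the Hebrew word "shalom".
text = "שלום"

# Same characters, two different byte encodings.
utf8_bytes = text.encode("utf-8")            # two bytes per Hebrew letter
cp1255_bytes = text.encode("windows-1255")   # one byte per Hebrew letter

print(utf8_bytes.hex())
print(cp1255_bytes.hex())


def looks_like_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hebrew text saved as Windows-1255 is not valid UTF-8, so this check can
# tell the two apart for a file that actually contains Hebrew letters.
print(looks_like_utf8(utf8_bytes))
print(looks_like_utf8(cp1255_bytes))
```

To apply this to the saved page, read the file in binary mode and pass its contents to the helper, for example `looks_like_utf8(open("page.html", "rb").read())`, where `page.html` is a placeholder file name. A file containing only ASCII characters passes the check in either encoding, so the test is only meaningful when the page includes Hebrew text.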

0 Answers