
<meta charset="UTF-8">

As I understand it, UTF-8 is the default encoding in modern browsers, so all this line does is add support for browsers that don't default to it. I don't plan on supporting older browsers. Is there any other reason to add this line of code?

I have heard others say that leaving it out could lead to cross-site scripting (XSS) attacks and other problems, but nobody has given me a clear example.

Also, the HTML validator linked below reports an error when <meta charset="UTF-8"> is left out:

https://validator.w3.org/nu/#file

The character encoding was not declared

The validator then processes the page as windows-1252.

This isn't great, because it could lead to errors if the site has characters that windows-1252 doesn't support.
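
To make that concrete, here is a minimal Python sketch of what a wrong guess looks like. It is only an illustration, not how the validator or any browser is implemented, and the function name misread_as_windows_1252 is made up for this example. It encodes text as UTF-8 and then decodes the same bytes as windows-1252:

```python
def misread_as_windows_1252(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as windows-1252.

    Hypothetical helper for illustration only. errors="replace" is used
    because Python's windows-1252 codec leaves a few byte values undefined.
    """
    utf8_bytes = text.encode("utf-8")
    return utf8_bytes.decode("windows-1252", errors="replace")


# Plain ASCII survives, because both encodings agree on those bytes.
print(misread_as_windows_1252("cafe"))  # cafe

# Non-ASCII characters turn into mojibake: "é" is two bytes in UTF-8,
# and each byte is shown as a separate windows-1252 character.
print(misread_as_windows_1252("café"))  # cafÃ©
```

So a page that contains only ASCII looks fine either way, and the problem only shows up once a non-ASCII character appears.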

I'm guessing this only happens in browsers that don't default to UTF-8, though. Should I be worried about this warning/error if I leave the line out?

I have tried to research why the UTF-8 declaration is used, but I can't find a definite answer on whether or not to include it.

Thanks in advance.

Porter
  • Not sure if this can still happen, but in the past it might have been possible to trick a browser into switching to a different encoding, even when there are no XSS errors, if the user can inject something that looks like this meta tag in the first x bytes. – Evert Dec 05 '22 at 03:10
  • "I'm guessing this only happens in browsers that don't default to UTF-8, though." Well, you're not wrong. :-) The problem is, no browser defaults to UTF-8. – JimmiTh Jan 19 '23 at 21:43
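
For readers wondering what an encoding-switching attack could look like: the comments above do not name a specific encoding, but one historically cited case is UTF-7, which some older browsers could be pushed into using when a page declared no encoding. The Python sketch below is only an illustration under that assumption, and the payload is a made-up example. It shows text that contains no angle brackets as ASCII, yet decodes to markup as UTF-7:

```python
# Hypothetical payload for illustration: as ASCII it contains no "<" or ">",
# so a filter that only escapes angle brackets would let it through.
payload = b"+ADw-script+AD4-alert(1)+ADw-/script+AD4-"

print(b"<" in payload)  # False

# If the bytes are interpreted as UTF-7, the "+ADw-" and "+AD4-" sequences
# become "<" and ">", and the text turns into a script element.
decoded = payload.decode("utf-7")
print(decoded)  # <script>alert(1)</script>
```

To my knowledge, current browsers no longer support UTF-7, so this is mainly of historical interest. Declaring the encoding explicitly removes the guesswork this kind of trick relies on.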

1 Answer


For anyone looking back on this post: I did some more research and found more reasons to include this line of code. I found this page from Google's developer site, which states:

https://web.dev/charset/#resources

Lighthouse flags pages that do not specify their character encoding:

So, leaving out <meta charset="UTF-8"> would lower your Google Lighthouse score.

Here is why it is considered a best practice:

Servers and browsers communicate with each other by sending bytes of data over the internet. If the server doesn't specify which character encoding format it's using when it sends an HTML file, the browser won't know what character each byte represents. The character encoding declaration specification solves this problem.

Here is the lighthouse doc

From my understanding, this line of code isn't strictly necessary anymore, but it is considered a best practice. I hope this helps anyone reading this.

Porter
  • It's not just best practice. It's absolutely necessary if your page uses non-ASCII characters and your server doesn't include the character encoding in the Content-Type header, which most servers don't by default. No browsers assume UTF-8. They'll infer from the content, and in many cases - particularly in this day and age, where content is generated from JavaScript - they'll only see ASCII, and hence won't be able to. In those cases, they'll tend to assume Windows-1252. And be wrong. – JimmiTh Jan 19 '23 at 21:03
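
The order of precedence described in the comment above can be sketched roughly in Python. This is a deliberately simplified stand-in, not a real browser's algorithm: it ignores byte order marks and content sniffing, the function name pick_encoding and both regular expressions are invented for this example, and the 1024-byte window reflects the usual guidance that the declaration should appear near the start of the document.

```python
import re

# Hypothetical patterns for illustration; real parsers are stricter.
HEADER_CHARSET = re.compile(r'charset\s*=\s*"?([A-Za-z0-9_-]+)', re.IGNORECASE)
META_CHARSET = re.compile(
    rb"""<meta[^>]*charset\s*=\s*["']?([A-Za-z0-9_-]+)""", re.IGNORECASE
)


def pick_encoding(content_type, body, fallback="windows-1252"):
    """Return the encoding a simplified client would use for an HTML body."""
    # 1. A charset parameter in the Content-Type header takes priority.
    if content_type:
        match = HEADER_CHARSET.search(content_type)
        if match:
            return match.group(1).lower()
    # 2. Otherwise look for a <meta charset> near the start of the document.
    match = META_CHARSET.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    # 3. Nothing was declared, so fall back to a legacy default.
    return fallback


page = "<!doctype html><title>café</title>".encode("utf-8")

print(pick_encoding("text/html", page))                 # windows-1252
print(pick_encoding("text/html; charset=utf-8", page))  # utf-8
print(pick_encoding("text/html", b'<meta charset="UTF-8">' + page))  # utf-8
```

In the first call nothing declares the encoding, so the sketch falls back to windows-1252 and the "é" in the title would be decoded incorrectly. Either the header or the meta tag is enough to avoid that.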