
This is baffling me, and I am trying to understand the reason why. When JavaScript is embedded in an HTML5 file, it is read as UTF-8. However, if I link to an external JavaScript file, it is no longer read as UTF-8. Here is what I have discovered:

I have an index.html file which uses <!DOCTYPE html>. As I understand it, because this is HTML5, the encoding should default to UTF-8.

  1. WORKS: If I have Unicode characters within my internal JavaScript code, they show up fine.
  2. WORKS: If I have special Unicode characters in the index.html document, and link to an external JavaScript file also containing Unicode characters, the Unicode characters show up fine.
  3. DOES NOT WORK: If I do NOT have special Unicode characters in the index.html document, and link to an external JavaScript file containing Unicode characters, the Unicode characters do not show up as expected. Instead, they show up as some other cryptic characters.
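To illustrate the symptom in case 3: "cryptic characters" of this kind are consistent with the script's UTF-8 bytes being decoded with a single-byte legacy encoding. The sketch below reproduces that effect in Node.js. The sample character and the use of `latin1` as a stand-in for whatever fallback encoding the browser actually picks are assumptions made for illustration, not details from the original files.

```javascript
// Sketch (Node.js): what happens when UTF-8 bytes are decoded with a
// single-byte legacy encoding instead of UTF-8.
// The sample character is illustrative; it is written as an escape so this
// file's own encoding does not affect the result.
const original = "\u00e9"; // "é"

// Encode to UTF-8: "é" becomes the two bytes C3 A9.
const utf8Bytes = Buffer.from(original, "utf8");

// Correct decoding: the two bytes are read back as one character.
const decodedAsUtf8 = utf8Bytes.toString("utf8");

// Wrong decoding (assumed fallback): each byte becomes its own character.
const decodedAsLatin1 = utf8Bytes.toString("latin1");

console.log(decodedAsUtf8);   // é
console.log(decodedAsLatin1); // Ã©
```

If the garbled text on the page looks like pairs such as `Ã©` where a single accented character was expected, that points to this kind of mismatch rather than to a corrupted file.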

So, why does #2 work above, but not #3? I can get #3 to work by adding <meta charset="utf-8"/> to the top of the index.html file. However, I am confused as to why I need this line if having <!DOCTYPE html> is supposed to make the document default to UTF-8 anyhow.

If the reason is because the default UTF-8 only applies to that file and not external files, then why does it work for #2?

PS. I have tried creating these files in both TextEdit and Xcode (on macOS). In Xcode at least, the default character encoding does seem to be set to UTF-8.

kojow7
  • Do you have a `<meta charset>` tag in your HTML document head? The encoding is something that you have to explicitly manage if you want control. – Pointy Jan 28 '19 at 04:35
  • @Pointy Yes, that is stipulated in my question, unless something about what I said wasn't clear? – kojow7 Jan 28 '19 at 04:56
  • It wasn't clear because there is zero code in your question. You need the "charset" meta tag because that's how HTML5 is supposed to work. – Pointy Jan 28 '19 at 05:10
  • There are two things you need for an HTML5 document: the "html" DOCTYPE and the UTF-8 meta tag. – Pointy Jan 28 '19 at 05:13
  • That's part of my question. Why do I need to add it if UTF-8 is the default encoding for HTML5? – kojow7 Jan 28 '19 at 05:37
  • [another SO question](https://stackoverflow.com/questions/12406066/does-html5-specify-a-default-character-encoding-for-html-documents-if-no-charact) – Pointy Jan 28 '19 at 05:44
  • @kojow7: browsers which do not know about HTML5 can still display most of the content of HTML5 pages. Most of them probably know HTML5 by now, but there are always transition periods. Additionally, web servers can serve mixed content (not all HTML5), so they could deliver a wrong or old encoding. It is just safer to declare it (though with time the exceptions will probably stop mattering). – Giacomo Catenazzi Jan 28 '19 at 09:47
  • @GiacomoCatenazzi I am using the latest version of Chrome, so according to the documentation it should default to UTF-8. I'm also not using a web server at the moment, just plain text files (html, js) on the local machine. – kojow7 Jan 28 '19 at 15:58
  • Make sure your JavaScript file is also utf-8 encoded. – Musa Jan 28 '19 at 17:57
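
One way to act on the last suggestion is to check whether the script's bytes are well-formed UTF-8. The following is a minimal sketch for Node.js, assuming a build with ICU support (the default for official releases); the helper name `isValidUtf8` and the sample bytes are hypothetical and only for illustration.

```javascript
// Sketch (Node.js): report whether a byte sequence is well-formed UTF-8.
// `isValidUtf8` is a hypothetical helper name, not part of any library.
function isValidUtf8(bytes) {
  try {
    // With fatal: true, decode() throws a TypeError on malformed input
    // instead of silently substituting U+FFFD.
    new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return true;
  } catch {
    return false;
  }
}

// "é" saved as UTF-8 is the two bytes C3 A9.
const savedAsUtf8 = Uint8Array.of(0xc3, 0xa9);
// "é" saved as ISO-8859-1 is the single byte E9, which is not valid UTF-8.
const savedAsLatin1 = Uint8Array.of(0xe9);

console.log(isValidUtf8(savedAsUtf8));   // true
console.log(isValidUtf8(savedAsLatin1)); // false
```

For a real file, the bytes could come from `fs.readFileSync(path)`, which returns a `Buffer` that `decode()` accepts directly. A pure-ASCII file also passes this check, so a `true` result only rules out a non-UTF-8 save; it does not tell you how the browser will decode the file.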
