
I have a pretty good understanding of the Content-Type header for most cases. For the following four examples, I understand that you would normally follow the MIME type with charset=your-charset-here.

Content-Type "text/plain; charset=utf-8"
Content-Type "text/html; charset=utf-8"
Content-Type "text/javascript; charset=utf-8"
Content-Type "text/xml; charset=utf-8"

... and with images, no charset:

Content-Type "image/gif"
Content-Type "image/x-icon"
etc.

But what about these two? Should they or shouldn't they include the charset?

Content-Type "application/x-javascript"
Content-Type "application/xml"

I realize it's okay if they don't include the charset, but I would like to include it, if it's possible. They are just text-based files, after all.

sysadmin1138
Jeff

1 Answer

Content-Type "text/xml; charset=utf-8"

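This is largely redundant. XML has its own inline mechanism for this, the encoding given in the <?xml?> declaration, and if the XML declaration is omitted you've got UTF-8 anyway. Strictly speaking, RFC 3023 makes a charset parameter in the Content-Type header authoritative over the declaration, so sending one adds nothing when the two agree and causes trouble when they don't.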

I would normally leave the charset out for XML. Given that XML has its own perfectly good inline character encoding mechanism, the charset parameter on the Content-Type header is unneeded, and it can only get in the way by accidentally declaring the wrong encoding for files that have no encoding specified and are treated as UTF-8 everywhere else.

The one time you do need a charset parameter for XML is when you're serving a non-ASCII-compatible character set, usually UTF-16, where otherwise the parser wouldn't get as far as reading <?xml. But it's pretty rare you'd ever want to do that. UTF-16 isn't a great file storage/over-the-wire format.
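To make the inline mechanism concrete, here is a minimal Python sketch of how a consumer might guess an XML document's encoding when no charset parameter is sent: check for a byte-order mark, then for an ASCII-readable encoding declaration, then fall back to UTF-8. The function name `sniff_xml_encoding` is hypothetical, and this is a simplification, not the full autodetection procedure described in the XML specification.

```python
import re

# Byte-order marks checked before looking for an XML declaration.
_BOMS = [
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xff\xfe", "utf-16-le"),
    (b"\xfe\xff", "utf-16-be"),
]

# Matches encoding="..." inside a leading <?xml ...?> declaration.
# This only works when the bytes are ASCII-compatible.
_DECL = re.compile(
    rb"""^<\?xml[^>]*?\bencoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def sniff_xml_encoding(data: bytes) -> str:
    """Guess the encoding of an XML document from its first bytes (hypothetical helper)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    # No BOM and no readable declaration: assume the XML default.
    return "utf-8"
```

For example, `sniff_xml_encoding(b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>')` returns `"iso-8859-1"`, while a bare `b"<a/>"` falls through to `"utf-8"`.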

Content-Type "application/xml"

The application/xml media type is specified by RFC 3023, which explicitly defines a charset parameter for it. So you can use charset if you want (though, as per the above, I generally don't).
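If you want to check what a given header value actually declares, the following sketch uses Python's standard `email.message.Message` class to split a Content-Type value into its media type and charset parameter. The helper name `charset_of` is made up for illustration; it only parses the header and says nothing about how a particular client will act on it.

```python
from email.message import Message


def charset_of(content_type: str):
    """Return (media_type, charset) for a Content-Type header value.

    charset is None when no charset parameter is present.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_type(), msg.get_content_charset()
```

With this, `charset_of("application/xml; charset=UTF-8")` gives `("application/xml", "utf-8")`, and `charset_of("application/xml")` gives `("application/xml", None)`.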

Content-Type "application/x-javascript"

This is an unofficial type, so there is no specification to say whether a charset parameter exists or what it might do. It should probably be avoided in favour of text/javascript (traditional) or application/javascript (defined by RFC 4329).

In practice, setting a charset on your JavaScript resources isn't necessarily enough, as IE completely ignores it.

Summary of the precedence (highest to lowest) given to scripting character set mechanisms:

  • IE: <script charset> attribute, charset of parent page

  • Opera: charset of script file, charset of parent page

  • Mozilla, Webkit: charset of script file, <script charset> attribute, charset of parent page
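The precedence list above can be written down as a small lookup, which makes it easier to see what each browser family ends up using for a given combination of settings. This is only a sketch of the list as stated in this answer: the key names (`script_file` for the charset served with the script itself, `script_attr` for the `<script charset>` attribute, `page` for the parent page's charset) and the function name are hypothetical labels, not anything a browser exposes.

```python
from typing import Optional

# Precedence per browser family, highest first, as listed in the answer.
PRECEDENCE = {
    "ie": ["script_attr", "page"],
    "opera": ["script_file", "page"],
    "mozilla": ["script_file", "script_attr", "page"],
    "webkit": ["script_file", "script_attr", "page"],
}


def effective_script_charset(
    browser: str,
    script_file: Optional[str] = None,
    script_attr: Optional[str] = None,
    page: Optional[str] = None,
) -> Optional[str]:
    """Pick the first charset source that is set, in the browser's precedence order."""
    sources = {"script_file": script_file, "script_attr": script_attr, "page": page}
    for key in PRECEDENCE[browser]:
        if sources[key]:
            return sources[key]
    return None
```

For instance, with a script served as UTF-8 from a page in ISO-8859-1 and no `<script charset>` attribute, `effective_script_charset("ie", script_file="utf-8", page="iso-8859-1")` yields `"iso-8859-1"`, whereas the same call for `"mozilla"` yields `"utf-8"`.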

bobince
  • the more definitive answer, +1 – meder omuraliev Dec 14 '09 at 03:44
  • @bobince, Thanks once again. In case anybody is wondering, Google uses `text/javascript; charset=utf-8`, the WordPress devs use `application/x-javascript; charset=utf-8` for the admin back-end, and most other sites just use `application/x-javascript` or `text/javascript` without any charset defined in the headers. – Jeff Dec 14 '09 at 14:09