
I've configured my Apache 2.4 server with AddDefaultCharset utf-8 in httpd.conf, and my .htaccess file redirects all non-www and http requests to https://www.example.com:

RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$ [OR]
RewriteCond %{HTTPS} !on
RewriteRule ^(.*)$ https://www.example.com/$1 [R=301,L,NE]

If I look at the HTTP response headers, only requests sent to https://www.example.com produce a UTF-8 response. The non-www and http requests (the ones that get redirected) respond with an ISO-8859-1 charset.

Does anyone know how to ensure all URL-redirect HTTP responses are sent as UTF-8?

user46688

2 Answers

This would seem to be default behaviour, since the Apache redirect response is encoded as ISO-8859-1. However, Apache does allow you to suppress the charset parameter in the response by setting the suppress-error-charset environment variable:

# set desired env variable to suppress iso-8859-1 charset
SetEnvIf Host ^ suppress-error-charset

However, it cannot be changed to a different charset.

Reference:

MrWhite
  • Thanks @w3dk, the 2nd link states `Sending error pages without a specified character set may allow a cross-site-scripting attack for existing browsers (MSIE) which do not follow the HTTP/1.1 specification and attempt to "guess" the character set from the content. Such browsers can be easily fooled into using the UTF-7 character set, and UTF-7 content from input data (such as the request-URI) will not be escaped by the usual escaping mechanisms designed to prevent cross-site-scripting attacks.` Is my interpretation correct that using `suppress-error-charset` thus opens a security vulnerability? – user46688 Jan 09 '17 at 00:53
  • It would seem so. However, that doesn't necessarily mean your site is vulnerable. Whether an Apache "redirect" would be vulnerable too, I don't know. It's hard to imagine, since the HTML body of a redirect response should never normally be processed by the client. For "custom" error documents you can set the `charset` parameter yourself (to UTF-8). – MrWhite Jan 09 '17 at 01:13

I found a way to change the charset instead of removing it:

Header always edit Content-Type 'iso-8859-1' 'utf-8'

This will apply to all requests, but if you're not using ISO-8859-1 anywhere, that's not really a problem.
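
If you'd rather keep the match a little narrower, here is a possible variant of the directive above. It is only a sketch: it assumes mod_headers is loaded and that Apache sends the value as `text/html; charset=iso-8859-1` in lowercase, which is what I'd expect for its generated redirect pages but haven't verified on every build. The `<IfModule>` wrapper is my addition, not part of the answer above.

```apache
# Sketch only: assumes mod_headers is available; the guard skips the
# directive instead of raising a config error when the module is missing.
<IfModule mod_headers.c>
    # Match the charset parameter rather than the bare string "iso-8859-1",
    # so only "charset=iso-8859-1" in the Content-Type value is rewritten.
    # "always" makes the edit apply to non-2xx responses as well, such as
    # the 301 page Apache generates for the redirect.
    Header always edit Content-Type "charset=iso-8859-1" "charset=utf-8"
</IfModule>
```

After reloading, requesting the http or non-www URL and checking the Content-Type header of the 301 response should show whether the edit took effect.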