I have three database environments, MAMP, testing server, actual server. Testing server is just an old site that doesn't use its database.

On both MAMP and the testing server, the accented letters I need to display (í á ñ) show up fine, so the font is not the problem. On the actual server, however, I'm getting those lovely black question mark squares ����.

So I guessed it was the charset and looked into it, but all the databases have the same charset, latin1_swedish_ci.

I checked the settings using the method described in this post and then ran SET NAMES 'utf-8'; in the SQL but the problem persists.

I also changed the charset on the database and on all of the individual tables but this seems to have no effect.

The only way I have been able to affect how the browser interprets the text is to change the META charset to latin1 (I'm actually using ini_set('default_charset', 'utf-8');). The problem is that this replaces the � with all sorts of random symbols.

Can someone please help me identify the cause? I'm out of ideas.

I'm using Sublime Text 2. Could this be the problem? I've tried saving the files with UTF-8 encoding, and that hasn't had much effect either.

Adam Brown

1 Answer

If you see the � UNICODE REPLACEMENT CHARACTER, that means the data is interpreted as UTF-8 (or another Unicode encoding) but is not actually valid UTF-8. Override your browser to interpret the data in some other encoding (View menu → Encoding in most browsers) to figure out what the data is actually encoded in. Once you have figured that out either:

  1. Change the encoding of the data to match the declared encoding; do this by setting the database connection encoding (mysql_set_charset, SET NAMES ... or similar, depending on your API). The collation and encoding of the individual columns is irrelevant since MySQL converts encodings on the fly to the connection encoding.
  2. Change the declared encoding by setting the correct Content-Type HTTP header and/or <meta> tag; ini_set('default_charset', ...) will do that, but the web server may override it. Inspect the actual HTTP headers using browser tools.
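To make the mismatch concrete, here is a minimal sketch in Python (not the asker's PHP stack) of what happens at the byte level. The sample string and the helper name `decode_candidates` are assumptions made for illustration. The sketch also assumes the real data is Windows-1252/Latin-1, which is what the thread suggests but does not prove.

```python
# Hypothetical sample: the accented letters from the question, as Latin-1 bytes.
raw = "í á ñ".encode("latin-1")

# What a browser does when the page is declared UTF-8 but the bytes are not:
# each byte that is invalid as UTF-8 becomes U+FFFD, the replacement character.
shown_as_utf8 = raw.decode("utf-8", errors="replace")


def decode_candidates(data, encodings):
    """Try each candidate encoding strictly; None means the bytes are invalid in it."""
    results = {}
    for name in encodings:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


# Roughly what overriding the encoding in the browser does by hand.
candidates = decode_candidates(raw, ["utf-8", "cp1252"])

# Option 1 from the list: transcode the data so it matches a UTF-8 declaration.
fixed = raw.decode("cp1252").encode("utf-8")
```

Note that single-byte encodings such as Latin-1 accept almost any byte sequence, so a successful decode only shows the bytes are *possible* in that encoding. You still have to look at the result and judge whether it reads correctly.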

First make sure the data in your database is actually okay and is not already garbage. If you need more information, see What Every Programmer Absolutely, Positively Needs To Know About Encodings And Character Sets To Work With Text and Handling Unicode Front To Back In A Web App.
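One common form of "already garbage" is double encoding: UTF-8 bytes that were misread as Windows-1252 and then stored again. The following Python sketch shows one way to test a string for that pattern. The function name `undo_double_encoding` is hypothetical, and the check is a heuristic under the assumption that cp1252 was the wrong encoding involved. It is not a general repair tool.

```python
def undo_double_encoding(text):
    """Return the repaired string if `text` looks like UTF-8 misread as cp1252, else None."""
    try:
        # Reverse the wrong decode, then decode the bytes as what they really were.
        repaired = text.encode("cp1252").decode("utf-8")
    except UnicodeError:
        # Not representable in cp1252, or the bytes are not valid UTF-8:
        # the text does not fit the double-encoding pattern.
        return None
    # Pure ASCII round-trips unchanged, which tells us nothing.
    return repaired if repaired != text else None


# Simulate the damage: UTF-8 bytes for "á" read as cp1252 give two characters.
garbled = "á".encode("utf-8").decode("cp1252")
```

Here `undo_double_encoding(garbled)` recovers `"á"`, while a correctly stored `"á"` or plain ASCII text yields `None`.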

deceze
  • Thanks for the answer. I'm going to accept it, even though I fixed the problem late Friday, because it's interesting reading. I fixed it by re-encoding the files as Windows-1252 and setting the charset to latin1. Changing the charset to latin1 was only affecting the text ON the page, so I changed the encoding of the page files as well. Thanks again – Adam Brown Jul 08 '13 at 18:09