
I have a database in latin1 format; all the UTF-8 characters stored in it are displayed as ????:

 +------+---------+-------+---------+--------------------+----------+--------------+---------------------+---------------------+-----------+
 | id   | user_id | fname | lname   | designation        | location | email        | created_at          | updated_at          | country   |
 +------+---------+-------+---------+--------------------+----------+--------------+---------------------+---------------------+-----------+
 | 6035 |    6035 | ????? | ??????? | ???????? ????????? |          | ccc@rddd.net | 2011-04-11 06:05:54 | 2011-04-10 06:13:04 | xxxxxxxxx |
 +------+---------+-------+---------+--------------------+----------+--------------+---------------------+---------------------+-----------+

Now I use these commands to change the character set of the table and the database to utf8:

  ALTER TABLE <table_name> CONVERT TO CHARACTER SET utf8;

  ALTER DATABASE <database_name> CHARACTER SET utf8;

I have read that latin1 uses 1 byte for every character but utf8 uses 3 bytes for every character. My question is: if I alter my table (which already contains a lot of data) from latin1 to utf8, will the old character data consume 3 bytes or 1 byte per character? If I use ALTER and convert the data, will I have problems with the old data? I am sure that new data will be in utf8.

Abhay Kumar

1 Answer


First, you should try:

SET NAMES 'utf8';
SET CHARACTER SET utf8;

and then SELECT your row #6035 to verify whether the recorded data is intact and already encoded in UTF-8.
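
To tell whether the stored bytes are still recoverable, it can help to look at them directly instead of relying on how the client renders them. The following is only a minimal sketch: "users" is a placeholder, since the real table name is not given in the question, and the id is the one shown in the question's table.

```sql
-- Show the character sets currently used by the client, connection, and results.
SHOW VARIABLES LIKE 'character_set%';

-- Inspect the raw bytes stored for the affected row.
-- "users" is a hypothetical table name; substitute the real one.
SELECT fname,
       HEX(fname)         AS stored_bytes,
       LENGTH(fname)      AS byte_count,
       CHAR_LENGTH(fname) AS char_count
FROM users
WHERE id = 6035;
```

If HEX(fname) returns nothing but 3F bytes (3F is the ASCII code for ?), the question marks are what is actually stored and the original characters were lost at insert time. If it returns other byte values, the ???? may only be a display problem and the text may still be recoverable.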

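UTF-8 (unlike UTF-16) is backward compatible with ASCII: it uses 1 byte for ASCII characters and up to 4 bytes for other characters (unicode faq). Note that MySQL's utf8 character set stores at most 3 bytes per character, and that these are maximums, not fixed sizes, so existing ASCII text still takes 1 byte per character after conversion.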
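
As a rough illustration of the byte counts, the sketch below compares the same characters in both character sets. It uses hex literals with the _utf8 introducer so that the result does not depend on the client encoding; LENGTH() returns bytes and CHAR_LENGTH() returns characters. The column aliases are made up for readability.

```sql
-- x'61' is the UTF-8 encoding of "a"; x'C3A9' is the UTF-8 encoding of "é".
SELECT
    LENGTH(_utf8 x'61')                         AS ascii_bytes_utf8,
    LENGTH(CONVERT(_utf8 x'61' USING latin1))   AS ascii_bytes_latin1,
    LENGTH(_utf8 x'C3A9')                       AS e_acute_bytes_utf8,
    LENGTH(CONVERT(_utf8 x'C3A9' USING latin1)) AS e_acute_bytes_latin1,
    CHAR_LENGTH(_utf8 x'C3A9')                  AS e_acute_chars;
```

An ASCII letter occupies one byte in both character sets, while é occupies one byte in latin1 and two in UTF-8, so only the non-ASCII part of the old data grows when the table is converted.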

You should not convert your data if it is already stored in UTF-8 format.


Warning

  1. Try your ALTER TABLE on a backup first.
  2. ALTER TABLE locks the table against writes while it runs.
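
One way to follow the first point without touching the live table is to run the conversion on a copy. This is a sketch under assumptions: "users" and "users_backup" are placeholder names, and the copy needs enough free disk space to hold a second instance of the data.

```sql
-- Create an empty copy with the same structure (still latin1), then fill it.
-- "users" and "users_backup" are hypothetical names.
CREATE TABLE users_backup LIKE users;
INSERT INTO users_backup SELECT * FROM users;

-- Run the conversion on the copy only.
ALTER TABLE users_backup CONVERT TO CHARACTER SET utf8;

-- Compare the stored bytes before and after for a known row.
SELECT id, fname, HEX(fname) FROM users        WHERE id = 6035;
SELECT id, fname, HEX(fname) FROM users_backup WHERE id = 6035;
```

If the copy looks right after the conversion, the same ALTER TABLE can be scheduled for the real table at a time when blocked writes are acceptable.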
Guillaume USE
  • The DB is readable while ALTER TABLE is running, but updates and writes are not possible. I tried SET NAMES and SET CHARACTER SET, but the data is still displayed as ??? – Abhay Kumar Aug 08 '12 at 12:30
  • So I think your data cannot be converted to utf8, because it was inserted through a latin1 connection into a latin1 structure. – Guillaume USE Aug 08 '12 at 12:37
  • When I use the CONVERT TO CHARACTER SET and MODIFY commands, the conversion runs without any errors and works fine, though the old data that was not valid utf8 is not recovered. – Abhay Kumar Aug 09 '12 at 04:30