I am trying to change the character set from latin1 to utf8.

Problem: Passwords containing French (accented) characters no longer work. Passwords containing other special characters (quotes, brackets, dollar sign, etc.) work fine. If I switch the character set in the code back to latin1, I can log in with French characters, but not with utf8.

What have I done so far:

  • Changed the character set of the database; all columns now show as utf8. I ran the query at both the database and the table level.
  • Changed the character set in the code to utf8.
  • My testing shows everything else works: accented French characters display fine and nothing seems broken. Only passwords are giving me issues.

Please suggest:

  • Do I need to convert the data itself to utf8 as well?
  • I ran the ALTER TABLE command, and it changed the column character set to utf8. Am I missing something here?

I suspect this may be the cause because the passwords work fine if I switch the code back to latin1. My thinking is that while both the code and the database were latin1, the special characters were recognized, but after the change to utf8 the accented French letters can no longer be interpreted because they were originally stored as latin1.
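
The suspicion above can be checked directly. The following is a minimal sketch, not the site's actual code: it writes "é" as explicit bytes in both encodings and compares them. Plain md5() is used only because the question mentions it; the variable names are illustrative.

```php
<?php
// "é" written as explicit bytes so the example does not depend on
// the encoding this file happens to be saved in.
$utf8   = "\xC3\xA9";  // é encoded as UTF-8 (two bytes)
$latin1 = "\xE9";      // é encoded as ISO-8859-1 / latin1 (one byte)

// Both render as the same letter, but the byte sequences differ...
$sameBytes = ($utf8 === $latin1);            // false
$lengths   = [strlen($utf8), strlen($latin1)]; // [2, 1]

// ...so a hash computed over those bytes differs as well.
$sameHash = (md5($utf8) === md5($latin1));   // false

// ASCII characters have identical bytes in both encodings,
// which is why quotes, brackets and dollar signs keep working.
$asciiHashStable = (md5('$"[') === md5('$"['));  // true

var_dump($sameBytes, $lengths, $sameHash, $asciiHashStable);
```

If the stored hashes were computed from latin1 bytes, changing the column character set does not alter them; what changes is the byte sequence the code feeds into the hash at login time.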

Both PHP and MySQL are at their latest versions.

Since my response was long, I decided to add it here:

The hashing is fairly complex: it uses a combination of md5, base64 encoding, and the crypt function. I have noticed that the resulting hash is different for latin1 and UTF-8 input. That is why I suspect that hashes generated under latin1 still match when the code uses latin1, but not after the switch to UTF-8. Again, this only happens for French letters, not for the ASCII range 0 to 127. I am not sure how to handle this so that existing users can still log in once the character set is changed to UTF-8. I can't use iconv(), as I have no way to tell whether a given password was created under latin1 or UTF-8. Do I need to change the data too, in addition to changing the database, and how? If I am thinking right, would converting the data to UTF-8 take care of the French characters as well?
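
One possible way around not knowing which encoding a hash was created under is to try both at login. The sketch below makes several assumptions: `legacy_hash()` is a hypothetical stand-in for the real md5/base64/crypt routine, `match_password()` is an illustrative name, and the old data is assumed to be ISO-8859-1 (it may in fact be Windows-1252, in which case the encoding name would need to change).

```php
<?php
// Hypothetical stand-in for the site's real hashing routine.
function legacy_hash(string $passwordBytes): string
{
    return base64_encode(md5($passwordBytes, true));
}

/**
 * Compare a password received as UTF-8 against a stored hash that may
 * have been computed from either UTF-8 or latin1 bytes.
 * Returns 'utf8', 'latin1', or null when neither matches.
 */
function match_password(string $utf8Password, string $storedHash): ?string
{
    if (hash_equals($storedHash, legacy_hash($utf8Password))) {
        return 'utf8';
    }

    // Older accounts: retry with the latin1 bytes of the same text.
    // iconv() can fail for characters that latin1 cannot represent,
    // so the result is checked before it is used.
    $latin1 = @iconv('UTF-8', 'ISO-8859-1', $utf8Password);
    if ($latin1 !== false && hash_equals($storedHash, legacy_hash($latin1))) {
        return 'latin1';
    }

    return null;
}

// Hash stored while the site still ran latin1 ("café" as latin1 bytes).
$oldHash = legacy_hash("caf\xE9");

// The same user now submits "café" as UTF-8.
$result = match_password("caf\xC3\xA9", $oldHash);
$wrong  = match_password("cafe", $oldHash);

var_dump($result, $wrong);
```

When the result is 'latin1', the login code could rehash the submitted UTF-8 password and store the new hash, so accounts migrate gradually as users log in.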

Nasir
  • Can it be that your hashing function generates different hashes from latin1 and utf8 character sets? So you should probably rehash all the passwords. If that's even possible. – Lauri Elias Nov 22 '13 at 22:28
  • Update the code and database charset? – Anthony Nov 22 '13 at 22:29
  • If your current hashes are of the Latin-1 (probably actually Windows code page 1252) representations of a password, then yes you would have to convert to Latin-1 before matching, and that means you would never be able to have a password with a non-Latin-1 character in. In this case you should detect an attempt to use one when the password is set, and give a suitable error message, instead of making an unusable account. If this approach isn't acceptable you're going to have to do a password migration process (probably a good idea anyway so you can move to bcrypt et al) – bobince Nov 25 '13 at 13:02

1 Answer

If you need to convert a string from one character encoding to another, you can use this function:

iconv()
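
As a rough illustration of how iconv() is called (source encoding, target encoding, string), here is a short sketch. The sample strings are made up, and ISO-8859-1 is assumed to be what "latin1" means for this data.

```php
<?php
// "français" as latin1 bytes (ç is the single byte 0xE7).
$latin1 = "fran\xE7ais";

// latin1 -> UTF-8: ç becomes the two bytes 0xC3 0xA7.
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);

// UTF-8 -> latin1: gives back the original bytes.
$back = iconv('UTF-8', 'ISO-8859-1', $utf8);

var_dump(bin2hex($latin1), bin2hex($utf8), $back === $latin1);

// A character outside latin1, such as the euro sign (U+20AC), has no
// target byte. With common iconv implementations the call fails and
// returns false, so the result should be checked before use.
$euro = @iconv('UTF-8', 'ISO-8859-1', "\xE2\x82\xAC");
if ($euro === false) {
    echo "not representable in latin1\n";
}
```

Note that iconv() only converts between encodings you name explicitly; it cannot tell which encoding a stored hash was originally computed from.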