I'm migrating an old MySQL database to a new one, so I wrote a script that connects to the old database, reads it, and creates the corresponding entities in the new database. Here it is:

require 'mysql2'

# Connect to the old database, asking for a utf8 connection
old_db = Mysql2::Client.new(host: options[:db_host],
                            username: options[:db_user],
                            password: options[:db_password],
                            database: options[:db_name],
                            encoding: 'utf8')
# Read categories top-down and recreate each one in the new database
old_categories = old_db.query('select id, title from catalog__category order by lvl asc')
old_categories.each do |old_c|
  c = Catalog::Category.new
  c.name = old_c["title"]
  c.save!
end

However, after the migration the category names came out garbled. Both databases are encoded in utf8, and both the client and the server are set to utf8:

mysql> show variables like "%character%";
+--------------------------+------------------------------------------------------+
| Variable_name            | Value                                                |
+--------------------------+------------------------------------------------------+
| character_set_client     | utf8                                                 |
| character_set_connection | utf8                                                 |
| character_set_database   | utf8                                                 |
| character_set_filesystem | binary                                               |
| character_set_results    | utf8                                                 |
| character_set_server     | utf8                                                 |
| character_set_system     | utf8                                                 |
| character_sets_dir       | /usr/local/Cellar/mysql/5.5.27/share/mysql/charsets/ |
+--------------------------+------------------------------------------------------+

The PHP project uses the old database and shows all strings correctly. The Rails project uses the new database and shows everything correctly except the imported category strings.

Does anyone know where the problem is and how to fix it?

Thank you.

Pavel S
It appears that the PHP project (Doctrine 2) stores strings in some weird format. The problem is still there because I don't know how to convert them to normal utf8 strings – Pavel S Nov 18 '12 at 17:51

1 Answer

The root problem was bad encoding in the database itself (it was switched from latin1 to utf8 on the fly some time ago). As a result, the strings were encoded twice in the dump.

mysqldump --default-character-set=latin1 --skip-set-charset -u <user> -p <database> > b.sql

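This command helped generate a correct dump. Dumping with latin1 as the connection character set makes MySQL convert the stored text back to latin1, which undoes the extra layer of encoding, and --skip-set-charset leaves the SET NAMES statement out of the dump so the file can then be loaded as utf8.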
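
For anyone who would rather repair the strings inside the migration script than re-dump, here is a minimal sketch of the same idea in plain Ruby. It assumes the damage is exactly one extra latin1-to-utf8 pass (MySQL's latin1 is essentially Windows-1252); the helper names double_encode and repair_double_encoding are hypothetical, not part of mysql2 or Rails. I have not run it against the original data, so treat it as an illustration only.

```ruby
# Simulate the damage: UTF-8 bytes are misread as Windows-1252
# and converted to UTF-8 a second time.
def double_encode(str)
  str.dup.force_encoding('Windows-1252').encode('UTF-8')
end

# Undo one layer: map each character back to its Windows-1252 byte,
# then reinterpret those bytes as UTF-8. If the conversion fails or
# the result is not valid UTF-8, assume the string was not
# double-encoded and return it unchanged.
def repair_double_encoding(str)
  repaired = str.encode('Windows-1252').force_encoding('UTF-8')
  repaired.valid_encoding? ? repaired : str
rescue EncodingError
  str
end

original = "caf\u00E9"            # "café"
mangled  = double_encode(original)
puts mangled                       # the garbled form seen after migration
puts repair_double_encoding(mangled)
```

In the script from the question this would be applied as c.name = repair_double_encoding(old_c["title"]). A few bytes have no Windows-1252 mapping in Ruby, so some strings may fall through the rescue branch unrepaired; the mysqldump approach above does not have that limitation.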

Pavel S