0

I have connected to a MySQL database from R and ran the "set names utf8" directive.

library(RMySQL)
con<-dbConnect(MySQL(), user='john', password='secret', dbname='awesome', host='myhost')
dbSendQuery(con, "SET NAMES UTF8; ")

This usually does the trick when I want UTF8 data back from my queries. However a quick check shows that not all variables have been set correctly

dbGetQuery(con, "SHOW VARIABLES LIKE 'character_set_%'")
#             Variable_name                      Value
#1     character_set_client                     latin1
#2 character_set_connection                     latin1
#3   character_set_database                       utf8
#4 character_set_filesystem                     binary
#5    character_set_results                     latin1
#6     character_set_server                     latin1
#7     character_set_system                       utf8

Here it's obvious that the character_set_client variable has not been set. It doesn't help to set it explicitly either. In fact it seems like it completely ignores it.

dbSendQuery(con, "SET character_set_client=utf8; ")
dbGetQuery(con, "SHOW VARIABLES LIKE 'character_set_%'")
#             Variable_name                      Value
#1     character_set_client                     latin1
#2 character_set_connection                     latin1
#3   character_set_database                       utf8
#4 character_set_filesystem                     binary
#5    character_set_results                     latin1
#6     character_set_server                     latin1
#7     character_set_system                       utf8

How can I set this and make the connection return UTF8 encoded text? The schema is using the utf8 encoding and collation.

Dr. Mike
  • 2,451
  • 4
  • 24
  • 36

1 Answers1

1

There's a handshake that happens, and the client ends up telling the MySQL to use latin1. It is possible to disable the handshake, or on initial connection SET NAMES utf8.

See: http://dev.mysql.com/doc/refman/5.6/en/server-options.html#option_mysqld_character-set-client-handshake

Morgan Tocker
  • 3,370
  • 25
  • 36
  • Thanks for your answer and the clarification of why this happens. However, as you can see I am issuing 'SET NAMES utf8' on initial connection. But that does not help... Can I disable the handshake from R? – Dr. Mike Jul 08 '14 at 11:06
  • Disabling the handshake is a server option: `skip-character-set-client-handshake` – Morgan Tocker Jul 08 '14 at 15:35