-1

I ran

SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%'; and the output is

+--------------------------+-------------------+
| Variable_name            | Value             |
+--------------------------+-------------------+
| character_set_client     | utf8              |
| character_set_connection | utf8              |
| character_set_database   | latin1            |
| character_set_filesystem | binary            |
| character_set_results    | utf8              |
| character_set_server     | latin1            |
| character_set_system     | utf8              |
| collation_connection     | utf8_general_ci   |
| collation_database       | latin1_swedish_ci |
| collation_server         | latin1_swedish_ci |
+--------------------------+-------------------+

Can someone please explain the exact usage of these settings? My understanding is following:

  1. character_set_client: Tells the encoding used by client for query encoding
  2. character_set_connection: Encoding used by server to convert query into
  3. character_set_database: encoding used for storing data in tables
  4. character_set_server: Default encoding if character_set_connection not specified
  5. character_set_results: Results are encoded in this format and returned.

Why are there are so many configs required? Couldn't client and server config be kept same always?

I am trying to perform an insert query which fails due to presence of ' in string. Is it recommended to escape it or encode the query? Also, how can I encode it in golang?

Volker
  • 40,468
  • 7
  • 81
  • 87
uzumas
  • 632
  • 1
  • 8
  • 23

1 Answers1

1

Yes, you must escape any strings being sent into MySQL. In particular at least, ', ", and \ must be escaped. Otherwise, do you think will happen when inserting the name O'Brian into a table thus?:

 INSERT INTO t (name) VALUES ('O'Brian');

Quoting and escaping are independent of character_set% values.

A column can have any desired CHARACTER SET.
A table can have any desired default CHARACTER SET.
A database can have any desired default CHARACTER SET. Note: This setting is useless if you are explicit when CREATEing columns or tables.
The server can have any desired default CHARACTER SET. Note: This setting is useless if you are explicit when CREATEing databases/tables/columns.

Meanwhile, the client can have any different CHARACTER SET. But the way to tell the server what encoding the client has is via 3 of those settings (client/connection/results). The can be set en masse via

SET NAMES utf8mb4;

However, it is usually "better" to establish the client's encoding via some parameter during the connection process. (The syntax for such varies from client to client -- Java/PHP/VB/etc.)

Why so many? It is probably too many. Virtually all practical cases are handled by changing that trio of settings; meanwhile, the other settings can be ignored.

One more thing: Stored routines have a charset and collation, and I think that comes from the database setting.

Rick James
  • 135,179
  • 13
  • 127
  • 222