3

I am inserting strings into my database but getting the MySQL 1366 error for invalid string byte sequences.

2016/11/04 13:33:40 Error 1366: Incorrect string value: '\x89PNG\x0D\x0A...' for column 'text' at row 1
2016/11/04 13:33:56 Error 1366: Incorrect string value: '\xB6\xEB\xE4\x0B\x92\xEE...' for column 'text' at row 1
2016/11/04 13:33:56 Error 1366: Incorrect string value: '\xFF\xD8\xFF\xE0\x00\x10...' for column 'text' at row 1
2016/11/04 13:34:35 Error 1366: Incorrect string value: '\x9C]\x91\xD1k\xC2...' for column 'text' at row 1

My MySQL config is set for utf8mb4 as shown below:

mysql> SHOW VARIABLES LIKE 'character_set%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8mb4                    |
| character_set_connection | utf8mb4                    |
| character_set_database   | utf8mb4                    |
| character_set_filesystem | binary                     |
| character_set_results    | utf8mb4                    |
| character_set_server     | utf8mb4                    |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

My database connection pool looks like this:

db, err = sql.Open("mysql", config.User+":"+config.Password+"@tcp("+config.Host+")/"+config.Database)
if err != nil {
    log.Fatal(err)
}

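// Log the error instead of discarding it, so a rejected statement is visible.
if _, err := db.Exec("SET NAMES 'utf8mb4'; SET CHARACTER SET utf8mb4;"); err != nil {
    log.Println(err)
}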

What am I still missing?

Xeoncross
  • I had the same issue and it turned out that, somehow, control characters were being introduced into the string (in my particular case, when copying _formatted_ text from MS-Word into an input field of my Web page). In my case, the solution was to filter out control characters at the client side. – FDavidov Nov 04 '16 at 17:53

2 Answers

7

Those are not valid UTF-8 strings; they are binary data (the first is a PNG file!). You'll need to store them in a real binary column, because MySQL performs UTF-8-specific operations on text columns, such as case folding and language collation, and so rejects bytes that are not valid in the column's character set. (Go does not enforce UTF-8 encoding on strings, so Go doesn't complain. Go source is UTF-8, so string literals are UTF-8 encoded unless a \x escape sequence inserts arbitrary bytes. And of course, range, []rune conversion, and various packages assume strings are UTF-8.)

You can check if a string is a valid sequence with utf8.ValidString().
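As a minimal sketch of that check, the snippet below runs utf8.ValidString over two of the byte prefixes from the error log and over an ordinary string. The helper name isValidText is made up for illustration; only utf8.ValidString comes from the standard library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// isValidText is a hypothetical helper: it reports whether s is well-formed
// UTF-8 and could therefore go into a utf8mb4 text column.
func isValidText(s string) bool {
	return utf8.ValidString(s)
}

func main() {
	png := "\x89PNG\x0D\x0A"           // prefix from the first error line
	jpeg := "\xFF\xD8\xFF\xE0\x00\x10" // prefix from the third error line
	text := "héllo, 世界"

	fmt.Println(isValidText(png))  // false: 0x89 is a continuation byte with no lead byte
	fmt.Println(isValidText(jpeg)) // false: 0xFF never occurs in UTF-8
	fmt.Println(isValidText(text)) // true
}
```

Running a check like this before the INSERT lets you reject or reroute binary payloads instead of waiting for MySQL to return error 1366.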

Xeoncross
andlabs
2

Use a BLOB (maybe MEDIUMBLOB) datatype for the column containing images. With a TEXT column, MySQL checks every value against the column's character set. A PNG is not validly encoded utf8, hence the errors.
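One way this could look from the Go side is sketched below, assuming a table named items with the existing text column plus a new MEDIUMBLOB column named data. The table name, the data column, the execer interface, and the recorder stand-in are all hypothetical, and sending only non-UTF-8 payloads to the blob column is an illustrative design choice, not something the question prescribes.

```go
package main

import (
	"database/sql"
	"fmt"
	"unicode/utf8"
)

// Hypothetical schema: items(`text` TEXT, `data` MEDIUMBLOB).
const (
	insertTextSQL = "INSERT INTO items (`text`) VALUES (?)"
	insertBlobSQL = "INSERT INTO items (`data`) VALUES (?)"
)

// execer is the part of *sql.DB (or *sql.Tx) this sketch needs.
type execer interface {
	Exec(query string, args ...interface{}) (sql.Result, error)
}

// storePayload sends valid UTF-8 to the text column and any other bytes to
// the blob column, passing them as []byte so they are stored unmodified.
func storePayload(db execer, payload []byte) error {
	if utf8.Valid(payload) {
		_, err := db.Exec(insertTextSQL, string(payload))
		return err
	}
	_, err := db.Exec(insertBlobSQL, payload)
	return err
}

// recorder stands in for a real connection so the sketch runs without a server.
type recorder struct{ queries []string }

func (r *recorder) Exec(query string, args ...interface{}) (sql.Result, error) {
	r.queries = append(r.queries, query)
	return nil, nil
}

func main() {
	rec := &recorder{}
	for _, p := range [][]byte{[]byte("plain text"), []byte("\x89PNG\x0D\x0A")} {
		if err := storePayload(rec, p); err != nil {
			fmt.Println("insert failed:", err)
		}
	}
	for _, q := range rec.queries {
		fmt.Println(q)
	}
}
```

With a real connection you would pass the *sql.DB from sql.Open in place of the recorder. If the column only ever holds files, skipping the check and always writing to the blob column is simpler.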

The rest of your usage of utf8mb4 is probably fine.

Rick James