0

I have an "old" database (in utf 8) where I read and write on by using JDBC. Now, I must be able to also store emoji into a column of my table. I have changed the charset of involved columns to utf8mb4:

ALTER TABLE
myTable
CHANGE column_name column_name
longtext
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci
NOT NULL;

However, when I try to insert an emoticon into that column, I get the famous error

java.sql.SQLException: Incorrect string value: '\xF0\x9F\x91\x8D\xF0\x9F...'

Should I convert entire database, or am I doing something wrong?

Gord Thompson
  • 116,920
  • 32
  • 215
  • 418
giozh
  • 9,868
  • 30
  • 102
  • 183
  • Which version of MySQL Connector/J do you use, and what connection properties do you use? Have you set the MySQL server property character_set_server to utf8mb4 or not? – Mark Rotteveel Aug 16 '19 at 16:20
  • Please provide `SHOW CREATE TABLE myTable` -- I won't to verify that the column is declared correctly. – Rick James Aug 17 '19 at 05:52

1 Answers1

2

Need to connect with utf8mb4 to get , etc.

?useUnicode=yes&characterEncoding=UTF-8 in the getConnection() call.

As a fallback, execute SET NAMES utf8mb4 after connecting. (See Comment.)

"For Connector/J 8.0.12 and earlier: In order to use the utf8mb4 character set for the connection, the server MUST be configured with character_set_server=utf8mb4; if that is not the case, when UTF-8 is used for characterEncoding in the connection string, it will map to the MySQL character set name utf8, which is an alias for utf8mb3.'

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • The [MySQL Connector/J documentation](https://dev.mysql.com/doc/connector-j/8.0/en/connector-j-reference-charsets.html) explicitly says to not use `SET NAMES`: _"Warning: Do not issue the query SET NAMES with Connector/J, as the driver will not detect that the character set has been changed by the query, and will continue to use the character set configured when the connection was first set up."_ – Mark Rotteveel Aug 17 '19 at 05:46
  • Confirmed with MySQL Connector/J version 8.0.17. All that's required is `useUnicode=true&characterEncoding=utf8` in the connection URL. That was true even when the server's default character set was `latin1` and the database's default character set was `utf8` (not `utf8mb4`). – Gord Thompson Aug 17 '19 at 11:09