2

I'm using MySql version 5.6.46 on Linux. I have a column name varchar(50) COLLATE utf8mb4_bin and ENGINE=InnoDB DEFAULT CHARSET=utf8mb4.

When I tried to insert some data into the table. I find that,

for a (1 byte in utf8), it can store 50 maximum.

for (3 bytes in utf8, Chinese character love), it can store 50 maximum.

for (4 bytes in utf8, hex F09F9881), it can store 25 maximum.

This confuses me. Why Mysql is not treating one emoji as one character? If Mysql does the byte-count limit and improperly uses 3-byte-per-character, I'm expecting it can store 50*3/4=37. How on earth Mysql do the restriction?

----------UPDATE-------------

Thanks to your response, I figure it out. I am on MacOS X and I was using Sequel Pro 1.1.2. When I edit table content in the UI of the software, the maximum is 25 emoji and it toasts warning maximum text length is set to 50

Then I tried the raw hex approach on the server set name = X'F09F9881...F09F9881' and it can hold 50 emojis perfectly.

So this is a Sequel Pro issue. I will add Sequel Pro tag to this question. Hope this will help people who met the same issue. Thanks

matrix
  • 349
  • 3
  • 12
  • 2
    It's possibly a bug/limitation because it works correctly starting on [MySQL/8](https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=caee3b14e9ac7e09c057bd769e2756c7). – Álvaro González Jan 28 '21 at 10:04
  • Allows 50 but not 51 on MySQL5.7.28 – RiggsFolly Jan 28 '21 at 10:08
  • @RiggsFolly Funny. Not on [dbfiddle.uk](https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=caee3b14e9ac7e09c057bd769e2756c7) :-? – Álvaro González Jan 28 '21 at 10:18
  • @ÁlvaroGonzález I am on Windows, could that be making a difference? Just tried it on MySQL 5.6.46 and I can store 50 but not 51 on there also – RiggsFolly Jan 28 '21 at 10:22
  • 1
    https://dbfiddle.uk/?rdbms=mysql_8.0&rdbms2=mysql_5.6&fiddle=fe0a240e6982e95b3f6aa1dc8f12128b - the difference is obvious. May compare with 5.7... or MariaDB. – Akina Jan 28 '21 at 10:27
  • There is something wrong with dbfiddle.uk - `char_length()` returns a wrong result. Even for latin (`char_length('é')`). Nothing like that on [dg-fiddle.com](https://www.db-fiddle.com/f/crcLPPc8TJa3NedBCSV9Wh/0) – Paul Spiegel Jan 28 '21 at 10:42
  • @matrix, Please post your system specs (including OS with exact version) and post a result of `CHAR_LENGTH`. – Paul Spiegel Jan 28 '21 at 10:44
  • Seems like the same versions of MySQL on Linux and Windows give very different results ?? – RiggsFolly Jan 28 '21 at 10:57
  • Nope. Works for me on debian and windows the same. MySQL aswell as MariaDB. Only that doesn't work is dbfiddle.uk. Didn't test any 5.6 though. – Paul Spiegel Jan 28 '21 at 11:06

1 Answers1

1

No arithmetic needed.

varchar(50)

Holds 50 characters of any type. This will occupy up to 202 bytes (4 * 50 + 2 for a hidden length field).

To debug your situation, please provide:

SELECT VARIABLES LIKE 'char%';

SELECT col, HEX(col) FROM ...  -- to show what was stored.
Rick James
  • 135,179
  • 13
  • 127
  • 222
  • @matrix - `ucs2` / `utf16` woiuld take 2 bytes per emoji. Check the settings in sequel pro. `utf-8`/`utf8mb4` is the only reasonable charset to use these days. – Rick James Jan 29 '21 at 17:32