After reading all the emoji topics on SO, I'm bound to ask for some help. My question seems to be almost the same as the others: I have an iPhone app that sends emoji via PHP to a MySQL DB. I can see some symbols in the records, for example "umbrella" and "cloud", but not others (angry face, smiling face, and so on).

Why do some work and others not?

  • MySQL collation: utf8mb4_unicode_ci
  • Table collation: utf8mb4_unicode_ci
  • Field (varchar) collation: utf8mb4_unicode_ci

PHP setup:

    mysql_query("SET CHARACTER SET utf8mb4");
    mysql_query("SET NAMES utf8mb4");

The symbols that are not stored correctly show up in the record as a question mark "?".

Mathias Bynens
Fabrizio

1 Answer

Some emoji are encoded in UTF-8 using at most 3 bytes per code point. If your computer supports emoji, here are the 3-byte emoji:

☺❤✨❕❔✊✌✋☝☀☔☁⛄⚡☎➿✂⚽⚾⛳♠♥♣♦〽☕⛪⛺⛲⛵✈⛽⚠♨1⃣2⃣3⃣4⃣5⃣6⃣7⃣8⃣9⃣0⃣#⃣⬆⬇⬅➡↗↖↘↙◀▶⏪⏩♿㊙㊗✳✴♈♉♊♋♌♍♎♏♐♑♒♓⛎⭕❌©®™

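The rest are encoded using 4 bytes and will not work unless every part of your MySQL setup uses utf8mb4, because MySQL's `utf8` character set stores at most 3 bytes per character. It sounds like some part of your setup was not fully upgraded to utf8mb4.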
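
To see the 3-byte versus 4-byte split for yourself, here is a minimal PHP sketch (assuming PHP 7.0 or later). The helper names `utf8_byte_lengths` and `needs_utf8mb4` are made up for this illustration and are not built-in functions. It only inspects strings and does not talk to MySQL:

```php
<?php
// Hypothetical helper: UTF-8 byte length of each code point in $text.
function utf8_byte_lengths(string $text): array
{
    // The /u modifier makes PCRE split on code points instead of bytes.
    // preg_split() returns false on invalid UTF-8, so fall back to [].
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY) ?: [];
    return array_map('strlen', $chars); // strlen() counts bytes
}

// Hypothetical helper: true when $text contains a code point above U+FFFF,
// which UTF-8 encodes in 4 bytes.
function needs_utf8mb4(string $text): bool
{
    return preg_match('/[\x{10000}-\x{10FFFF}]/u', $text) === 1;
}

// Unicode escapes keep the example independent of the file's own encoding.
$umbrella = "\u{2614}";  // UMBRELLA WITH RAIN DROPS
$angry    = "\u{1F620}"; // ANGRY FACE

var_dump(utf8_byte_lengths($umbrella . $angry)); // lengths 3 and 4
var_dump(needs_utf8mb4($umbrella));              // false
var_dump(needs_utf8mb4($angry));                 // true
```

A check like `needs_utf8mb4()` can help confirm that the symbols turning into "?" are exactly the 4-byte ones, which would point at a column, table, or connection still using the 3-byte `utf8` character set.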

Jake
  • Thank you very much Jake, I will search in your direction. Right now I'm pretty sure I converted the target fields and the table to the right collation, but I must have missed something. I will update this topic. Thanks – Fabrizio May 02 '12 at 19:41
  • I've verified it and you're right: the 3-byte emoji are stored correctly. What I'm missing is where I need to change the collation. The field where I must store the emoji has the utf8mb4_unicode_ci collation, and so does the table that contains it. In the General Settings (phpMyAdmin) I see that the MySQL collation is utf8_general_ci, and when I try to change it to utf8mb4_unicode_ci it seems to revert automatically to the previous setting. I don't know if this is the cause of the problem. I will keep searching. – Fabrizio May 03 '12 at 12:20
  • @Fabrizio I’ve written [a detailed guide on how to upgrade from `utf8` to `utf8mb4`](http://mathiasbynens.be/notes/mysql-utf8mb4), perhaps it helps you. – Mathias Bynens Aug 07 '12 at 06:53
  • What is the best version of 'utf8mb4' to use? In MySQL, you have many options for utf8mb4, such as `utf8mb4_bin`, `utf8mb4_general_ci`, `utf8mb4_unicode_ci`, `utf8mb4_roman_ci`, and many others. Any thoughts/ideas/suggestions on the differences in these that would help a developer decide which is best to pick for storing emojis in a MySQL database? – skcin7 Feb 19 '16 at 22:33
  • I'm just going to pick `utf8mb4_general_ci` for now, but I don't know any of the differences, so if there's a better one for me to pick, please provide insight. – skcin7 Feb 19 '16 at 22:34
  • The difference is in how MySQL sorts and compares the data. If you care that e, é, and ê are sorted absolutely correctly, use the slower `utf8mb4_unicode_ci`; if it doesn't matter, use the faster `utf8mb4_general_ci`. – Jake Feb 22 '16 at 17:23