3

I am inserting into a mysql database. I get the following error when trying to do the insert

Incorrect string value: '\xF0\x9F\x87\xB7\xF0\x9F...' for column 'field_4' at row 1

I thought I had figured out this error by simply changing the column encoding the to utf8mb4 and had tested but recently this error appeared again. I am using php to parse the string and run the following function before inserting...

function strip_emoji($subject) {
    if (is_array($subject)) {
        // Recursive strip for multidimensional array
        foreach ($subject as &$value) $value = $this->strip_emoji($value);
        return $subject;
    } else {
        // Match Emoticons
        $regexEmoticons = '/[\x{1F600}-\x{1F64F}]/u';
        $clean_text = preg_replace($regexEmoticons, '', $subject);

        // Match Miscellaneous Symbols and Pictographs
        $regexSymbols = '/[\x{1F300}-\x{1F5FF}]/u';
        $clean_text = preg_replace($regexSymbols, '', $clean_text);

        // Match Transport And Map Symbols
        $regexTransport = '/[\x{1F680}-\x{1F6FF}]/u';
        $clean_text = preg_replace($regexTransport, '', $clean_text);
        return 
}

There are several similar questions to this but I still have these errors. Any further advice on how to prevent this error? I realize that it is an emoji unicode character / sprite but not sure how to deal with it.

eric MC
  • 766
  • 2
  • 10
  • 36

2 Answers2

1

You are trying to insert a character that spans 4 bytes, so you have to convert the column to the utf8mb4 character set.

The utf8 character set is limited to characters that span 3 bytes (the Unicode characters U+0000 through U+FFFF).

Joni
  • 108,737
  • 14
  • 143
  • 193
0

Do you have utf8 charset for the connection as well?

Adding ";charset=utf8" in the PDO-connection string, or executing the query "set names utf8".

becquerel
  • 1,131
  • 7
  • 11
  • I'm using the codeigniter framework (doesn't really matter, just an FYI) and I currently have both the charset and db charset set to UTF8 with a collation of utf8_general_ci – eric MC Oct 07 '13 at 14:21