
On my server, the database uses utf8mb4_unicode_ci. I'm writing an API to serve data as JSON. The PHP function json_encode only accepts UTF-8 strings.

I'm not able to build the full chain:

strings encoded in utf8mb4_unicode_ci => utf8 => json => API => JavaScript => strings encoded in utf8mb4_unicode_ci

For example, $str = "Linéaire 😀";

To go from utf8mb4_unicode_ci to utf8, I already tried the PHP functions utf8_encode($str) and mb_convert_encoding($str, 'UTF-8', 'Windows-1252'), which return respectively:

  • "Lin\u00c3\u00a9aire \u00f0\u009f\u0098\u0080"
  • "Lin\u00c3\u00a9aire \u00f0\u0178\u02dc\u20ac"

The two functions do not return the same result, and I don't know which one to choose. Furthermore, I don't know how to unescape the string on the client side in JavaScript.
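
One way to see what the two conversions did is to undo the first one on the client. The sketch below assumes the value fetched from the database was already valid UTF-8. Under that assumption, utf8_encode treated every byte as an ISO-8859-1 character, so each escaped character in the first result stands for one original byte. The variable names are illustrative only.

```javascript
// First result from the list above, written as the JSON text the API would send.
const viaLatin1 = JSON.parse('"Lin\\u00c3\\u00a9aire \\u00f0\\u009f\\u0098\\u0080"');

// ISO-8859-1 maps bytes 0x00-0xFF to U+0000-U+00FF one to one,
// so each character's code point is the original byte value.
const bytes = Uint8Array.from(viaLatin1, (ch) => ch.codePointAt(0));

// Decode those bytes as UTF-8 to recover the text as it was before utf8_encode.
const recovered = new TextDecoder('utf-8').decode(bytes);

console.log(recovered); // Linéaire 😀
```

If the recovered text matches the original string, the data was UTF-8 before any conversion, which would mean neither function is needed. The second result holds the same bytes read through the Windows-1252 table instead, which is why only the characters for bytes 0x80 to 0x9F differ.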

Fifi
  • The `utf8mb4_unicode_ci` is an internal MySQL UTF-8 encoding variant. It has nothing to do with the client interaction. Once you run the `SET NAMES 'utf8'` query, you get UTF-8 data from the DB. No additional manipulation is required. Just `print json_encode($sth->fetchAll());` and you're fine. This is what UTF-8 was created for, after all; forget about encoding conversions. And you don't need to unescape anything on the client side, just `let data = JSON.parse(raw_json);`. – Jared Jan 06 '23 at 13:23
  • @Jared If I do not convert the data to `utf8`, I can't use `json_encode()`. The function returns the `JSON_ERROR_UTF8` error. – Fifi Jan 06 '23 at 14:06
  • After connecting to the DBMS, run the `SET NAMES 'utf8'` query, and the DBMS will return UTF-8 data for the subsequent queries. – Jared Jan 06 '23 at 14:50
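
The client-side half of the comments above can be sketched as follows. This is a minimal example, assuming the server passes correctly encoded UTF-8 to json_encode, which by default escapes non-ASCII characters as `\uXXXX` (using a surrogate pair for the emoji). The `label` key and the variable names are hypothetical. For the emoji to reach PHP intact, the connection charset would likely need to be `utf8mb4`, since MySQL's `utf8` is an alias for the three-byte `utf8mb3`.

```javascript
// JSON text as json_encode would emit it by default for "Linéaire 😀".
const rawJson = '{"label":"Lin\\u00e9aire \\ud83d\\ude00"}';

// JSON.parse resolves the \u escapes itself; no manual unescaping is needed.
const data = JSON.parse(rawJson);
console.log(data.label);              // Linéaire 😀
console.log([...data.label].length);  // 10 code points

// Sending the value back: JSON.stringify produces JSON that parses to the same string.
const backToServer = JSON.stringify(data);
console.log(JSON.parse(backToServer).label === data.label); // true
```

Whether the server emits escapes or raw UTF-8 (PHP's `JSON_UNESCAPED_UNICODE` flag), `JSON.parse` yields the same JavaScript string.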
