3

In C++, it's possible to create a UTF-8 string using this kind of notation: "\uD840\uDC50".

However, this doesn't work in PHP. Is there a similar notation?

If not, is there any built-in way to create a UTF-8 string knowing its Unicode code point?

laurent

2 Answers

10

I've ended up implementing it like this:

$utf8 = html_entity_decode("&#x4E00;", ENT_COMPAT, 'UTF-8');
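To go from an integer code point rather than a hard-coded entity, the same call can be wrapped in a small helper. This is only a sketch: the function name codepointToUtf8 is hypothetical and not part of the answer, and it assumes the code point is one that html_entity_decode accepts for the default document type.

```php
<?php
// Hypothetical helper (not from the original answer): builds a hexadecimal
// numeric character reference from an integer code point and decodes it
// to a UTF-8 string.
function codepointToUtf8(int $codepoint): string
{
    $entity = '&#x' . dechex($codepoint) . ';';
    return html_entity_decode($entity, ENT_COMPAT, 'UTF-8');
}

// U+20050 is the code point that the surrogate pair \uD840\uDC50 stands for.
echo bin2hex(codepointToUtf8(0x20050)), "\n"; // prints "f0a08190"
```

bin2hex is used only to make the resulting bytes visible; echoing the string directly works on a UTF-8 terminal.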
laurent
  • use ENT_QUOTES | ENT_COMPAT to convert quotes as well – E Ciotti Mar 26 '16 at 21:23
  • This has limitations and will not work with all UTF-8 chars, as not all hex chars are supported in the HTML standard. See https://www.ascii.cl/htmlcodes.htm ("not defined in HTML 4 standard") – digitaldonkey Mar 20 '18 at 10:27
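If the entity route is a concern, newer PHP versions offer two alternatives that take the code point directly. The sketch below assumes PHP 7.0 or later for the "\u{...}" escape and PHP 7.2 or later with the mbstring extension for mb_chr; neither is mentioned in the answer above.

```php
<?php
// Assumes PHP 7.0+ for "\u{...}" and PHP 7.2+ with mbstring for mb_chr().
$fromEscape = "\u{20050}";               // code point escape in a double-quoted string
$fromMbChr  = mb_chr(0x20050, 'UTF-8');  // code point passed as an integer

var_dump($fromEscape === $fromMbChr);    // bool(true)
```

The escape only works for code points known when the code is written, while mb_chr also covers values computed at run time.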
2
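// Convert one literal "\xNN" escape (backslash, x, two hex digits) to the byte it names.
function hexToString($str) { return chr(hexdec(substr($str, 2))); }
// Replace every "\xNN" escape in $str; the escapes must spell out the UTF-8 bytes, not the code point.
$result = preg_replace_callback('/\\\\x[0-9a-f]{2}/i', function ($m) { return hexToString($m[0]); }, $str);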
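For illustration, here is how this approach behaves on a sample input. The input string is an assumption made for the example, and the pattern is a hex-only variant of the one above; since \xNN names a single byte, the input has to contain the UTF-8 encoding of the character (E4 B8 80 for U+4E00).

```php
<?php
// Converts one literal "\xNN" escape to the byte it names.
function hexToString($str) { return chr(hexdec(substr($str, 2))); }

// Single quotes keep the backslashes literal, so $str holds the text \xE4\xB8\x80.
$str = '\xE4\xB8\x80';

$result = preg_replace_callback('/\\\\x[0-9a-f]{2}/i', function ($m) {
    return hexToString($m[0]);
}, $str);

echo bin2hex($result), "\n"; // prints "e4b880", the UTF-8 bytes of U+4E00
```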
Jiri Zachar