Edit: Since you want to keep HTML characters, and I now think your source string isn't quite what you posted (I think it is actual unicode, rather than containing \unnnn
as a string), I think your best option is this:
$html = str_replace( str_replace( str_replace( htmlentities( $whatever ), '<', '<' ), '>', '>' ), '&', '&' );
(note: no call to utf8-decode
)
Original answer:
There is no direct conversion. First, decode it again:
$decoded = utf8_decode( $whatever );
then encode as HTML:
$html = htmlentities( $decoded );
and of course you can do it without a variable:
$html = htmlentities( utf8_decode( $whatever ) );
http://php.net/manual/en/function.utf8-decode.php
http://php.net/manual/en/function.htmlentities.php
To do this by regular expression (not recommended, likely slower, less reliable), you can use the fact that HTML supports &#xnnnn;
constructs, where the nnnn
is the same as your existing \unnnn
values. So you can say:
$html = preg_replace( '/\\\\u([0-9a-f]{4})/i', '&#x$1;', $whatever )