3

I'm currently using TinyMCE as the HTML editor for users of my CMS. Somehow the euro symbol (€) is converted to %u20AC by IE (any version).

After a short search I found this. It lists a lot of different encodings for the UTF-8 euro symbol, but not %u20AC, with the percent sign.

I have sent the proper headers for UTF-8, so I guess IE is just being rude and doing things its own way...

Is there a PHP function that can catch this strange encoding and convert it to a normal HTML entity (hex, decimal or named)? I could just str_replace() this single problem symbol, but I'd rather fix all possible conflicts at once.

Or should I simply replace %u with &#x, which would disable normal usage of %u?

Gijs P

2 Answers

5

%u20AC is the Unicode escape for the euro sign (U+20AC), as generated by the JavaScript escape() function (MDN, ECMA-262). It needs to be converted to UTF-8 for server-side processing.

The standard PHP urldecode() cannot deal with it (it is a non-standard percent-encoding, see Wikipedia), so you need to use an extended routine:

/**
 * @param string $string Unicode-escaped (%uXXXX) and URL-encoded string
 * @return string decoded string
 */
function utf8_urldecode($string) {
    $string = preg_replace(
        "/%u([0-9a-f]{3,4})/i",
        "&#x\\1;",
        urldecode($string)
    );
    return html_entity_decode($string, ENT_XML1, 'UTF-8');
}
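
As a variation on the routine above, here is a minimal sketch that converts only the %uXXXX sequences straight to UTF-8 and leaves everything else in the string untouched (no URL decoding and no entity decoding). The function name `decode_percent_u` is made up for this illustration, and the sketch assumes the mbstring extension is available and PHP 7.0 or later. Matching runs of escapes keeps UTF-16 surrogate pairs together, which escape() emits for characters outside the Basic Multilingual Plane.

```php
<?php
/**
 * Hypothetical helper (not part of PHP or TinyMCE): decode the non-standard
 * %uXXXX escapes produced by JavaScript's escape() into UTF-8.
 * Assumes the mbstring extension is loaded.
 *
 * @param string $input string that may contain %uXXXX escapes
 * @return string the same string with those escapes replaced by UTF-8 text
 */
function decode_percent_u(string $input): string
{
    return preg_replace_callback(
        // One or more consecutive %uXXXX units, so surrogate pairs stay together.
        '/(?:%u[0-9a-f]{4})+/i',
        static function (array $match): string {
            // Drop the "%u" markers; what remains is UTF-16BE as hex digits.
            $hex = str_ireplace('%u', '', $match[0]);
            return mb_convert_encoding(hex2bin($hex), 'UTF-8', 'UTF-16BE');
        },
        $input
    );
}

// Example: the euro sign from the question.
echo decode_percent_u('Price: %u20AC 5'), "\n";
```

Because nothing but the %uXXXX units is rewritten, existing entities such as `&amp;` and ordinary percent signs in the text pass through unchanged. Whether that is preferable to the entity-based routine depends on what the rest of the pipeline expects.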

Also check if you can configure this behaviour for your TinyMCE.


hakre
  • Thank you very much! Any idea why this doesn't happen in Firefox? – Gijs P Apr 06 '12 at 12:29
  • @Webscrabbler: Nope. It's probably related to TinyMCE; I have no clue why TinyMCE wants to escape that character anyway. – hakre Apr 06 '12 at 23:30
  • The function works, but I'm getting 'html_entity_decode(): Passing null to parameter #2 ($flags) of type int is deprecated'. How to fix that? – Jeroen Steen Jul 18 '22 at 19:34
  • 1
    @JeroenSteen: Thanks for asking. Basically, use the correct default value of the `$flags` parameter, compare https://www.php.net/html_entity_decode . This is just a pretty old answer and it hasn't seen `declare(strict_types=1);` nor anything from PHP 8. Will update the answer for PHP 5.4+ and make it compatible with _strict types_. – hakre Jul 18 '22 at 20:33
0

20AC is the hex code of the euro sign, so you can solve this problem easily in your HTML file: instead of using €, try to use the code &#x20AC;.

Chlebta
  • My customers use the TinyMCE platform and don't know what Unicode is. They need to be able to simply use € in IE. – Gijs P Apr 06 '12 at 11:02
  • Sorry, but I don't know another way. Try checking the TinyMCE menu; you will probably find it there. – Chlebta Apr 06 '12 at 11:06