Test string:

$s = "convert this: ";
$s .= "–, —, †, ‡, •, ≤, ≥, μ, ₪, ©, ® y ™, ⅓, ⅔, ⅛, ⅜, ⅝, ⅞, ™, Ω, ℮, ∑, ⌂, ♀, ♂ ";
$s .= "but, not convert ordinary characters to entities";
texai
  • But what for? This shouldn't be necessary if the document is properly encoded. – Pekka Feb 25 '11 at 23:01
  • @Pekka: My problem isn't rendering the data; my problem is storing it. I can't change the db structure or the field configuration. – texai Feb 25 '11 at 23:21
  • If your database can't store non-ASCII characters, you need to fix the database, not kludge your data into some ad-hoc encoded format. Keep database strings in raw form. – bobince Feb 26 '11 at 09:47

3 Answers

$encoded = mb_convert_encoding($s, 'HTML-ENTITIES', 'UTF-8'); 

Assuming your input string is UTF-8, this should encode most everything into numeric entities.
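
For what it's worth, the 'HTML-ENTITIES' target of mb_convert_encoding is deprecated as of PHP 8.2. A minimal sketch of the same idea, swapping in mb_encode_numericentity (not the call shown above), might look like the following. The shortened test string and the $map range are my own assumptions, chosen so that only code points above ASCII are rewritten.

```php
<?php
// Shortened version of the question's test string (assumption: UTF-8 source file).
$s = "convert this: –, ©, ™ but not ordinary characters";

// Conversion map: [start, end, offset, mask].
// 0x80..0x10FFFF covers every non-ASCII code point, so plain ASCII is left alone.
$map = [0x80, 0x10FFFF, 0, 0x1FFFFF];

// Encode non-ASCII characters as decimal numeric entities (&#NNNN;).
$encoded = mb_encode_numericentity($s, $map, 'UTF-8');
echo $encoded, PHP_EOL;
// convert this: &#8211;, &#169;, &#8482; but not ordinary characters

// The same map reverses the conversion when the data is read back.
$decoded = mb_decode_numericentity($encoded, $map, 'UTF-8');
```

Because the map starts at 0x80, characters such as `&` and `<` pass through unchanged, so this is only a storage workaround and not a substitute for escaping output.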

Marc B

Well, htmlentities doesn't work correctly here. Fortunately, someone has posted code on the PHP website that seems to translate multibyte characters properly.

Byron Whitlock

I did some work on converting ASCII into HTML-coded text (&#xxxx): https://github.com/hellonearthis/ascii2web

Hellonearthis