I am trying to convert HTML entities from a source string to their literal character equivalent.

For example:

<?php

$string = "Hello &#8211; World";
$converted = html_entity_decode($string);

?>

Whilst this correctly converts the entity on screen, the HTML source still shows the explicit entity. I need the entity replaced by the literal character in the string itself, as I am not using the string within an HTML page.

Any ideas on what I am doing wrong?

FYI I am sending the converted string to Apple's Push notification service:

$payload['aps'] = array('alert' => $converted, 'badge' => 1, 'sound' => 'default');
$payload = json_encode($payload);
animuson
mootymoots

2 Answers

&#8211; is U+2013 (the en dash), which ISO-8859-1, the default charset of html_entity_decode() before PHP 5.4, cannot represent, so you need to specify UTF-8 as the character encoding:

$converted = html_entity_decode($string, ENT_COMPAT, 'UTF-8');
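A minimal sketch of how one might confirm the result without looking at a browser at all. It assumes the same `$string` as in the question; the `$dash` variable is only introduced here for illustration and holds the three UTF-8 bytes of U+2013.

```php
<?php
$string = "Hello &#8211; World";

// Decode the entity into UTF-8 bytes by naming the charset explicitly.
$converted = html_entity_decode($string, ENT_COMPAT, 'UTF-8');

// U+2013 (en dash) is encoded in UTF-8 as the bytes E2 80 93.
$dash = "\xE2\x80\x93";

// Compare against the expected raw string instead of eyeballing rendered HTML.
var_dump($converted === "Hello {$dash} World"); // bool(true)
var_dump(strlen($converted));                   // int(15): 12 ASCII bytes + 3 for the dash
```

If the comparison comes back false, the entity was most likely not decoded, or the string was altered somewhere else before this point.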
BoltClock
  • I still get the entity when I view source on that one...? – mootymoots Jan 09 '11 at 10:12
  • @mootymoots: I tested it, I got the raw character instead of the entity. Wonder what else could be causing it... the HTML document's encoding perhaps? – BoltClock Jan 09 '11 at 10:13
  • it's converted on the page - but not in the source...? Looking in chrome – mootymoots Jan 09 '11 at 10:16
  • Just to add, the PHP is sending it via json_encode to Apple, it's not actually needed to be viewed in browser, it's just helping me debug. It comes through as the entity on the device. – mootymoots Jan 09 '11 at 10:17
  • Scratch that comment, you're using APNS. So that means your alert view is displaying `&#8211;` as well, right? – BoltClock Jan 09 '11 at 10:19
  • exactly :) That's what I'm trying to fix :) – mootymoots Jan 09 '11 at 10:19
  • @mootymoots: What's the output of `$payload` as JSON? – BoltClock Jan 09 '11 at 10:21
  • {"aps":{"alert":"Hello &#8211; World","badge":1,"sound":"default"}} – mootymoots Jan 09 '11 at 10:23
  • Wow, that's very strange. That string is supposed to read `"Hello \u2013 World"`. Any chance you might be overwriting `$converted` somehow between the decoding and the array? Maybe posting the full script would help... unless that's your full script. – BoltClock Jan 09 '11 at 10:26
  • It's fixed. I was using htmlentities() on the source string, as without it (in the browser) things went mental. When I removed that and sent to Apple it works fine, just terrible in a browser :) – mootymoots Jan 09 '11 at 10:30
  • And it was only mental in a browser because I didn't use the right charset for the HTML page... damn! – mootymoots Jan 09 '11 at 10:31
  • @mootymoots: There's the problem :) Glad you got it sorted. – BoltClock Jan 09 '11 at 10:34
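
The cause found in the comments can be reproduced in a few lines. This is only a sketch: the thread never shows where htmlentities() was called, so the variable names `$source`, `$escaped` and `$stillEntity` and the order of the calls are assumptions made for illustration.

```php
<?php
$source = "Hello &#8211; World";

// htmlentities() escapes the ampersand, so the entity becomes "&amp;#8211;".
$escaped = htmlentities($source, ENT_COMPAT, 'UTF-8');

// A single decode pass only undoes that escaping and leaves the original entity behind.
$stillEntity = html_entity_decode($escaped, ENT_COMPAT, 'UTF-8');

// Decoding the untouched source yields the real character.
$converted = html_entity_decode($source, ENT_COMPAT, 'UTF-8');

echo json_encode(array('alert' => $stillEntity)), "\n"; // {"alert":"Hello &#8211; World"}
echo json_encode(array('alert' => $converted)), "\n";   // {"alert":"Hello \u2013 World"}

// "\u2013" is JSON's escape for the same character; decoding restores the raw bytes.
$roundTrip = json_decode(json_encode(array('alert' => $converted)), true);
var_dump($roundTrip['alert'] === $converted); // bool(true)
```

Seeing `\u2013` in the payload is therefore the expected outcome, while a literal `&#8211;` in the JSON suggests the string was entity-encoded once more than it was decoded.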

Try declaring the charset in the page as well as in the decode call:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/> 
<?php
$string = "Hello &#8211; World";
$converted = html_entity_decode($string , ENT_COMPAT, 'UTF-8');
echo $converted;
?>

This should work, and the character should also appear converted in the page source.
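
Because the browser's charset handling can mislead when debugging, it may help to inspect the string directly. The sketch below uses a hypothetical helper, `describeString()`, which is not part of PHP or of this thread; it reports whether the bytes are valid UTF-8, whether something that looks like an entity remains, and the raw bytes in hex.

```php
<?php
// Hypothetical debugging helper: describe a string without relying on a browser.
function describeString($s)
{
    return array(
        // With the /u modifier, preg_match() fails on malformed UTF-8 input.
        'valid_utf8' => preg_match('//u', $s) === 1,
        // Rough check for a leftover named or numeric entity such as "&#8211;".
        'has_entity' => preg_match('/&#?\w+;/', $s) === 1,
        // Raw bytes, independent of any page or terminal encoding.
        'hex'        => bin2hex($s),
    );
}

$converted = html_entity_decode("Hello &#8211; World", ENT_COMPAT, 'UTF-8');
print_r(describeString($converted));
```

For the decoded string, the hex dump should contain `e28093`, the UTF-8 bytes of the en dash, and `has_entity` should be false.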

mr.Shu