6

I would like to count the length of a string with PHP. The string contains numeric HTML entities, which inflate the character count: a dash is written as `&#8211;`, which is counted as 7 characters when I want it to count as 1.

How do I convert the numeric HTML entities so that each special character is counted with a length of 1?

Example string:

Goth-Trad &#8211; &#8216;Cosmos&#8217;

The code:

$string = html_entity_decode('Goth-Trad &#8211; &#8216;Cosmos&#8217;');
echo strlen($string);

produces '38', when I'm looking for '20'. What is going wrong?

Squrler
  • Even though you used the `htmlentities` tag, didn't you see the *See Also* section of http://php.net/htmlentities? – Shiplu Mokaddim Jan 02 '12 at 14:05
  • Unfortunately, the documentation did not give me the result I was looking for, which is why I'm posting the question on SO. I would appreciate it if you didn't immediately downvote without knowing the background of the question. – Squrler Jan 02 '12 at 14:11

3 Answers

5

You can use this:

$html = 'Goth-Trad &#8211; &#8216;Cosmos&#8217;';
echo strlen(utf8_decode(html_entity_decode($html, ENT_COMPAT, 'utf-8')));
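
A note on why this works, plus a hedged alternative: `utf8_decode()` converts the decoded UTF-8 string to ISO-8859-1 and replaces every character it cannot represent with a single `?`, so each character ends up as one byte and `strlen()` reports 20. Since `utf8_decode()` is deprecated as of PHP 8.2, the sketch below counts characters with `mb_strlen()` instead. It assumes the mbstring extension is loaded, which is an assumption on top of the answer above, not part of it.

```php
<?php
// Assumption: the mbstring extension is available.
$html = 'Goth-Trad &#8211; &#8216;Cosmos&#8217;';

// Decode the numeric entities into real UTF-8 characters.
$decoded = html_entity_decode($html, ENT_QUOTES, 'UTF-8');

// Count characters rather than bytes.
$length = mb_strlen($decoded, 'UTF-8');

echo $length; // 20
```

Unlike the `utf8_decode()` route, this leaves the decoded string intact, so it can still be displayed or truncated afterwards.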
Peter Krejci
4

Just decode it and count the length of the decoded string:

$string = html_entity_decode("Goth-Trad &#8211; &#8216;Cosmos&#8217;",ENT_QUOTES,"UTF-8");
echo strlen($string);
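
A rough sketch of where the numbers 38, 26 and 20 in this thread likely come from. Before PHP 5.4 the default charset of `html_entity_decode()` was ISO-8859-1, which cannot represent the dash or the curly quotes, so those entities are left as they are; the explicit `'ISO-8859-1'` argument below is only there to reproduce that behaviour on newer versions. With UTF-8 as the target, each of the three entities decodes to a 3-byte sequence, and `strlen()` counts bytes, not characters. The `mb_strlen()` call assumes the mbstring extension is loaded.

```php
<?php
$html = 'Goth-Trad &#8211; &#8216;Cosmos&#8217;';

// The target charset cannot hold these characters, so the entities stay as text.
$latin1    = html_entity_decode($html, ENT_QUOTES, 'ISO-8859-1');
$undecoded = strlen($latin1);              // 38

// Decoded to UTF-8: 17 ASCII bytes plus 3 characters of 3 bytes each.
$utf8  = html_entity_decode($html, ENT_QUOTES, 'UTF-8');
$bytes = strlen($utf8);                    // 26

// Character count instead of byte count (assumes mbstring).
$chars = mb_strlen($utf8, 'UTF-8');        // 20

echo "$undecoded $bytes $chars";
```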
Damien Pirsy
  • Unfortunately this exact code fragment still produces 38, when it should be 20. Any idea what could be going wrong? – Squrler Jan 02 '12 at 14:12
  • @Squrler you're right, I just tried it out and the PHP function cannot decode the entities. The code is right, though. I'll investigate the issue. – Damien Pirsy Jan 02 '12 at 14:21
  • @Damien, thanks! Peter below has just updated his answer which produces the results I was looking for. Thanks again for answering! – Squrler Jan 02 '12 at 14:24
  • @Squrler did setting the encoding to UTF-8 solve the problem? I get 26, though; see http://codepad.org/iizpEyVX – Damien Pirsy Jan 02 '12 at 14:27
  • @Squrler Damn, I've been too slow :) – Damien Pirsy Jan 02 '12 at 14:28
-1

Try the following function:

<?php   

$string='Goth-Trad &#8211; &#8216;Cosmos&#8217;'; 

echo html_entity_text_length($string); // Calling the function 

//html_entity_text_length function start

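function html_entity_text_length($string){
    // Match each entity: named (&amp;), decimal (&#8211;) or hex (&#x2013;)
    preg_match_all('/&(?:#\d+|#x[0-9a-f]+|[a-z][a-z0-9]*);/i', $string, $pat_array);
    $additional = 0;
    foreach ($pat_array[0] as $value) {
        // Each entity should count as 1, so discount its remaining characters
        $additional += strlen($value) - 1;
    }

    // Note: strlen() counts bytes, so literal multibyte characters
    // in $string are still counted once per byte
    return strlen($string) - $additional;
}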

//html_entity_text_length function end

?>