
Certain characters have special significance in HTML, and should be represented by HTML entities if they are to preserve their meanings.

As far as I know, this can be done in two different ways in PHP. Like this:

<?php

   $some_code = '<a href="#test">Test</a>';

   echo '<pre><code>' . htmlspecialchars( $some_code, ENT_QUOTES ) . '</code></pre>';

?>

Or this way:

<?php

   $some_code = '<a href="#test">Test</a>';

   echo '<pre><code>' . str_replace( array('<', '>', '&', '\'', '"'), array('&lt;', '&gt;', '&amp;', '&apos;', '&quot;'), $some_code ) . '</code></pre>';

?>

(That's just to show you what I am trying to do, and not how I am doing it in reality. For example, the $some_code is provided dynamically, not manually.)

Setting aside how much easier it is to simply use htmlspecialchars() than str_replace(), which of the two would be the better choice for what I am trying to do, in terms of performance?


UPDATE

Okay, I see that this needs more context. This is what I am actually trying to do:

<?php

    $some_code = '<a href="#test">Test</a>';

    echo '<pre><code>' . str_replace(

        // Replace these special characters
        array( '<', '>', '&', '\'', '"', '‘', '’', '“', '”', '/', '[', ']' ),

        // With the HTML entities below, respectively
        // (one entity per search character, in the same order)
        array('&lt;', '&gt;', '&amp;', '&apos;', '&quot;', '&apos;', '&apos;', '&quot;', '&quot;', '&#47;', '&#91;', '&#93;'),

        $some_code

    ) . '</code></pre>';

?>

VERSUS:

<?php

    $some_code = '<a href="#test">Test</a>';

    return '<pre><code>' . str_replace(

        array( '‘', '’', '“', '”', '/', '[', ']' ),

        array('&apos;', '&apos;', '&quot;', '&quot;', '&#47;', '&#91;', '&#93;'),

        htmlspecialchars( $some_code, ENT_QUOTES )

    ) . '</code></pre>';

?>
its_me
  • This is what htmlspecialchars() is meant for; there is no need to complicate it by doing it differently unless you are tailoring something special. If you suspect that you might change it in the future, make your own Html_Spes function with htmlspecialchars inside, so you only have to alter it in one place. – Tom Sep 28 '13 at 15:01
  • @Tom Please take a look at the update in my question. I actually need to replace more characters; guess I should have mentioned that initially. – its_me Sep 28 '13 at 15:06
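
The wrapper function suggested in the first comment, combined with the extra characters from the update, might look roughly like the sketch below. The function name `escape_code_for_html` is hypothetical, and the extra-character map is simply the one from the question; this is one possible arrangement, not the only one.

```php
<?php

/**
 * Hypothetical helper: escape a code snippet for display inside <pre><code>.
 * The name and the extra-character map are illustrative assumptions.
 */
function escape_code_for_html(string $code): string
{
    // Standard HTML special characters: & < > " '
    $escaped = htmlspecialchars($code, ENT_QUOTES, 'UTF-8');

    // Extra characters from the question, applied afterwards. None of these
    // search strings occurs inside the entities produced above, so nothing
    // is escaped twice.
    return str_replace(
        array('‘', '’', '“', '”', '/', '[', ']'),
        array('&apos;', '&apos;', '&quot;', '&quot;', '&#47;', '&#91;', '&#93;'),
        $escaped
    );
}

echo '<pre><code>' . escape_code_for_html('<a href="#test">Test</a>') . '</code></pre>';
```

Keeping the escaping in one function means the character list can change later without touching every call site.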

2 Answers

1

You should move & and &amp; to the start of each array to avoid double-escaping. After that, I’d suggest using just str_replace, since it makes what you’re trying to do more obvious (to me, anyways — nested function calls can be confusing!) but it’s really up to you. The performance difference won’t be noticeable; a string that big would cause other problems.
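
To illustrate the ordering issue, here is a minimal sketch using only the three characters that matter for the effect. The variable names `$wrong` and `$right` are just for illustration.

```php
<?php

$some_code = '<a href="#test">Test</a>';

// '&' listed after '<' and '>': the ampersands introduced by the earlier
// replacements are themselves replaced again.
$wrong = str_replace(
    array('<', '>', '&'),
    array('&lt;', '&gt;', '&amp;'),
    $some_code
);

// '&' listed first: every character is escaped exactly once.
$right = str_replace(
    array('&', '<', '>'),
    array('&amp;', '&lt;', '&gt;'),
    $some_code
);

echo $wrong, "\n"; // &amp;lt;a href="#test"&amp;gt;Test&amp;lt;/a&amp;gt;
echo $right, "\n"; // &lt;a href="#test"&gt;Test&lt;/a&gt;
```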

Ry-
  • _"Your `str_replace` is going to double-escape `<` and `>`."_ I don't understand. Could you please clarify a bit? – its_me Sep 28 '13 at 15:02
  • @its_me: Passing an array to `str_replace` performs the replacements in order. `<` and `>` get turned into `&lt;` and `&gt;`, and then the next replacement turns those into `&amp;lt;` and `&amp;gt;`. http://codepad.viper-7.com/usA85U – Ry- Sep 28 '13 at 15:03
  • In that case, what would be the right way to do it? Also please take a look at the update in my question. It should give you more perspective on what I am actually trying to do. (+1) – its_me Sep 28 '13 at 15:08
  • @its_me: You would just make `&` and `&amp;` the first elements in the array. Also, is the difference between your two examples the fact that there are smart quotes? (i.e. `“` instead of `"`?) Why do you need to escape those? – Ry- Sep 28 '13 at 15:09
  • @its_me: Anyways, in that case, pick either one. I’d go for `str_replace`, because it’s more clear that you want to replace a whole set of characters. The `htmlspecialchars` is easier to miss. – Ry- Sep 28 '13 at 15:10
  • The difference is that I also need to escape smart quotes, forward slash and square brackets (required to work without issues in my application). As for escaping smart quotes, some code on the web has them instead of straight quotes. I want to auto-fix them without having to depend on my users. – its_me Sep 28 '13 at 15:11
1

You should definitely go with htmlspecialchars(). I ran a few benchmarks, and for 100,000 iterations I got these results (times in seconds):

htmlspecialchars took 0.15800881385803 to finish
htmlentities took 0.20201182365417 to finish
str_replace took 0.81704616546631 to finish 

You can check it yourself with this code:

<?php
$orgy = '<div style="background:#ffc">Hello World</div>';
$startTime = microtime(true);
for($i=0; $i<100000; $i++)
{
    $tmp = htmlspecialchars($orgy);
}
echo "htmlspecialchars took " . (microtime(true) - $startTime) . " to finish<br />";

$startTime = microtime(true);
for($i=0; $i<100000; $i++)
{
    $tmp = htmlentities($orgy);
}
echo "htmlentities took " . (microtime(true) - $startTime) . " to finish<br />";

$startTime = microtime(true);
for($i=0; $i<100000; $i++)
{
    $tmp = str_replace(array('&','<','>','\\','/','"','\''), array('&amp;','&lt;','&gt;','&#92;','&#47;','&quot;','&#039;'), $orgy);
}
echo "str_replace took " . (microtime(true) - $startTime) . " to finish\n";
?>
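
For a like-for-like comparison, one option is to have both candidates escape the same five characters and confirm that they produce identical output before timing them. The sketch below assumes PHP 7.3+ for `hrtime()`. The helper `time_it` and the closure names are hypothetical, and no timings are shown because they depend on the machine and PHP version.

```php
<?php

// Hypothetical helper: run $fn $iterations times, return elapsed seconds.
function time_it(callable $fn, int $iterations = 100000): float
{
    $start = hrtime(true); // monotonic clock, in nanoseconds
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return (hrtime(true) - $start) / 1e9;
}

$input = '<div style="background:#ffc">Hello World</div>';

// Same five characters that htmlspecialchars() handles with ENT_QUOTES,
// with '&' first to avoid double-escaping.
$search  = array('&', '<', '>', '"', '\'');
$replace = array('&amp;', '&lt;', '&gt;', '&quot;', '&#039;');

$viaHtmlspecialchars = function () use ($input) {
    return htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
};
$viaStrReplace = function () use ($input, $search, $replace) {
    return str_replace($search, $replace, $input);
};

// Sanity check: both candidates should yield the same string.
var_dump($viaHtmlspecialchars() === $viaStrReplace());

printf("htmlspecialchars: %.4f s\n", time_it($viaHtmlspecialchars));
printf("str_replace:      %.4f s\n", time_it($viaStrReplace));
```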
Airy