7

Here I have 2 methods using str_replace to replace strings in a given phrase.

// Method 1
$phrase  = "You should eat fruits, vegetables, and fiber every day.";
$healthy = array("fruits", "vegetables", "fiber");
$yummy   = array("pizza", "beer", "ice cream");
$phrase = str_replace($healthy, $yummy, $phrase);

// Method 2
$phrase  = "You should eat fruits, vegetables, and fiber every day.";
$phrase = str_replace("fruits", "pizza", $phrase);
$phrase = str_replace("vegetables", "beer", $phrase);
$phrase = str_replace("fiber", "ice cream", $phrase);

Which method is more efficient (in terms of execution time & resources used)?

Assume the real phrase is much longer (e.g. 50,000 characters), and the words to replace have a lot more pairs.

My thinking is that Method 2 calls str_replace 3 times, which costs more function calls; on the other hand, Method 1 creates 2 arrays, and str_replace has to walk both arrays at runtime.

Raptor
  • 1
    Neither is a good choice. If you have a long string and need to run str_replace on it repeatedly, why not save the result after str_replace? – ajreal Dec 13 '11 at 07:20
  • If you create the arrays $healthy and $yummy over and over again inside the loop, it's slower; not if you put them outside. – djot Dec 13 '11 at 07:33
  • 1
    You spent 10x longer asking this question than the difference it would make in 100's of thousands of executions. YOUR time is more valuable than such pointless optimizations ;) – landons Dec 13 '11 at 07:39
  • 1
    @landons Incorrect. I'm working against a strict Key Performance Indicator (KPI), where every millisecond is important. – Raptor Dec 13 '11 at 09:32
  • WHY DO THIS QUESTION & ANSWERS SCORE SO MANY -1 ? – Raptor Dec 13 '11 at 09:34
  • @ShivanRaptor Because you're optimizing something that really can't be optimized. Stuff like "is the array initialized within or without the loop?" could be important with a very tight loop, but probably not "should I use string arguments or array arguments"? At this pace, it will take you forever to finish, whether or not you have an acronym-ed reason for asking the question. – landons Dec 13 '11 at 21:43
  • Unless you're going to be running str_replace with huge searches and/or replaces on a huge number of huge strings, this question is pretty much moot and amounts to a micro-optimization. Whether your application is fast or slow will almost certainly not hinge on how you use str_replace. – GordonM Aug 23 '16 at 15:16

4 Answers

6

I would prefer Method 1, as it's cleaner and more organised. Method 1 also makes it possible to load the pairs from another source, e.g. a bad-words table in a database; Method 2 would need an extra loop of some sort.
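As a rough illustration of loading the pairs from another source, here is a minimal sketch. The `$rows` array and its `search` / `replacement` keys are hypothetical stand-ins for whatever a database query would return; they are not part of the original question.

```php
<?php
// Hypothetical rows, standing in for a "bad words" table fetched from a database.
$rows = array(
    array('search' => 'fruits',     'replacement' => 'pizza'),
    array('search' => 'vegetables', 'replacement' => 'beer'),
    array('search' => 'fiber',      'replacement' => 'ice cream'),
);

// array_column() pulls one column out of every row and keeps the row order,
// so the two arrays stay aligned pair by pair.
$search  = array_column($rows, 'search');
$replace = array_column($rows, 'replacement');

$phrase = "You should eat fruits, vegetables, and fiber every day.";
$result = str_replace($search, $replace, $phrase);

echo $result, PHP_EOL;
// prints: You should eat pizza, beer, and ice cream every day.
```

With Method 2, the same data would have to be walked in an explicit loop, calling str_replace once per row.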

<?php
$time_start = microtime(true);
for($i=0;$i<=1000000;$i++){
    // Method 1
    $phrase  = "You should eat fruits, vegetables, and fiber every day.";
    $healthy = array("fruits", "vegetables", "fiber");
    $yummy   = array("pizza", "beer", "ice cream");
    $phrase = str_replace($healthy, $yummy, $phrase);
}
$time_end = microtime(true);
$time = $time_end - $time_start;
echo "Did Test 1 in ($time seconds)\n<br />";



$time_start = microtime(true);
for($i=0;$i<=1000000;$i++){
    // Method2
    $phrase  = "You should eat fruits, vegetables, and fiber every day.";
    $phrase = str_replace("fruits", "pizza", $phrase);
    $phrase = str_replace("vegetables", "beer", $phrase);
    $phrase = str_replace("fiber", "ice cream", $phrase);

}
$time_end = microtime(true);
$time = $time_end - $time_start;
echo "Did Test 2 in ($time seconds)\n";
?>  

Did Test 1 in (3.6321988105774 seconds)

Did Test 2 in (2.8234610557556 seconds)


Edit: In a further test, with the string repeated 50,000 times, fewer iterations, and the array declarations moved outside the loop as ajreal advised, the difference is minuscule.
<?php
$phrase  = str_repeat("You should eat fruits, vegetables, and fiber every day.",50000);
$healthy = array("fruits", "vegetables", "fiber");
$yummy   = array("pizza", "beer", "ice cream");

$time_start = microtime(true);
for($i=0;$i<=10;$i++){
    // Method 1
    $phrase = str_replace($healthy, $yummy, $phrase);
}
$time_end = microtime(true);
$time = $time_end - $time_start;
echo "Did Test 1 in ($time seconds)\n<br />";



$time_start = microtime(true);
for($i=0;$i<=10;$i++){
    // Method2
    $phrase = str_replace("fruits", "pizza", $phrase);
    $phrase = str_replace("vegetables", "beer", $phrase);
    $phrase = str_replace("fiber", "ice cream", $phrase);

}
$time_end = microtime(true);
$time = $time_end - $time_start;
echo "Did Test 2 in ($time seconds)\n";
?>  

Did Test 1 in (1.1450328826904 seconds)

Did Test 2 in (1.3119208812714 seconds)
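One caveat about the second benchmark: `$phrase` is overwritten on every pass, so after the first iteration there is nothing left to replace, and Test 2 starts from Test 1's output. A possible fairer variant, sketched below, always starts from the untouched input. The helper name `time_it` and the variables `$original`, `$out1`, `$out2` are made up for this sketch; no timings are claimed here, since they depend on the machine and PHP version.

```php
<?php
// Hypothetical helper: runs $fn $iterations times and returns elapsed seconds.
function time_it(callable $fn, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

$original = str_repeat("You should eat fruits, vegetables, and fiber every day.", 50000);
$healthy  = array("fruits", "vegetables", "fiber");
$yummy    = array("pizza", "beer", "ice cream");

// Method 1: one call with arrays, always reading the untouched $original.
$out1 = '';
$t1 = time_it(function () use ($original, $healthy, $yummy, &$out1) {
    $out1 = str_replace($healthy, $yummy, $original);
}, 10);

// Method 2: three chained calls, also starting from the untouched $original.
$out2 = '';
$t2 = time_it(function () use ($original, &$out2) {
    $tmp  = str_replace("fruits", "pizza", $original);
    $tmp  = str_replace("vegetables", "beer", $tmp);
    $out2 = str_replace("fiber", "ice cream", $tmp);
}, 10);

printf("Method 1: %.4f s\nMethod 2: %.4f s\n", $t1, $t2);

// Sanity check: both methods should produce the same text for these pairs.
var_dump($out1 === $out2);
```

Because every iteration does real replacement work, the two timings are at least comparing the same job.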

Lawrence Cherone
  • Yeah, but I'd sacrifice that fraction of a second (roughly 0.8 s over 1 million iterations) for cleaner code and scalability. – Lawrence Cherone Dec 13 '11 at 07:24
  • 3
    Can I suggest you put the array declarations outside of the loop? – ajreal Dec 13 '11 at 07:27
  • This performance difference is about what I'd expect. Method 1 should be faster. I also expect the difference to grow the more replacements you're doing. If you were to increase the number of items being replaced to 10 or 20, you'd probably see something. Also, `$phrase` can be an array if you need to do the same replacement on several strings. This can also make a big difference (I expect). – Okonomiyaki3000 Jan 21 '17 at 16:07
4

Even though this is old, this benchmark is incorrect.

Thanks to an anonymous user:

"This test is wrong, because when test 3 starts, $phrase is using the results of test 2, in which there is nothing to replace.

When I add $phrase = "You should eat fruits, vegetables, and fiber every day."; before test 3, the results are: Did Test 1 in (4.3436799049377 seconds), Did Test 2 in (5.7581660747528 seconds), Did Test 3 in (7.5069718360901 seconds)."

        <?php
        $time_start = microtime(true);

        $healthy = array("fruits", "vegetables", "fiber");
        $yummy   = array("pizza", "beer", "ice cream");

        for($i=0;$i<=1000000;$i++){
            // Method 1
            $phrase  = "You should eat fruits, vegetables, and fiber every day.";
            $phrase = str_replace($healthy, $yummy, $phrase);
        }
        $time_end = microtime(true);
        $time = $time_end - $time_start;
        echo "Did Test 1 in ($time seconds)<br /><br />";



        $time_start = microtime(true);
        for($i=0;$i<=1000000;$i++){
            // Method2
            $phrase  = "You should eat fruits, vegetables, and fiber every day.";
            $phrase = str_replace("fruits", "pizza", $phrase);
            $phrase = str_replace("vegetables", "beer", $phrase);
            $phrase = str_replace("fiber", "ice cream", $phrase);

        }
        $time_end = microtime(true);
        $time = $time_end - $time_start;
        echo "Did Test 2 in ($time seconds)<br /><br />";




        $time_start = microtime(true);
        for($i=0;$i<=1000000;$i++){
                foreach ($healthy as $k => $v) {
                  if (strpos($phrase, $healthy[$k]) === FALSE)  
                  unset($healthy[$k], $yummy[$k]);
                }                                          
                if ($healthy) $new_str = str_replace($healthy, $yummy, $phrase);

        }
        $time_end = microtime(true);
        $time = $time_end - $time_start;
        echo "Did Test 3 in ($time seconds)<br /><br />";

        ?>  

Did Test 1 in (3.5785729885101 seconds)

Did Test 2 in (3.8501658439636 seconds)

Did Test 3 in (0.13844394683838 seconds)

djot
3

Although not directly asked in the question, the OP does state:

Assume the real phrase is much longer (e.g. 50,000 characters), and the words to replace have a lot more pairs.

In which case, if you don't need (or want) replacements within replacements, it may be much more efficient to use a preg_replace_callback solution so that the entire string is only processed once, not once for each pair of replacements.
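To make "replacements within replacements" concrete, here is a small sketch with made-up pairs (a becomes b, b becomes c). It compares str_replace with a single regex pass; the built-in strtr() with an array argument is included as a further single-pass option for comparison, although the answer itself does not use it.

```php
<?php
$search  = array("a", "b");
$replace = array("b", "c");
$subject = "ab";

// str_replace() applies the pairs one after another over the whole string,
// so the "b" produced by the first pair is replaced again by the second pair.
$chained = str_replace($search, $replace, $subject);

// A single regex pass visits each position of the original string once,
// so text that has already been replaced is never scanned again.
$lookup = array_combine($search, $replace);
$singlePass = preg_replace_callback(
    '/a|b/',
    function ($m) use ($lookup) {
        return $lookup[$m[0]];
    },
    $subject
);

// strtr() with an array also works in one pass over the original string.
$translated = strtr($subject, $lookup);

echo $chained, PHP_EOL;    // cc
echo $singlePass, PHP_EOL; // bc
echo $translated, PHP_EOL; // bc
```

Whether the chained behaviour is a bug or a feature depends on the data; the point is only that the two approaches are not interchangeable when a replacement can contain another search term.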

Here's a generic function which, in my case with a 1.5 MB string and ~20,000 pairs of replacements, was about 10x faster. Because it had to split the replacements into chunks to get around "regular expression is too large" errors, it could in principle have made replacements within replacements, in no predictable order (in my particular case this was not possible, however).

In my particular case I was able to optimize this further, to about a 100x performance gain, because my search strings all followed a particular pattern. (PHP version 7.1.11 on Windows 7 32-bit.)

function str_replace_bulk($search, $replace, $subject, &$count = null) {
  // Assumes $search and $replace are equal sized arrays
  $lookup = array_combine($search, $replace);
  $result = preg_replace_callback(
    '/' .
      implode('|', array_map(
        function($s) {
          return preg_quote($s, '/');
        },
        $search
      )) .
    '/',
    function($matches) use($lookup) {
      return $lookup[$matches[0]];
    },
    $subject,
    -1,
    $count
  );
  if (
    $result !== null ||
    count($search) < 2 // avoid infinite recursion on error
  ) {
    return $result;
  }
  // With a large number of replacements (> ~2500?), 
  // PHP bails because the regular expression is too large.
  // Split the search and replacements in half and process each separately.
  // NOTE: replacements within replacements may now occur, indeterminately.
  $split = (int)(count($search) / 2);
  error_log("Splitting into 2 parts with ~$split replacements");
  $result = str_replace_bulk(
    array_slice($search, $split),
    array_slice($replace, $split),
    str_replace_bulk(
      array_slice($search, 0, $split),
      array_slice($replace, 0, $split),
      $subject,
      $count1
    ),
    $count2
  );
  $count = $count1 + $count2;
  return $result;
}
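The pattern-specific optimization mentioned above is not shown in the answer. As a guess at what such a version might look like, the sketch below assumes every search key has the form `{{name}}`; that format, the function name `replace_placeholders`, and the sample data are all assumptions made for illustration.

```php
<?php
// Assumption (not from the answer): every search key looks like {{name}}.
// One fixed regex finds all candidates; the lookup array decides what to do.
function replace_placeholders(string $subject, array $lookup, &$count = null)
{
    $count = 0;
    return preg_replace_callback(
        '/\{\{\w+\}\}/',
        function ($m) use ($lookup, &$count) {
            if (array_key_exists($m[0], $lookup)) {
                $count++;
                return $lookup[$m[0]];
            }
            return $m[0]; // unknown placeholder: leave it untouched
        },
        $subject
    );
}

$lookup = array(
    '{{fruit}}' => 'pizza',
    '{{drink}}' => 'beer',
);

$out = replace_placeholders('Eat {{fruit}}, drink {{drink}}, skip {{other}}.', $lookup, $n);

echo $out, PHP_EOL; // Eat pizza, drink beer, skip {{other}}.
echo $n, PHP_EOL;   // 2
```

The regex stays the same size no matter how many pairs there are, so it should not run into the "regular expression is too large" limit, and each match costs one array lookup.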
Jake
  • Great! This is exactly what I was looking for! Tested the code and it gives O(kn) performance (k=string length, n is number of replacements), as opposed to O(kn²) performance for the str_replace option. – IanS Feb 22 '19 at 12:16
1

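@djot, you have an error in the loop below: it unsets entries from $healthy and $yummy themselves, so once they have been emptied they stay empty for every later iteration and str_replace is never called.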

<?php
     foreach ($healthy as $k => $v) {
        if (strpos($phrase, $healthy[$k]) === FALSE)  
             unset($healthy[$k], $yummy[$k]);
        }  

Here is a fixed version (Test 3 now works on copies, $a and $b) and a better, simpler new Test 4:

<?php 
 $time_start = microtime(true);

        $healthy = array("fruits", "vegetables", "fiber");
        $yummy   = array("pizza", "beer", "ice cream");

        for($i=0;$i<=1000000;$i++){
            // Method 1
            $phrase  = "You should eat fruits, vegetables, and fiber every day.";
            $phrase = str_replace($healthy, $yummy, $phrase);
        }
        $time_end = microtime(true);
        $time = $time_end - $time_start;
        echo "Did Test 1 in ($time seconds)". PHP_EOL. PHP_EOL;



        $time_start = microtime(true);
        for($i=0;$i<=1000000;$i++){
            // Method2
            $phrase  = "You should eat fruits, vegetables, and fiber every day.";
            $phrase = str_replace("fruits", "pizza", $phrase);
            $phrase = str_replace("vegetables", "beer", $phrase);
            $phrase = str_replace("fiber", "ice cream", $phrase);

        }
        $time_end = microtime(true);
        $time = $time_end - $time_start;
        echo "Did Test 2 in ($time seconds)" . PHP_EOL. PHP_EOL;




        $time_start = microtime(true);
        for($i=0;$i<=1000000;$i++){
            $a = $healthy;
            $b = $yummy;
                foreach ($healthy as $k => $v) {
                  if (strpos($phrase, $healthy[$k]) === FALSE)  
                  unset($a[$k], $b[$k]);
                }                                          
                if ($a) $new_str = str_replace($a, $b, $phrase);

        }
        $time_end = microtime(true);
        $time = $time_end - $time_start;
        echo "Did Test 3 in ($time seconds)". PHP_EOL. PHP_EOL;



        $time_start = microtime(true);
        for($i=0;$i<=1000000;$i++){
            $ree = false;
            foreach ($healthy as $k) {
              if (strpos($phrase, $k) !== FALSE)  { //something to replace
                  $ree = true;
                  break;
              }
            }                                          
            if ($ree === true) {
                $new_str = str_replace($healthy, $yummy, $phrase);
            }
        }
        $time_end = microtime(true);
        $time = $time_end - $time_start;
        echo "Did Test 4 in ($time seconds)". PHP_EOL. PHP_EOL;

Did Test 1 in (0.38219690322876 seconds)

Did Test 2 in (0.42352104187012 seconds)

Did Test 3 in (0.47777700424194 seconds)

Did Test 4 in (0.19691610336304 seconds)

Emilio