
I'm having a memory issue while testing a find/replace function.

Say the search subject is:

$subject = "

I wrote an article in the A+ magazine. It'\s very long and full of words. I want to replace every A+ instance in this text by a link to a page dedicated to A+.

";

The string to be found:

$find='A+';
$find = preg_quote($find,'/');

The replace callback function:

 function replaceCallback($match)
    {
      if (is_array($match)) {
          return '<a class="tag" rel="tag-definition" title="Click to know more about ' .stripslashes($match[0]) . '" href="?tag=' . $match[0]. '">' . stripslashes($match[0])  . '</a>';
      }
    }

And the call:

$result = preg_replace_callback($find, 'replaceCallback', $subject);

Now, the complete search pattern is drawn from the database. As of now, it is:

$find = '/(?![^<]+>)\b(voice recognition|test project reference|test|synesthesia|Superflux 2007|Suhjung Hur|scripts|Salvino a. Salvaggio|Professional Lighting Design Magazine|PLDChina|Nicolas Schöffer|Naziha Mestaoui|Nabi Art Center|Markos Novak|Mapping|Manuel Abendroth|liquid architecture|LAb[au] laboratory for Architecture and Urbanism|l'Arca Edizioni|l' ARCA n° 176 _ December 2002|Jérôme Decock|imagineering|hypertext|hypermedia|Game of Life|galerie Roger Tator|eversion|El Lissitzky|Bernhard Tschumi|Alexandre Plennevaux|A+)\b/s';

This $find pattern is then searched for (and replaced where found) in 23 columns across 7 MySQL tables.

Using the suggested preg_replace() instead of preg_replace_callback() seems to have solved the memory issue, but I'm having new issues down the line: the subject returned by preg_replace() is missing a lot of content...
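One thing that may be worth ruling out when content goes missing: the preg_* functions return null when the regex engine fails (for example on a backtrack limit, or on invalid UTF-8 when the u modifier is set), and writing that null back to the database would look like lost content. A minimal sketch of a guard, assuming a hypothetical helper named replaceOrKeep() that is not part of the original code:

```php
<?php
// Hypothetical helper (not from the original post): run a callback replacement
// and fall back to the untouched subject if PCRE reports an error.
function replaceOrKeep(string $pattern, callable $callback, string $subject): string
{
    $result = preg_replace_callback($pattern, $callback, $subject);

    // preg_replace_callback() returns null when the engine fails.
    if ($result === null) {
        // preg_last_error() gives one of the PREG_*_ERROR constants.
        error_log('PCRE error code: ' . preg_last_error());
        return $subject;
    }

    return $result;
}
```

With a guard like this, a failed pass leaves the column as it was instead of emptying it, and the logged error code says which limit was hit.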

UPDATE:

The content loss was due to using preg_quote($find,'/'). It now works, except for 'A+', which becomes 'A ' after the process.
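A possible way to keep preg_quote() without damaging the pattern is to quote each tag on its own and only then join the tags with "|". The sketch below assumes a shortened, illustrative tag list and that the database strings are valid UTF-8 (hence the u modifier). It also swaps the original \b for the lookarounds (?<!\w) and (?!\w), because \b needs a word character on one side and so cannot match between "+" and a following space.

```php
<?php
// Tags as they might come from the database (shortened, illustrative list).
$tags = ['test project reference', 'test', 'LAb[au]', "l'Arca Edizioni", 'A+'];

// Longest first, so "test project reference" wins over "test".
usort($tags, function ($a, $b) { return strlen($b) - strlen($a); });

// Quote each tag individually; the pattern syntax around them stays intact.
$quoted = array_map(function ($tag) { return preg_quote($tag, '/'); }, $tags);

// (?<!\w) and (?!\w) stand in for \b so that tags ending in a non-word
// character, such as "A+", can still be matched as whole tags.
$find = '/(?![^<]+>)(?<!\w)(' . implode('|', $quoted) . ')(?!\w)/su';

$subject = 'I wrote an article in the A+ magazine.';
preg_match($find, $subject, $m);
echo $m[1], "\n"; // prints: A+
```

Quoting per tag also takes care of entries such as "LAb[au]", whose brackets would otherwise be read as a character class.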

Chad Birch
pixeline
  • Is your actual search string more complex? Because for this example you don't need preg anything - str_replace() would work – Peter Bailey Mar 31 '09 at 14:34
  • It is. In fact I'm calling this function for a lot of potential "tags". That's why I have to make it trouble-proof, which is a failure right now. – pixeline Mar 31 '09 at 14:49
  • What is the memory limit in your php.ini set to? – jmucchiello Mar 31 '09 at 14:56
  • I tried increasing it with ini_set('memory_limit','50M'); and still have the issue. In the list of tags that I have to find there are French special characters, like "ê, é". Could that be the issue? – pixeline Mar 31 '09 at 15:16
  • Can you give us a better idea of your sample sizes? The pared-down example above is actually hurting our ability to help you right now. – Peter Bailey Mar 31 '09 at 15:28
  • OK, I'm re-editing the question. Thanks for sticking around! – pixeline Mar 31 '09 at 15:39

3 Answers


I'm trying to reproduce your error but there's a parse error that needs to be fixed first. Either this isn't enough code to be a good sample or there's genuinely a bug.

First of all, the value you store in $find is not a full pattern - so I had to add pattern delimiters.

Secondly, your replace string doesn't include the closing element for the anchor tags.

$subject = "
I wrote an article in the A+ magazine. It'\s very long and full of words. I want to replace every A+ instance in this text by a link to a page dedicated to A+.
";

$find='A+';
$find = preg_quote($find,'/');

function replaceCallback($match)
{
  if (is_array($match)) {
      return '<a class="tag" rel="tag-definition" title="Click to know more about ' .stripslashes($match[0]) . '" href="?tag=' . $match[0]. '">' . stripslashes($match[0])  . '</a>';
  }
}

$result = preg_replace_callback( "/$find/", 'replaceCallback', $subject);

echo $result;

This code works, but I'm not sure it's what you want. Also, I have a strong suspicion that you don't need preg_replace_callback() at all.
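For the single-tag case, a version without the callback might look like the sketch below. It assumes a fixed, literal tag and drops the title attribute for brevity. The variable names are illustrative, and rawurlencode() is used so the "+" in the href is not later decoded as a space.

```php
<?php
$subject = 'I wrote an article in the A+ magazine. I want to link every A+ in this text.';
$tag     = 'A+';

// Build the link once; rawurlencode() turns "+" into "%2B".
$link = '<a class="tag" rel="tag-definition" href="?tag=' . rawurlencode($tag) . '">'
      . htmlspecialchars($tag, ENT_COMPAT, 'UTF-8') . '</a>';

// For a literal tag, plain string replacement is enough.
$result = str_replace($tag, $link, $subject);

// The regex equivalent; "$" and "\" in the replacement are escaped so they
// are not read as backreferences.
$pattern     = '/' . preg_quote($tag, '/') . '/';
$replacement = addcslashes($link, '\\$');
$sameResult  = preg_replace($pattern, $replacement, $subject);

var_dump($result === $sameResult);
```

Once the replacement depends on which tag matched, as with a long alternation of tags, the callback becomes the more natural tool again.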

Peter Bailey
  • Can you elaborate on your last statement? I'm really not good with regular expressions, so your help is much appreciated. Thanks! – pixeline Mar 31 '09 at 15:18
  • The only reason to use preg_replace_callback() in place of preg_replace() is if you need to execute other statements on the matched values before replacing them (such as htmlspecialchars()). – Peter Bailey Mar 31 '09 at 15:21

This works for me. I had to change the pattern a bit, but it turns every A+ into a link. You are also missing a </a> at the end.

$subject = "I wrote an article in the A+ magazine. It'\s very long and full of words. I want to replace every A+ instance in this text by a link to a page dedicated to A+.";

function replaceCallback($match)
{
    if (is_array($match)) 
    {
        return '<a class="tag" rel="tag-definition" title="Click to know more about ' .stripslashes($match[0]) . '" href="?tag=' . $match[0]. '">' . stripslashes($match[0])  . '</a>';
    }
}

$result = preg_replace_callback("/A\+/", "replaceCallback", $subject);

echo $result;
Ólafur Waage

Alright - I can see now why you're using the callback.

First of all, I'd change your callback to this:

function replaceCallback( $match )
{
    if ( is_array( $match ) )
    {
        $htmlVersion    = htmlspecialchars( $match[1], ENT_COMPAT, 'UTF-8' );
        $urlVersion     = urlencode( $match[1] );
        return '<a class="tag" rel="tag-definition" title="Click to know more about ' . $htmlVersion . '" href="?tag=' . $urlVersion. '">' . $htmlVersion  . '</a>';
    }
    return $match;
}

The stripslashes() calls aren't going to do you any good.
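A quick illustration of what each function does to a tag, as a sketch rather than part of the fix itself. "R&D" is a made-up tag used only here. The urldecode() line may be related to the 'A ' symptom, since a raw "+" in a query string is read back as a space.

```php
<?php
$tag = 'A+';

// stripslashes() only removes backslashes; the matched text has none.
var_dump(stripslashes($tag) === $tag);   // bool(true)

// A raw "+" in a query string is decoded as a space on the receiving page.
var_dump(urldecode('A+'));               // string(2) "A "

// urlencode() protects it for the href ...
echo urlencode($tag), "\n";              // A%2B

// ... and htmlspecialchars() protects characters such as "&" in visible text.
echo htmlspecialchars('R&D', ENT_COMPAT, 'UTF-8'), "\n";   // R&amp;D
```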

As far as addressing the memory issue, you may want to break down your pattern into multiple patterns and execute them in a loop. I think your pattern is just too big/complex for PHP to handle in a single call.
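A rough sketch of that loop, under a few assumptions: linkTags() and its $chunkSize parameter are hypothetical, each tag is quoted on its own with preg_quote(), and \b is swapped for (?<!\w) / (?!\w) so that tags like "A+" can match. The extra lookahead (?![^<]*<\/a>) is an addition of mine to keep a later pass from linking text that an earlier pass already wrapped in a link.

```php
<?php
function replaceCallback($match)
{
    $htmlVersion = htmlspecialchars($match[1], ENT_COMPAT, 'UTF-8');
    $urlVersion  = urlencode($match[1]);
    return '<a class="tag" rel="tag-definition" title="Click to know more about '
         . $htmlVersion . '" href="?tag=' . $urlVersion . '">' . $htmlVersion . '</a>';
}

// Hypothetical helper: apply the tag list in several smaller passes.
function linkTags(string $subject, array $tags, int $chunkSize = 10): string
{
    // Longest tags first, so multi-word tags are linked before their substrings.
    usort($tags, function ($a, $b) { return strlen($b) - strlen($a); });

    foreach (array_chunk($tags, $chunkSize) as $chunk) {
        $quoted = array_map(function ($t) { return preg_quote($t, '/'); }, $chunk);

        // (?![^<]+>) skips text inside tags; (?![^<]*<\/a>) skips text that
        // is already the content of a link.
        $pattern = '/(?![^<]+>)(?![^<]*<\/a>)(?<!\w)('
                 . implode('|', $quoted) . ')(?!\w)/s';

        $result = preg_replace_callback($pattern, 'replaceCallback', $subject);
        if ($result !== null) { // keep the previous text if this pass failed
            $subject = $result;
        }
    }

    return $subject;
}

echo linkTags('A test project reference and a test in A+.',
              ['test', 'test project reference', 'A+'], 1), "\n";
```

Smaller chunks mean more passes over each column, so the chunk size is a trade-off between per-call complexity and total running time.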

Peter Bailey
  • Everything seems to run fine now with your suggestion, except that I can't get "A+" to display as "A+"; it displays as "A ". But I can see it sitting correctly in the database, so I guess the issue is somewhere beyond that point. Thanks! – pixeline Apr 01 '09 at 08:56
  • Oops, I spoke too fast: the + sign is removed by the replaceCallback function. Hmm. – pixeline Apr 01 '09 at 09:02