I am currently using preg_match_all() to find all words that begin with a specific prefix. For example, if the prefix is cat, then catsup would be considered a match whereas housecat would not.
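To make the matching rule concrete, here is a minimal sketch of prefix matching at a word boundary. The sample text is made up, and preg_quote() stands in for the sanitizePreffix() helper, whose behavior is not shown in the question.

```php
<?php
// Hypothetical sample data; the prefix and text are illustrative only.
$prefix = 'cat';
$body   = 'Pass the catsup to the housecat near the catalog.';

// \b anchors the match at a word boundary and \w* consumes the rest of the word.
// preg_quote() is an assumption standing in for the question's sanitizePreffix().
$pattern = '/\b' . preg_quote($prefix, '/') . '\w*/';

// PREG_OFFSET_CAPTURE returns each match as [matched text, byte offset].
preg_match_all($pattern, $body, $matches, PREG_OFFSET_CAPTURE);

foreach ($matches[0] as [$word, $offset]) {
    echo $word, ' @ ', $offset, PHP_EOL;
}
```

Because there is no word boundary between the "e" and the "c" in housecat, that word is skipped while catsup and catalog are reported.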
Once these instances and their offsets are found, I am cycling through them and essentially encapsulating them with an anchor tag.
(Question Continued Below Code)
//Escape all non-standard characters
$preffix = sanitizePreffix($part['modlnoPreffix']);
//All words starting with the preffix string:
//\b anchors at a word boundary, \w* takes the rest of the word
$pattern = '/\b' . $preffix . '\w*/';
//Find Matches
preg_match_all($pattern, $item['body'], $matches, PREG_OFFSET_CAPTURE);
//Work from the last match backwards so earlier offsets stay valid after each insert
$matches = array_reverse($matches[0]);
if (count($matches) > 0) {
    foreach ($matches as $match) {
        $text = $match[0];
        $offset = (int)$match[1];
        $endOffset = $offset + strlen($text);
        $url = "/specsheet_getPreffixParts.php?m=" . urlencode($text);
        //Insert ending </a> Tag
        $item['body'] = str_insert('</a>', $item['body'], $endOffset);
        //Insert Starting <a ...> Tag
        $item['body'] = str_insert("<a rel='" . $url . "' href='javascript:void(0);'>", $item['body'], $offset);
    }
}
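Note that str_insert() is not a PHP built-in, so it is presumably a custom helper. A minimal sketch of what it might look like, assuming the argument order used in the calls above (insert text, subject, offset):

```php
<?php
// Hypothetical helper (an assumption, not the question's actual implementation):
// inserts $insert into $subject at byte offset $offset.
function str_insert(string $insert, string $subject, int $offset): string
{
    // A length of 0 makes substr_replace() insert rather than overwrite.
    return substr_replace($subject, $insert, $offset, 0);
}

$body = 'Pass the catsup.';
// Insert the closing tag first so the offset of the opening tag is unaffected.
$body = str_insert('</a>', $body, 15);         // right after "catsup"
$body = str_insert("<a href='#'>", $body, 9);  // right before "catsup"
echo $body, PHP_EOL;
```

Inserting at the higher offset first is the same reason the matches are reversed before the loop: earlier offsets remain valid.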
The one catch is that I need to check each resulting match to make sure that:
- The result is not already linked, like <a href='...'>catsup</a>
- The result is not within the opening <a> tag itself, like <a href='/part/catsup'> ... </a>
I'm sure I could easily create a function that would step backwards one character at a time searching for <a and then step forward one character at a time looking for </a>, but this seems a bit silly to me.
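For reference, the manual check described above does not need to walk character by character. A rough sketch, where isInsideAnchor() is a hypothetical name and the HTML is assumed to be well-formed with no nested anchors:

```php
<?php
// Hypothetical helper: true when $offset lies inside an opening <a ...> tag
// or between an opening <a ...> and its closing </a>.
function isInsideAnchor(string $html, int $offset): bool
{
    $before = substr($html, 0, $offset);

    // Find every "<a" followed by whitespace or ">" (so <abbr> etc. are ignored).
    if (!preg_match_all('/<a[\s>]/i', $before, $m, PREG_OFFSET_CAPTURE)) {
        return false; // no anchor opened before this offset
    }
    $lastOpen  = end($m[0])[1];
    $lastClose = strripos($before, '</a>');

    // Inside an anchor if the most recent <a has not been closed yet.
    return $lastClose === false || $lastOpen > $lastClose;
}

$html = "<a href='/part/catsup'>catsup</a> and catsup";
var_dump(isInsideAnchor($html, strrpos($html, 'catsup')));
```

The last catsup in the sample sits after the closing </a>, so the check reports it as not linked.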
My question is: is there a better way to do this? My initial instinct is to make this part of the initial search pattern used by preg_match_all. In other words:
How would I find all words that start with 'cat' but are not located between a '<a' and a '</a>'?
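One possible way to fold the exclusion into the pattern itself is PCRE's (*SKIP)(*FAIL) idiom: the first alternative consumes a whole anchor element and then deliberately fails, so only prefix words outside anchors are reported. This is a sketch under assumptions (well-formed, non-nested anchors; preg_quote() standing in for sanitizePreffix()), not a general HTML parser.

```php
<?php
// Hypothetical sample data for illustration.
$prefix = 'cat';
$body   = "<a href='/part/catsup'>catsup</a> and catsup, not housecat";

// Alternative 1: match an entire <a ...>...</a> element, then discard it with
// (*SKIP)(*FAIL) so the regex engine resumes after the closing tag.
// Alternative 2: the actual target, a word starting with the prefix.
$pattern = '~(?i:<a(?:\s[^>]*)?>.*?</a>)(*SKIP)(*FAIL)|\b'
         . preg_quote($prefix, '~') . '\w*~s';

preg_match_all($pattern, $body, $matches, PREG_OFFSET_CAPTURE);

foreach ($matches[0] as [$word, $offset]) {
    echo $word, ' @ ', $offset, PHP_EOL;
}
```

Both the occurrence inside the href attribute and the already linked word are skipped, leaving only the unlinked catsup. For markup that may be malformed or nested, a DOM parser is likely to be more robust than any regular expression.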