
I have a body of text stored as a string. It contains multiple substrings, each of which I want to replace with a part of itself. This is a typical example (note that the string contains several such substrings).

$String = "loads of text [[gibberish text|Text i Want]] more text  [[gibberish text|Text i Want]] more text [[if no separator then  just remove tags]]";

$String = deleteStringBetweenStrings("[[", "|", $String , true);

deleteStringBetweenStrings is a recursive function that deletes all text between the two substrings (including the substrings themselves). It does what I want for the first substring but goes a bit crazy after that.

function deleteStringBetweenStrings($beginning, $end, $string, $recursive)
{
    $beginningPos = strpos($string, $beginning);
    $endPos = strpos($string, $end);

    if ($beginningPos === false || $endPos === false)
    {
        return $string;
    }

    $textToDelete = substr($string, $beginningPos, ($endPos + strlen($end)) - $beginningPos);

    $string = str_replace($textToDelete, '', $string);

    if (strpos($string, $beginning) && strpos($string, $end) && $recursive == true)
    {
        $string = deleteStringBetweenStrings($beginning, $end, $string, $recursive);
    }
    return $string;
}

Is there a more efficient way for me to do this?

Expected output = "loads of text Text i Want more text Text i Want more text if no separator then just remove tags"

Dan Hastings

3 Answers


Regex and regex only....

Just use the below regex to match the text which you don't want and then replace it with an empty string.

(?<=\[\[)(?:(?!\]]|\|).)*\||\[\[|\]\]


Code:

<?php
$str = "loads of text [[gibberish text|Text i Want]] more text [[gibberish text|Text i Want]] more text [[if no separator then just remove tags]]";
echo preg_replace("/(?<=\[\[)(?:(?!\]]|\|).)*\||\[\[|\]\]/m", "", $str);
?>

Output:

loads of text Text i Want more text Text i Want more text if no separator then just remove tags

How did I figure it out?

  • (?<=\[\[) A positive lookbehind: the match must start immediately after [[.
  • (?:(?!\]]|\|).)* Matches any character, zero or more times, as long as it does not begin a ]] and is not a | symbol.
  • \| A literal | symbol. This ensures that the match contains a | before it reaches the closing brackets ]].
  • So the part of the regex explained so far matches only gibberish text| in a string like [[gibberish text|Text i Want]], and it won't touch [[if no separator then just remove tags]].
  • | OR
  • \[\[ Now match [[
  • | OR
  • \]\] Match ]]. Removing all the matched characters gives you the desired output.
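
The breakdown above can also be written into the pattern itself. The following is only a sketch, assuming PCRE's x (extended) modifier, which ignores whitespace and # comments inside the pattern; the shortened sample string and the variable names are made up for illustration:

```php
<?php
// Same three alternatives as the one-line regex, spread out and commented.
// With the x modifier, whitespace and "#" comments in the pattern are ignored.
$pattern = '/
    (?<=\[\[)            # lookbehind: we are right after an opening [[
    (?:(?!\]\]|\|).)*    # any characters, stopping before ]] or a pipe
    \|                   # the literal pipe separator
    |                    # OR
    \[\[                 # an opening [[
    |                    # OR
    \]\]                 # a closing ]]
/x';

// Illustrative input: one tag with a separator, one without.
$str = "loads of text [[gibberish text|Text i Want]] more text [[no separator here]]";
$result = preg_replace($pattern, '', $str);

echo $result, PHP_EOL; // loads of text Text i Want more text no separator here
```

The order of the alternatives does not matter here, because the first one can only match directly after a [[ that the second one has just consumed.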
Avinash Raj
  • Doesn't work with the tilde. The following code returned from the demo worked: '/(?<=\\[\\[)(?:(?!\\]]|\\|).)*\\||\\[\\[|\\]\\]/m'. This is really cool though. How did you figure this out? I have a bunch of other patterns that I could use preg_replace to fix. – Dan Hastings Nov 19 '14 at 10:47
  • OK, I changed the delimiter. – Avinash Raj Nov 19 '14 at 10:48

Something like this should do the trick (whilst preserving the ability to add your own start and end strings):

function deleteStringBetweenStrings($start, $end, $string) {
    // Escape the inputs (and the "/" delimiter) so they are safe to use in a regular expression.
    // ".*?" is non-greedy, so each match stops at the first $end that follows $start.
    $pattern = '/' . preg_quote($start, '/') . '.*?' . preg_quote($end, '/') . '/';
    // Replace every occurrence of this pattern in $string with an empty string.
    return preg_replace($pattern, '', $string);
}


$String = "loads of text [[gibberish text|Text i Want]] more text  [[gibberish text|Text i Want]] more text [[if no separator then  just remove tags]]";

$String = deleteStringBetweenStrings("[[", "|", $String);
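
Called with "[[" and "|", a function like this removes only the text from each "[[" up to and including the next "|", so the closing "]]" and any tag without a separator are left in place. One possible way to finish the job is sketched below, assuming a second pass with str_replace is acceptable; the helper name removeBetween and the short sample string are hypothetical and only mirror the function above:

```php
<?php
// Hypothetical helper mirroring the function above:
// removes $start, $end and everything between them (non-greedy).
function removeBetween(string $start, string $end, string $string): string {
    $pattern = '/' . preg_quote($start, '/') . '.*?' . preg_quote($end, '/') . '/s';
    return preg_replace($pattern, '', $string);
}

// Illustrative input: one tag with a separator, one without.
$text = "a [[junk|keep]] b [[plain]]";

// Pass 1: drop everything from "[[" through the "|" separator.
$step1 = removeBetween("[[", "|", $text);

// Pass 2: strip whatever brackets remain.
$step2 = str_replace(["[[", "]]"], "", $step1);

echo $step1, PHP_EOL; // a keep]] b [[plain]]
echo $step2, PHP_EOL; // a keep b plain
```

One caveat with this two-pass approach: if a tag without a separator comes before a tag with one, pass 1 matches from the first "[[" all the way to the later "|" and removes the text in between.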
RichardBernards
  • All the solutions have worked, but the changes you have made to the function have made my life so easy now. Not only has this solved the issue, but it's fixed so many other things with removing stuff from a string. I wish I understood this, but it works so I'm happy! Thank you :D – Dan Hastings Nov 19 '14 at 10:52
  • @DanHastings I have added some comments to the code which will probably help you understand it better. If you search Google for `php preg_replace example`, that will probably help too ;) – RichardBernards Nov 19 '14 at 10:56

Try this one:

$string = 'loads of text [[gibberish text|Text i Want]] more text  [[gibberish text|Text i Want]] more text [[if no separator then  just remove tags]]';

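function doReplace($matches) {
    // $matches[2] is the text between [[ and ]]
    $str = $matches[2];
    // Compare against false: strpos() returns 0 when | is the first character
    if (strpos($str, '|') !== false) {
        $parts = explode('|', $str);
        return $parts[1];
    } else {
        return $str;
    }
}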
echo preg_replace_callback('/(\[\[(.*?)\]\])/', 'doReplace', $string);

It echoes

loads of text Text i Want more text Text i Want more text if no separator then just remove tags

Which I think is exactly what you want!
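
The same idea can also be sketched without a callback. This is only an illustration, assuming tags never nest and the text inside a tag never contains a ] character; the variable names and the short sample string are made up:

```php
<?php
// [[ ... ]] with an optional "anything up to the first pipe" prefix.
// Only the part after that prefix (group 1) is kept.
$pattern = '/\[\[(?:[^|\]]*\|)?([^\]]*)\]\]/';

// Illustrative input: one tag with a separator, one without.
$input  = "a [[junk|keep]] b [[plain]]";
$output = preg_replace($pattern, '$1', $input);

echo $output, PHP_EOL; // a keep b plain
```

Unlike the explode() version, a tag with several separators such as [[a|b|c]] keeps everything after the first | (here b|c) rather than only the second part.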

motanelu