
I am new to regular expressions. I have text like this:

test{{this should not be selected and the curly brackets too}} but this one { or } should be selected. So I want to exclude all text between an opening and closing curly brackets.

and I want this result:

"test"

and

"but this one { or } should be selected. So I want to exclude all text between an opening and closing curly brackets."

This is the expression I used:

$p = '/[a-zA-Z0-9#\' ]+(?![^{{]*}})/';

But this also excludes the single curly brackets.
I want to know how to include the single curly brackets with the text and exclude only the text between double curly brackets.
Also, can you point me to some good documentation on regular expressions? I want to learn more about this.

mickmackusa
Mana

4 Answers

1
(?:^|(?:}}))(.+?)(?:$|{{)

Try it: https://regex101.com/r/2Xy7gU/1/
What is happening here:

  • (?:^|(?:}})) - the match starts at either the beginning of the string or }}
  • (.+?) - matches one or more characters, lazily (ungreedy)
  • (?:$|{{) - the match has to end at either the end of the string or {{

What you want (without the brackets) is in the first capture group.
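
A minimal sketch of how this pattern might be used from PHP, assuming a shortened version of the question's sample text. The variable names $text, $matches and $parts are illustrative only and not part of the answer.

```php
<?php
// Shortened sample text, assumed for illustration.
$text = 'test{{this should not be selected and the curly brackets too}} but this one { or } should be selected.';

// Each match runs from a start anchor (^ or }}) to an end anchor ($ or {{);
// capture group 1 holds only the text in between.
preg_match_all('/(?:^|(?:}}))(.+?)(?:$|{{)/', $text, $matches);

$parts = $matches[1];
var_export($parts);
```

For this input, $parts should hold 'test' and ' but this one { or } should be selected.', with the single curly brackets kept.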

ja2142
  • I want to exclude the two brackets too. Otherwise, is there any way to skip the text only when it sits between opening and closing brackets? I just tested your solution and it seems it won't match text when it finds two opening brackets. Try this text to see what I mean: "test{{this should not be selected and the curly brackets too}} but this o}}ne { or } should be selected. So I want to exclude all text between an {{opening and closing curly brackets." – Mana Jun 03 '17 at 15:00
  • My solution works if the brackets are matched (there is the same number of opening and closing ones) and if they aren't nested. If you want this to work for more complex grouping, you can search for {{ and }} and exclude matches that include said groups – ja2142 Jun 03 '17 at 15:33
  • Or you can just insert negative lookahead: `(?:^|(?:}}))((?:(?!}}).)+?)(?:$|{{)` https://regex101.com/r/44DDqO/1 Is this what you need? – ja2142 Jun 03 '17 at 21:01
1

Input (I doubled the string for effect):

$string = 'test{{this should not be selected and the curly brackets too}} but this one { or } should be selected. So I want to exclude all text between an opening and closing curly brackets. test{{this should not be selected and the curly brackets too}} but this one { or } should be selected. So I want to exclude all text between an opening and closing curly brackets.';

Method #1 preg_split():

var_export(preg_split('/{{[^}]*}}/', $string, 0, PREG_SPLIT_NO_EMPTY));
// Added the fourth param in case the input started/ended with a double curly substring.

Method #2 preg_match_all():

var_export(preg_match_all('/(?<=}{2}|^)(?!{{2}).*?(?={{2}|$)/s', $string, $out) ? $out[0] : []);

Output (either way):

array (
  0 => 'test',
  1 => ' but this one { or } should be selected. So I want to exclude all text between an opening and closing curly brackets. test',
  2 => ' but this one { or } should be selected. So I want to exclude all text between an opening and closing curly brackets.',
)

preg_split() treats the double curly wrapped substrings as "delimiters" and splits the full string on them.


The preg_match_all() pattern (Pattern Demo) uses a positive lookbehind and a positive lookahead, both of which look for double curlies or the start/end of the string. The negative lookahead in the middle stops a match from beginning directly at a double-curly block, for example when the string starts with one. Finally, the s modifier at the end of the pattern allows . to match newline characters.
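
To illustrate the edge case both methods guard against, here is a small sketch run on a string that begins with a double-curly block. The input string and the variable names are made up for this example; the two patterns are the ones shown above.

```php
<?php
// Hypothetical input that starts with a {{...}} block.
$input = '{{skip}}keep { this }{{skip}}tail';

// Without the flag, the leading delimiter produces an empty first element.
$withEmpty = preg_split('/{{[^}]*}}/', $input);

// PREG_SPLIT_NO_EMPTY drops that empty element.
$split = preg_split('/{{[^}]*}}/', $input, 0, PREG_SPLIT_NO_EMPTY);

// The negative lookahead (?!{{2}) keeps the match from starting at offset 0,
// where ^ succeeds but a {{...}} block follows immediately.
$matched = preg_match_all('/(?<=}{2}|^)(?!{{2}).*?(?={{2}|$)/s', $input, $out) ? $out[0] : [];

var_export([$withEmpty, $split, $matched]);
```

On this input, $split and $matched should both contain 'keep { this }' and 'tail', while $withEmpty carries an extra empty string at index 0.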

mickmackusa
  • Not sure she/he wants to join the different parts *(see the question)*. – Casimir et Hippolyte Jun 03 '17 at 15:19
  • @Mana Are you also happy with the preg_match_all version? You seem to really want that one. When you are testing your patterns, use https://regex101.com because it tells you when you have a syntax error and explains what your pattern is doing. http://www.regular-expressions.info/ is a good place to read up on things – mickmackusa Jun 03 '17 at 15:43
  • @Mana Beyond those websites, of course, there are many other sites that are highly valuable. To be perfectly honest, all of those manuals start to turn your brain to mush if you stare at them for too long; if you want regex to really sink in, practice with real situations. SO questions have done wonders for my understanding of regex. If you'd like me to recommend any insanely awesome SO users to follow around, start with: Wiktor Stribiżew and Casimir et Hippolyte . I've never, ever seen these guys get stumped and they know heaps of tricks. – mickmackusa Jun 03 '17 at 16:30
  • @Mana I've checked all of the currently posted patterns on this page, and while many of them are correct, none of them are more efficient (in terms of "steps"). Since your goal is to self-educate, you should try to understand them all. Other considerations when choosing a method are pattern brevity and output size. When using capture groups, the output array grows by at least 100%, so I would advise you to always seek a capture-less pattern when possible. – mickmackusa Jun 04 '17 at 00:42
0

Use preg_replace and replace all occurrences of \{\{[^\}]*\}\} with empty string.

Example: http://www.regextester.com/?fam=97777

Explanation:

\{      - {
\{      - {
[^\}]*  - everything except }
\}      - }
\}      - }
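
A minimal sketch of that replacement in PHP, assuming a shortened sample string; $text and $clean are illustrative names. Note that this removes the double-curly blocks and leaves one joined string, rather than separate pieces.

```php
<?php
// Shortened sample text, assumed for illustration.
$text = 'test{{hidden}} but this one { or } stays.';

// Replace every {{...}} block (with no } inside it) with an empty string.
$clean = preg_replace('/\{\{[^\}]*\}\}/', '', $text);

var_export($clean);
```

For this input, $clean should be 'test but this one { or } stays.'.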
Adam
  • I need to use preg_match_all and get the pieces of text that are not inside the double curly brackets. Is there any way to use negation with your solution? – Mana Jun 03 '17 at 14:39
0

2 options:

  • easy: just consider the blocks between {{ }} as a split pattern
    $validblocks = preg_split("/{{[\w .]+}}/", $str);
  • complicated: use named groups, capturing the rejected pattern first and then what remains:
    (?<novalid>{{[\w ]+}})|(?<valid>{|[\w .]*|})
    Handle the result however you like afterward. Example here: https://regex101.com/r/SK729o/2
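
One possible way to handle the result of the second option is sketched below: consecutive "valid" pieces are glued together and a "novalid" block closes the current piece. The input string, the $buffer variable and the grouping logic are assumptions made for this example, not part of the answer. Also note that this pattern only captures word characters, spaces, dots and single braces, so any other character (a comma, for instance) would be silently dropped.

```php
<?php
// Hypothetical short input.
$str = 'a{{b}}c { d }';
$pattern = '/(?<novalid>{{[\w ]+}})|(?<valid>{|[\w .]*|})/';

preg_match_all($pattern, $str, $matches, PREG_SET_ORDER);

$validblocks = [];
$buffer = '';
foreach ($matches as $m) {
    if (($m['novalid'] ?? '') !== '') {
        // A rejected {{...}} block ends the current run of valid text.
        if ($buffer !== '') {
            $validblocks[] = $buffer;
        }
        $buffer = '';
    } else {
        // Append the valid piece; empty matches add nothing.
        $buffer .= $m['valid'] ?? '';
    }
}
if ($buffer !== '') {
    $validblocks[] = $buffer;
}

var_export($validblocks);
```

For this input, $validblocks should contain 'a' and 'c { d }'.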
mquantin