
This is my first question on Stack Overflow, so please bear with me. Also, I am not a native English speaker.

(16.02.2022) ANSWER (https://regex101.com/r/4FRznK/1, from a comment on the answer below). Special thanks to Casimir et Hippolyte for your help! I wish I could reach out to you.

\[if \s+ (?<cond> [^]]* ) ]

(?<content> [^[]*+ 
        (?: (?R) [^[]*
          | \[ (?! /if] | else (?:if)?  \b) [^[]*
        )*+
)
(?<rest> 
    (?: \[elseif \s+ [^]]* ] \g<content> )*+
    (?: \[else] \g<content> )?+
    \[/if]
)

(15.02.2022) UPDATE: I fiddled around with the solutions presented below and got further with them. It seems there is a limit on the string length that can be matched comfortably without catastrophic backtracking.

I have updated my Regex101 to show the recent progress. Maybe one of you has an idea on how to tackle this. https://regex101.com/r/wYzA3e/4

SIDE NOTE: I do have a working function for this, but my goal is to find a faster and more reliable solution. My current function takes (in my opinion) way too long to complete the task and relies heavily on strpos to get there. I would rather not use third-party functions if there is a cheaper (in terms of performance) solution with PHP's internal functions. So if you advise me to use an alternative approach, please be kind and say explicitly which one you mean. Thank you!

(14.02.2022) Original: I am running into the following difficulty with my regular expression. This is the string ("true" and "false" are not actually in the string, but they simplify the example):

**[if true]**
    [if true]
        [if false]
        [else]
        [/if]
    *[elseif false]*
        [if true]
        [/if]
    [else]
    [/if]
**[elseif false]**
[/if]
**[if false]**
    [if false]
    [else]
    *[/if]*
**[else]**
    [if true]
    [/if]
[/if]

I marked the wanted matches (**) and the ones I got (*).

In this situation I only want to match the outermost parent [if XXX].([else]|[elseif XXX]|[/if]) statement with its corresponding end, which can be [else], [elseif XXX] or [/if]. For now I do not care about the inner [if XXX], since when the parent is false I don't need to check them.

When running my regex:

/\[if (.*?)\](((?R)|.)*?)(\[\/if\]|\[else\]|\[elseif )/gs 

it matches the parent [if XXX] and an incoherent combination of any [elseif XXX], [else] and [/if] inside it.

As groups I need: the whole match, the XXX of every [if XXX], the content between [if XXX] and the matching [END], and the [END] itself.
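For illustration only, the wanted result can be described without recursion by scanning the tags and counting nesting depth. The following is a minimal Python sketch, not the existing strpos-based function mentioned above. The names `TAG` and `outer_branches` are hypothetical, and it assumes the tags look exactly like `[if X]`, `[elseif X]`, `[else]` and `[/if]`.

```python
import re

# One flat (non-recursive) pattern for the four tag kinds.
# Group 1/2: "if" or "elseif" plus its condition, group 3: "else", group 4: "/if".
TAG = re.compile(r"\[(?:(if|elseif)\s+([^\]]*)|(else)|(/if))\]")

def outer_branches(text):
    """Return (opening tag, condition, content, end tag) for every branch
    of every outermost [if] block. Inner blocks stay inside the content."""
    branches = []
    depth = 0
    current = None  # (kind, cond, content_start) of the open outer branch
    for m in TAG.finditer(text):
        kind = m.group(1) or m.group(3) or m.group(4)
        cond = m.group(2)
        if kind == "if":
            depth += 1
            if depth == 1:
                current = ("if", cond, m.end())
        elif depth == 1:
            # elseif / else / /if at the outer level ends the current branch.
            open_kind, open_cond, start = current
            branches.append((open_kind, open_cond, text[start:m.start()], kind))
            if kind == "/if":
                depth = 0
                current = None
            else:
                current = (kind, cond, m.end())
        elif kind == "/if" and depth > 1:
            # Closing an inner block; inner else/elseif tags are ignored.
            depth -= 1
    return branches
```

The first tuple returned for each outer block corresponds to the match marked with ** in the example string: the outer [if XXX], its content, and the first end tag at the same level.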

Since I do not fully understand recursion, I'd appreciate your help. Many thanks in advance!

You can try the regex here (updated): https://regex101.com/r/wYzA3e/4

2 Answers


This might be close?
It also captures the outer closing tag, but I don't see how to avoid that without breaking the recursion.

\[(if|elseif|else) ?(.*?)\](((?R)|[^\[\]])*?)(?:.(?=\[else.*?\])|\[\/if\])

Test on regex101 here

LukStorms
  • Thank you for your answer. I will have a look into this tomorrow and let you know. Anyway. Thanks again. Great Community here! – Fastdesigner Feb 13 '22 at 20:22
  • Hey there. Just want to let you know, that I have tried your regex. Though it seems reliable it does not exactly match what I need in order to process the matches later on. I have updated my regex101 with the current state of development: https://regex101.com/r/wYzA3e/4 – Fastdesigner Feb 15 '22 at 09:45

When a pattern starts to be a little complicated, it's possible to use two features:

  • the verbose mode (x modifier)
  • references to subpatterns or, better, references to named subpatterns ( \g<name> )

Often with these two features things become clearer and the pattern is easier to build:

~
\[if \s+ [^]]* ]

(?<content> [^[]*+ (?: (?R) [^[]* )*+ )
(?: \[elseif \s+ [^]]* ] \g<content> )*+
(?: \[else] \g<content> )?+

\[/if]
~x

demo

Note that (?R) is nothing more than a reference to a subpattern, except that this time the subpattern is the whole pattern.
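One way to picture this: a subpattern reference works like a function call, and (?R) like a function calling itself. The sketch below hand-writes the grammar of the pattern above as two Python functions. It is only a rough illustration under assumptions: Python's built-in `re` module has no (?R), so the recursion is written out explicitly, and the names `match_block` and `match_content` are hypothetical.

```python
import re

OPEN = re.compile(r"\[if\s+[^\]]*\]")
ELSEIF = re.compile(r"\[elseif\s+[^\]]*\]")

def match_content(s, i):
    """content := text (block text)*   Returns the index after the content."""
    while True:
        j = s.find("[", i)           # skip plain text, like [^[]*
        if j == -1:
            return len(s)
        end = match_block(s, j)      # the recursive step, like (?R)
        if end is None:
            return j                 # this "[" does not start a block: stop here
        i = end

def match_block(s, i):
    """block := [if c] content ([elseif c] content)* ([else] content)? [/if]
    Returns the index after [/if], or None if no block starts at i."""
    m = OPEN.match(s, i)
    if not m:
        return None
    i = match_content(s, m.end())
    while True:
        m = ELSEIF.match(s, i)
        if not m:
            break
        i = match_content(s, m.end())
    if s.startswith("[else]", i):
        i = match_content(s, i + len("[else]"))
    if s.startswith("[/if]", i):
        return i + len("[/if]")
    return None
```

Like the possessive quantifiers in the pattern, these functions never go back to retry a shorter match, which is what keeps the work proportional to the input length.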

Casimir et Hippolyte
  • Thank you for your answer. I will evaluate this tomorrow, but it looks really promising! Just in case: any other idea how to tackle this problem efficiently? – Fastdesigner Feb 13 '22 at 20:14
  • I tried your Regex. First up: I did not know that I am able to use new lines for more clarity. Thank you! This works great except that it does not capture the first "Ending"-Condition and I cannot get rid of the space in first group (). I took your Example and ideas and experimented with group names. All is well except the -Tag ([else],[elseif or [/if]) should also match. here is my *new* demo [link](https://regex101.com/r/RoQu2z/1) – Fastdesigner Feb 13 '22 at 20:39
  • I have updated your regex to fix some (minor) matching issues and experimented with named groups. Even though it works most of the time, the backtracking of the current version seems to be overkill. Can you maybe have a look at the updated version here? https://regex101.com/r/wYzA3e/4 – Fastdesigner Feb 15 '22 at 09:48
  • @Fastdesigner: like that: https://regex101.com/r/4FRznK/1 – Casimir et Hippolyte Feb 15 '22 at 19:17
  • @Fastdesigner: catastrophic backtracking is only due to the changes you made (*"going farther"*). – Casimir et Hippolyte Feb 15 '22 at 19:25