
I have a regex pattern:

\(\s*\'\s*(.*?)\s*\'\)

This pattern captures any text between (' and '), as in ('TEXT').

The problem is that the text may contain HTML tags.

So I want a pattern that handles both cases. If there are no HTML tags, it should capture the text as usual; if there are HTML tags, it should capture only the text between the tags.


Example:

If the text is

('foo foo text here')

the pattern gets:

foo foo text here


And if the text is:

('<div class='test'> foo foo text here </div>')

the pattern gets

foo foo text here

So the pattern should ignore the HTML tags (if there are any) and grab only the text.

nhahtdh
Abdullah Raid

2 Answers


You can call strip_tags() on the input inside your preg_match() call. That turns:

('<div class='test'> foo foo text here </div>')

into:

(' foo foo text here ')

Your regex, as you designed it, then captures the text between (' and ') and trims the surrounding whitespace.

preg_match("/\(\s*\'\s*(.*?)\s*\'\)/", strip_tags($yourstring), $matches);
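
For illustration, the same idea can be wrapped in a small function and tried on both inputs from the question. This is only a sketch: the function name extractQuotedText is hypothetical and not part of the original answer, and it assumes the whole input can safely be passed through strip_tags().

```php
<?php
// Hypothetical helper for illustration; the name is not from the answer.
function extractQuotedText(string $input): ?string
{
    // Remove HTML tags first, so only bare text is left between (' and ').
    $plain = strip_tags($input);

    // Same pattern as in the question, written as a single-quoted PHP string.
    $pattern = '/\(\s*\'\s*(.*?)\s*\'\)/';

    // preg_match() returns 1 on a match; group 1 holds the trimmed text.
    if (preg_match($pattern, $plain, $matches) === 1) {
        return $matches[1];
    }

    return null; // no ('...') construct found
}

echo extractQuotedText("('foo foo text here')"), "\n";
echo extractQuotedText("('<div class='test'> foo foo text here </div>')"), "\n";
```

Both calls should print foo foo text here. Note that strip_tags() is applied to the entire string, so anything outside the ('...') part that looks like a tag is removed as well.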
Michael Berkowski

I believe this works as well:

>\s*(.*?)\s*</|\(\s*\'(?!<)\s*(.*?)\s*\'\)

It does capture into two different groups, though: the text lands in group 1 when it is wrapped in HTML tags and in group 2 when it is not.

At least it might be another option :-)
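
A rough sketch of how the two capture groups could be handled in PHP follows. The function name extractWithAlternation is hypothetical, and the ~ delimiter is chosen only because the pattern itself contains a slash.

```php
<?php
// Hypothetical helper for illustration; the name is not from the answer.
function extractWithAlternation(string $input): ?string
{
    // The pattern from this answer, with ~ as delimiter since it contains "/".
    $pattern = '~>\s*(.*?)\s*</|\(\s*\'(?!<)\s*(.*?)\s*\'\)~';

    if (preg_match($pattern, $input, $m) !== 1) {
        return null; // neither alternative matched
    }

    // Group 1 is filled by the HTML branch. When the plain branch matches
    // instead, group 1 is an empty string and group 2 holds the text.
    if (($m[1] ?? '') !== '') {
        return $m[1];
    }

    return $m[2] ?? null;
}

echo extractWithAlternation("('foo foo text here')"), "\n";
echo extractWithAlternation("('<div class='test'> foo foo text here </div>')"), "\n";
```

Both calls should print foo foo text here. One caveat: the first alternative matches any >...</ sequence, so it is not tied to the surrounding (' and '). PCRE's branch reset group, (?|...), is another way to let both alternatives share group 1.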

Nathan Fox