I'm trying to capture a string that is like this:

document.all._NameOfTag_ != null ;

How can I capture the substring:

document.all._NameOfTag_

and the tag name:

_NameOfTag_

My attempt so far:

if($_line_ =~ m/document\.all\.(.*?).*/)
{

}

But it's always greedy and captures _NameOfTag_ != null.

Tim Pietzcker
user63898

3 Answers

The lazy (.*?) will always match the empty string, because the greedy .* that follows it will always match the rest of the line.

You need to be more specific:

if($_line_ =~ m/document\.all\.(\w+)/)

will only match word characters (letters, digits and underscores) after document.all.
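
The question asks for both the whole reference and the bare tag name. One way to get both from a single match is to nest the capture groups. This is only a sketch: the variable names $line, $full and $tag are made up for the example, and the input is the sample line from the question.

```perl
use strict;
use warnings;

# Sample input from the question; $line is a stand-in name for this sketch.
my $line = 'document.all._NameOfTag_ != null ;';

my ($full, $tag);

# Outer group ($1) = the whole reference, inner group ($2) = the tag name only.
if ($line =~ m/(document\.all\.(\w+))/) {
    ($full, $tag) = ($1, $2);
    print "full: $full\n";
    print "tag:  $tag\n";
}
```

Groups are numbered by the position of their opening parenthesis, so the outer group is $1 and the inner one is $2.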

Tim Pietzcker

Your problem is the lazy quantifier. A lazy quantifier always first tries to hand matching over to the next component in the regex, and consumes text only if that next component fails to match.

But here your next component is .*, and .* matches everything until the end of your input.
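
To see the effect, here is a small sketch comparing the original pattern with one where the lazy group is followed by something concrete. The lookahead (?=\W|\z) is just one possible delimiter picked for the illustration, and the variable names are hypothetical.

```perl
use strict;
use warnings;

my $line = 'document.all._NameOfTag_ != null ;';

# Lazy group followed by a greedy .*: the .* succeeds immediately,
# so the lazy group never has to consume anything.
my ($lazy) = $line =~ m/document\.all\.(.*?).*/;

# Lazy group followed by a concrete boundary (a non-word character or
# the end of the string): the group must grow until that boundary matches.
my ($bounded) = $line =~ m/document\.all\.(.*?)(?=\W|\z)/;

printf "lazy:    '%s'\n", $lazy;
printf "bounded: '%s'\n", $bounded;
```

With the sample line, the first capture is the empty string and the second is the tag name.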

Use this instead:

if ($_line_ =~ m/document\.all\.(\w+)/)

And also note that it is NOT required that all the text be matched. A regex needs only to match what it has to match, and nothing else.

fge

Try the following instead; personally, I find it much clearer:

document\.all\.([^ ]*)
Chris Seymour
  • +1, although that would fail if there was no whitespace between the tag name and the comparison operator (which would be legal). – Tim Pietzcker Jan 14 '13 at 12:52
  • @TimPietzcker Ahh yes, true, which brings us nicely to the point that it seems the OP is trying to parse a non-regular language. – Chris Seymour Jan 14 '13 at 12:54
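
The whitespace caveat from the comments can be checked with a short sketch. The helper names tag_by_space and tag_by_word and the compact input line are invented for this illustration.

```perl
use strict;
use warnings;

# Capture everything up to the next space (the [^ ]* approach).
sub tag_by_space {
    my ($s) = @_;
    my ($t) = $s =~ m/document\.all\.([^ ]*)/;
    return $t;
}

# Capture only word characters (the \w+ approach).
sub tag_by_word {
    my ($s) = @_;
    my ($t) = $s =~ m/document\.all\.(\w+)/;
    return $t;
}

my $spaced  = 'document.all._NameOfTag_ != null ;';
my $compact = 'document.all._NameOfTag_!=null;';   # no whitespace before the operator

for my $s ($spaced, $compact) {
    printf "%-36s space: %-20s word: %s\n", $s, tag_by_space($s), tag_by_word($s);
}
```

On the compact line, [^ ]* runs on through the operator and captures _NameOfTag_!=null; while \w+ still stops at the end of the tag name.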