Questions tagged [regex-greedy]

The greedy regex property causes the regex engine to repeat a regex token as often as possible. Only if that causes the entire regex to fail does the engine give up the last iteration and proceed with the remainder of the regex, backtracking one iteration at a time. The greedy regex tokens are `+`, `*`, `?` and repetition using curly braces.

Example of Greediness

When using a regex to match an HTML tag, the regular expression does not need to exclude invalid uses of angle brackets. For this example, an HTML tag is anything between angle brackets.

If the test string is the following:

This is a <EM>first</EM> test.

With the pattern <.+>, the expected match would be <EM>, followed by </EM> when the search continues after that match. But the regex will actually match <EM>first</EM>.

The reason is that the plus is a greedy token.
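The behaviour described above can be reproduced with a short sketch. Python's `re` module is used here only as an assumption for illustration, since the tag description names no particular engine. The lazy form `<.+?>` and the negated character class `<[^>]+>` are not part of the description above; they are shown as two commonly used ways to get the shorter matches.

```python
import re

text = "This is a <EM>first</EM> test."

# Greedy: .+ first consumes the rest of the line, then backtracks
# until the final '>' in the string can be matched.
greedy = re.findall(r"<.+>", text)

# Lazy: .+? consumes one character at a time and stops at the first '>'.
lazy = re.findall(r"<.+?>", text)

# Negated character class: [^>]+ can never run past a '>',
# so the quantifier stays greedy but the match stays short.
negated = re.findall(r"<[^>]+>", text)

print(greedy)   # ['<EM>first</EM>']
print(lazy)     # ['<EM>', '</EM>']
print(negated)  # ['<EM>', '</EM>']
```

The negated class is often preferred when the delimiter is a single character, because it does not depend on backtracking to find the end of the tag.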

969 questions
4
votes
2 answers

Regular expression in regards to question mark "lazy" mode

I understand the ? mark here means "lazy". My question is essentially: [0-9]{2}? vs [0-9]{2}. Are they the same? If so, why would we write the former expression? Isn't lazy mode more expensive performance-wise? If not, can you tell me the difference?
KTrover
  • 43
  • 1
  • 4
4
votes
6 answers

How can I fix my regex to not match too much with a greedy quantifier?

I have the following line: "14:48 say;0ed673079715c343281355c2a1fde843;2;laka;hello ;)" I parse this by using a simple regexp: if($line =~ /(\d+:\d+)\ssay;(.*);(.*);(.*);(.*)/) { my($ts, $hash, $pid, $handle, $quote) = ($1, $2, $3, $4,…
Lasse A Karlsen
  • 801
  • 5
  • 14
  • 23
4
votes
3 answers

Regular Expression, greedy over |

For some weeks I've been working with regular expressions in PHP. Now my question: is there any way to make the regex greedy over | ? For example, subject: 012345abcdefghijklm pattern: /(abcde|abcdefghi)/ will extract abcde, although abcdefghi is the…
guest1
  • 53
  • 4
4
votes
1 answer

Backtrack in Possessive Quantifier

Earlier I posted a question about a regex which resulted in a stack overflow error in Java. My regex was greedy, and many commenters suggested using a possessive quantifier. So I started learning about possessive quantifiers in regex. I tried to match string…
Krishna M
  • 1,135
  • 2
  • 16
  • 32
4
votes
2 answers

Using an external regex library from AWK

My question is inspired by an interesting question somebody asked at http://tex.stackexchange.com and my attempt to provide the AWK solution. Note AWK here means NAWK since as we know gawk != awk. I am reproducing a bit of that answer here. Original…
4
votes
4 answers

Regular expression greedy match not working as expected

I have a very basic regular expression that I just can't figure out why it's not working so the question is two parts. Why does my current version not work and what is the correct expression. Rules are pretty simple: Must have minimum 3…
Kelsey
  • 47,246
  • 16
  • 124
  • 162
4
votes
1 answer

Regex {} parsing

Hello I have a problem with parsing this text { { { {[system1];1;1;0.612509325}; {[system2];1;1;0.977659115}; {[system3];1;1;36.97828732969}; {[system4];1;1;61.43154423} };2.5469 …
Glock
  • 63
  • 4
3
votes
3 answers

Perl regular expression isn't greedy enough

I'm writing a regular expression in perl to match perl code that starts the definition of a perl subroutine. Here's my regular expression: my $regex = '\s*sub\s+([a-zA-Z_]\w*)(\s*#.*\n)*\s*\{'; $regex matches code that starts a subroutine. I'm also…
David Levner
  • 341
  • 1
  • 8
3
votes
4 answers

Can't get Perl regex to be non-greedy

My regex matches the last set of alpha characters in the line, regardless of what I do. I want it to match only the first occurrence. I have tried using the non-greedy operator, but it stubbornly matches the right-most set of alpha characters, in…
3
votes
1 answer

Regex to individuate entries separate by specific character

I'm using Google Sheets and attempting to individuate the entries separated by [char10] > Sample content from a cell: > low confidence registrar [char10] > No SSL certificate [char10] > Malicious intent [char10] > Complete sentence #4…
3
votes
4 answers

How do I find the last set of digits in a string

So let's say I have a string "Happy 2022 New 01 years!" I'm looking to return the "01". To be more specific, I need the last set of digits in the string. This number could just be '1', or '10', or '999'... The string otherwise could be pretty much…
3
votes
5 answers

sed and Perl regexp replaces once, with multiple replacements flag

I have the string: lopy,lopy1,sym,lopy,lopy1,sym" I want the line to be: lopy,lopy1,sym,lady,lady1,sym Which means that all "lad" after the string sym should be replaced. So I ran: echo "lopy,lopy1,sym,lopy,lopy1,sym" | sed -r…
user1134991
  • 3,003
  • 2
  • 25
  • 35
3
votes
3 answers

regex for allowing alphanumeric, special characters and not ending with @ or _ or

I am new to regex. I created the regex below, which allows alphanumeric characters and 3 special characters @._ but the string should not end with @ or . or * ^[a-zA-Z0-9._@]*[^_][^.][^@]$ It validates abc@ but fails for abc.
3
votes
1 answer

How do greedy / lazy (non-greedy) / possessive quantifiers work internally?

I noticed that there are 3 different classes of quantifiers: greedy, lazy (i.e. non-greedy) and possessive. I know that, loosely speaking, greedy quantifiers try to get the longest match by first reading in the entire input string and then truncate…
J-A-S
  • 368
  • 1
  • 8
3
votes
1 answer

R: Match an odd number of repetitions

I would like to match a string like \code, but not when the backslash is escaped. I think that one way of doing this could be matching an odd number of backslashes. Then for example, assuming \code is an expression to be replaced by 1234: \code…
antonio
  • 10,629
  • 13
  • 68
  • 136