Questions tagged [regex-greedy]

The greedy regex property causes the regex engine to repeat a regex token as often as possible. Only if that causes the entire regex to fail does the engine give up the last iteration and proceed with the remainder of the regex. The greedy regex tokens are `+`, `*`, `?`, and repetition using curly braces.

Example of Greediness

When using a regex to match an HTML tag, the regular expression does not need to exclude invalid uses of angle brackets. An HTML tag is taken to be anything between angle brackets.

If the test string is the following:

This is a <EM>first</EM> test.

With the pattern <.+>, the expected match would be <EM>, followed by </EM> when continuing the search. But the regex actually matches <EM>first</EM>.

The reason is that the plus is a greedy token.
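The behaviour described above can be reproduced with a short sketch. It uses Python's `re` module, which is a choice made here for illustration, and the lazy pattern `<.+?>` and the negated character class `<[^>]+>` are common alternatives added for comparison rather than part of the original example.

```python
import re

# Test string from the example above.
text = "This is a <EM>first</EM> test."

# Greedy: .+ runs to the end of the string, then backtracks
# only as far as the last '>' it can find.
greedy = re.findall(r"<.+>", text)

# Lazy (assumed alternative): .+? expands one character at a time
# and stops at the first '>' that lets the match succeed.
lazy = re.findall(r"<.+?>", text)

# Negated class (assumed alternative): [^>]+ can never run past a '>'.
negated = re.findall(r"<[^>]+>", text)

print(greedy)   # ['<EM>first</EM>']
print(lazy)     # ['<EM>', '</EM>']
print(negated)  # ['<EM>', '</EM>']
```

The negated class usually needs less backtracking than the lazy quantifier, but either one yields the two separate tags for this input.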

969 questions
4
votes
1 answer

RegEx for matching words only formed with a list of letters

Given a set of words, I need to know which words are formed only by a set of letters. This word can not have more letters than allowed, even if this letter is part of the verification set. Example: Char set: a, a, ã, c, e, l, m, m, m, o, o, o, o, t…
4
votes
2 answers

How to extract text between certain patterns using regular expression (RegEx)?

My text: 27/07/18, 12:02 PM - user_a: https://www.youtube.com/ Watch this 27/07/18, 12:15 PM - user_b: 27/07/18, 12:52 PM - user_b: Read this fully some text some text . some text 27/07/18, 12:56 PM - user_c: text .. Here I want to…
Kalsi
  • 579
  • 5
  • 13
4
votes
3 answers

Two greedy quantifiers in the same regex

If I have an unknown string of the structure: "stuff I don't care about THING different stuff I don't care about THING ... THING even more stuff I don't care about THING stuff I care about" I want to capture the "stuff I care about" which will…
noah
  • 2,616
  • 13
  • 27
4
votes
4 answers

Can't make non-greedy match work

In Python3.4, I'm using the re library (the regex library gives the same result), and I'm getting a result I don't expect. I have a string s = 'abc'. I would expect the following regex: re.match(r"^(.*?)(b?)(.*?)$", s).groups() ..to match with…
Mike Maxwell
  • 547
  • 4
  • 11
4
votes
2 answers

Practical use of possessive quantifiers regex

I understand .* (greedy quantifier) backtracks and tries to find a match, and .*+ (possessive quantifier) does not backtrack. However I have been using .* and .*? frequently but don't know when to use .*+. Can somebody give a situation or an…
Arun Gowda
  • 2,721
  • 5
  • 29
  • 50
4
votes
1 answer

Does an atomic quantified group mean the same as a quantified atomic group?

I was looking at this answer to this question: Regex nested parentheses, and was thinking that instead of a quantified atomic group (?> list | of | alternates )* it should have been an atomic quantified group (?> (?: list | of | alternates )* ). Am…
Adrian
  • 10,246
  • 4
  • 44
  • 110
4
votes
2 answers

Non-greedy (lazy) matching using regex?

How do you implement non-greedy matching in Stata using regex? Or does Stata even have this capability? I want to extract all text that occurs between a hashtag "#" and a period ".". Example code: clear set obs 3 generate…
user812783765
  • 165
  • 1
  • 1
  • 7
4
votes
4 answers

Finding all matches with a regular expression - greedy and non greedy!

Take the following string: "Marketing and Cricket on the Internet". I would like to find all the possible matches for "Ma" -any text- "et" using a regex. So.. Market Marketing and Cricket Marketing and Cricket on the Internet The regex Ma.*et…
Rastaboy
  • 75
  • 1
  • 6
4
votes
1 answer

Regex for matching a string literal in Java?

I have an array of regular expressions strings. One of them must match any strings found in a given java file. This is the regex string I have so far: "(\").*[^\"].*(\")" However, the string "Hello\"good day" is rejected even though the quotation…
pythonbeginner4556
  • 313
  • 1
  • 5
  • 14
4
votes
2 answers

Understanding what makes this regexp so slow

I have a regexp: import re regexp = re.compile(r'^(?P<parts>(?:[\w-]+/?)+)/$') It matches a string like foo/bar/baz/ and puts the foo/bar/baz in a group named parts (the /? combined with the /$ support this). This works perfectly fine, until you…
orokusaki
  • 55,146
  • 59
  • 179
  • 257
4
votes
2 answers

Regex for getting all digits in a string after a character

I am trying to parse the following string and return all digits after the last square bracket: C9: Title of object (foo, bar) [ch1, CH12,c03,4] So the result should be: 1,12,03,4 The string and digits will change. The important thing is to get…
4
votes
2 answers

Smallest possible match / nongreedy regex search

I first thought that this answer would totally solve my issue, but it did not. I have a string url like this one: http://www.someurl.com/some-text-1-0-1-0-some-other-text.htm#id_76 I would like to extract some-other-text so basically, I come with the…
Delgan
  • 18,571
  • 11
  • 90
  • 141
4
votes
3 answers

Regex to avoid data duplication in delimited string?

I am trying to validate data that will be a string value delimited by commas. What I want is to validate that there is no repetition of the same value within the string. Ex. my value would be. data1 =…
user2745246
  • 304
  • 1
  • 3
  • 14
4
votes
1 answer

How to apply lazy quantifier in this given scenario?

I am trying to match an occurrence with the regex: to(.*?) CITY[\d] against John from beautiful CITY1 in sdfsf to dsfs in sf to abc CITY2 to CITY3 for 3 days I get two matches : to dsfs in sf to abc CITY2 to CITY3 My problem is that I want a…
RVP
  • 43
  • 3
4
votes
4 answers

A multi-line, variedly greedy, regular expression

Given the following text, what PCRE regular expression would you use to extract the parts marked in bold? 00:20314 lorem ipsum want this kryptonite 00:02314 quux padding dont want this 00:03124 foo neither this 00:01324 foo but…
l0_0
  • 43
  • 4