Questions tagged [non-greedy]

A technique used in regular expressions that makes a quantifier match as little text as possible while still allowing the rest of the regex to match. It is enabled by appending the "?" operator to a wildcard operation (for example, ".*" becomes ".*?").

A regex is used to check if a string matches a certain pattern. Most regexes offer additional functionality to capture interesting parts of the string.

Example:

Say we have the following regular expression:

^(.*)([ab]+)$

The regex specifies a pattern: strings can start with any sequence of arbitrary characters, but should end with at least one a or b.
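As a quick check of that description, the sketch below tries the pattern against a few strings using Python's `re` module. The sample strings are hypothetical and were chosen only for illustration.

```python
import re

# The pattern from the example above.
pattern = re.compile(r"^(.*)([ab]+)$")

# Hypothetical sample inputs, not taken from the tag description.
samples = ["foobar", "fooba", "ab", "", "xyz"]

# Map each sample to whether the pattern matches it.
results = {s: pattern.match(s) is not None for s in samples}

for s, ok in results.items():
    print(repr(s), "matches" if ok else "does not match")
```

Only the strings whose last character is an a or a b match; the empty string fails because the second group needs at least one character.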

Wildcard operations are by default greedy. This means that the first group will aim to capture as much as possible (without losing the match) and only give up the remainder of the string if this is the only way to match the string with the pattern.

For instance, the string foobaraabbabbababab will be captured as (foobaraabbabbababa)(b). If we are more interested in the ([ab]+) group, we can apply a non-greedy operator to the first group so that the remainder of the string is passed to the second group as soon as possible.

Suppose we use the following pattern instead:

^(.*?)([ab]+)$

The example will be matched as (foobar)(aabbabbababab)
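The difference between the two patterns can be reproduced with a minimal sketch in Python's `re` module. The choice of Python is an assumption made for illustration; other backtracking regex engines with lazy quantifiers should behave the same way for this pattern.

```python
import re

text = "foobaraabbabbababab"

# Greedy: the first group takes as much as it can and leaves
# only the single character the second group needs.
greedy = re.match(r"^(.*)([ab]+)$", text)

# Non-greedy: the first group stops as soon as the rest of the
# string can be matched by ([ab]+)$.
lazy = re.match(r"^(.*?)([ab]+)$", text)

print(greedy.groups())  # ('foobaraabbabbababa', 'b')
print(lazy.groups())    # ('foobar', 'aabbabbababab')
```

Note that the lazy first group does not stop at the first b in "foobar": the r that follows cannot be matched by [ab], so the engine keeps extending the first group until the remainder consists only of a and b characters.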

188 questions
4
votes
2 answers

Smallest possible match / nongreedy regex search

I first thought that this answer would totally solve my issue, but it did not. I have a string url like this one: http://www.someurl.com/some-text-1-0-1-0-some-other-text.htm#id_76 I would like to extract some-other-text so basically, I come with the…
Delgan
  • 18,571
  • 11
  • 90
  • 141
4
votes
1 answer

Non-greedy list parsing with pyparsing

I have a string consisting of a list of words which I am attempting to parse with pyparsing. The list always has a minimum of three items. From this I want pyparsing to generate three groups, the first of which contains all of the words up to the…
Jonathan Barber
  • 871
  • 1
  • 7
  • 10
4
votes
2 answers

Regular expression in regards to question mark "lazy" mode

I understand the ? mark here means "lazy". My question essentially is [0-9]{2}? vs [0-9]{2}. Are they the same? If so, why are we writing the former expression? Isn't lazy mode more expensive performance-wise? If not, can you tell the difference?
KTrover
  • 43
  • 1
  • 4
4
votes
2 answers

Making range of characters not greedy in Regex

I have a list of messages where I'm searching in that message for 4 or 3 digit number, I then replace it with that number. So my current Regex is Find (.*)([0-9]{3,4})(.*)\r Replace \2 However, the problem is with [0-9]{3,4} only takes the first…
4
votes
2 answers

Regex to match multiline string start with x, ends with y and contains z but not x in the middle

Better explain with an example. This is text:
  • hello THE WORDS
  • cruel
  • world THE WORDS
I want to find strings start with and ends with and contains THE WORDS. I am expecting to only match with hello…
previous_developer
  • 10,579
  • 6
  • 41
  • 66
4
votes
5 answers

Extract data between square brackets "[]" using Perl

I was using a regex for extracting data from curved brackets (or "parentheses") like extracting a,b from (a,b) as shown below. I have a file in which every line will be like this is the range of values (a1,b1) and [b1|a1] this is the range of…
Naidu
  • 139
  • 1
  • 4
  • 13
4
votes
5 answers

Why is this non-greedy regex grabbing more than I want?

I would think this should return "state,country" but it's returning "country" System.out.println("city,state,country".replaceAll("(.*,)?", "")); Why is it working this way, and how do I make it return "state,country". I want this answer as a…
Daniel Kaplan
  • 62,768
  • 50
  • 234
  • 356
4
votes
1 answer

Python regex speed - Greedy vs. non-greedy

I am making several regex substitutions in Python along the lines of \w\s+\w over many large documents. Obviously if I make the regex non-greedy (with a ?) it won't change what it matches (as \w != \s) but will it make the code run any…
Barry
  • 167
  • 1
  • 10
3
votes
4 answers

Can't get Perl regex to be non-greedy

My regex matches the last set of alpha characters in the line, regardless of what I do. I want it to match only the first occurrence. I have tried using the non-greedy operator, but it stubbornly matches the right-most set of alpha characters, in…
3
votes
1 answer

How do greedy / lazy (non-greedy) / possessive quantifiers work internally?

I noticed that there are 3 different classes of quantifiers: greedy, lazy (i.e. non-greedy) and possessive. I know that, loosely speaking, greedy quantifiers try to get the longest match by first reading in the entire input string and then truncate…
J-A-S
  • 368
  • 1
  • 8
3
votes
3 answers

Use sed (or similar) to remove anything between repeating patterns

I'm essentially trying to "tidy" a lot of data in a CSV. I don't need any of the information that's in "quotes". Tried sed 's/".*"/""/' but it removes the commas if there's more than one section together. I would like to get from…
materangai
  • 109
  • 1
  • 6
3
votes
3 answers

Regex to non-greedily match across multiple lines up to a line that starts with a specific string

I am going to answer this myself, but this was giving me fits all day and although it is explained elsewhere, I thought I'd post it with my solution. I came across a situation where I needed to replace some text spanning multiple lines. It wasn't…
Stonecraft
  • 860
  • 1
  • 12
  • 30
3
votes
1 answer

Prevent non-greedy part from consuming the following optional part

I have a regex with a mandatory part, a non-greedy (lazy?) part, an optional part and finally another non-greedy part. Implemented as: ^mandatory.*?(:?optionalpart)?.*?$ The optionalpart consists of 'a…
Mark Jeronimus
  • 9,278
  • 3
  • 37
  • 50
3
votes
2 answers

Python regex non-greedy acting like greedy

I am working with transcripts and having trouble with matching patterns in non-greedy fashion. It is still grabbing way too much and looks like doing greedy matches. A transcript looks like this: >> John doe: Hello, I am John Doe. >> Hello, I am…
ybcha204
  • 91
  • 3
3
votes
3 answers

Regex lazy on non-capturing group

I have this regex: (?:(?:AND\sNOT|AND|OR)(?!.*(?:AND\sNOT|AND|OR))\s)(.*) What I want is to get the last key:value pair, example - k:v AND k1:v1 AND NOT k2:v2 OR k3:v3 I want the regex to match k3:v3, and it does, but it doesn't match the…
Nadav Shabtai
  • 687
  • 1
  • 7
  • 16