Questions tagged [regex-group]

Regex groups are created by placing part of a regular expression inside parentheses. Groups let you apply a quantifier to the entire group or restrict alternation to part of the regex. Besides grouping part of a regular expression together, parentheses also create a numbered capturing group. A capturing group stores the part of the string matched by the part of the regular expression inside the parentheses.
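As a minimal sketch of the two uses of grouping described above, the following Python snippet applies a quantifier to a whole group and restricts alternation to part of a pattern. The patterns (ab)+ and gr(a|e)y and the variable names are illustrative assumptions, not taken from the tag description, and other regex flavors may differ in detail.

```python
import re

# Quantifier applied to a whole group: one or more repetitions of "ab".
repeated = re.compile(r"(ab)+")

# Alternation restricted to part of the regex: matches "gray" or "grey".
alternation = re.compile(r"gr(a|e)y")

print(bool(repeated.fullmatch("ababab")))      # True
print(bool(repeated.fullmatch("aba")))         # False
print(alternation.fullmatch("grey").group(1))  # e
```

Without the parentheses, ab+ would repeat only the b, and gra|ey would match either "gra" or "ey".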

The regex Set(Value)? matches Set or SetValue. In the first case, the first (and only) capturing group remains empty. In the second case, the first capturing group matches Value.
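The behavior can be checked with a short sketch in Python (chosen here only for illustration; the variable names are hypothetical). Note that Python reports a group that did not participate in the match as None rather than as an empty string; other flavors may report it differently.

```python
import re

pattern = re.compile(r"Set(Value)?")

m_short = pattern.fullmatch("Set")
m_long = pattern.fullmatch("SetValue")

# group(0) is the whole match; group(1) is the first capturing group.
print(m_short.group(0), m_short.group(1))  # Set None
print(m_long.group(0), m_long.group(1))    # SetValue Value
```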

If capturing the match isn't needed, the regular expression can be optimized into Set(?:Value)?. The question mark and the colon after the opening parenthesis are the syntax that creates a non-capturing group.
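A small Python sketch, under the assumption that Python's re module is a fair stand-in for other flavors here, shows that the non-capturing version matches the same strings but creates no numbered group. The variable names are illustrative.

```python
import re

capturing = re.compile(r"Set(Value)?")
non_capturing = re.compile(r"Set(?:Value)?")

# Pattern.groups is the number of capturing groups in the pattern.
print(capturing.groups)      # 1
print(non_capturing.groups)  # 0

# Both patterns accept the same input.
print(non_capturing.fullmatch("SetValue").group(0))  # SetValue
```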

The question mark after the opening parenthesis is unrelated to the question mark at the end of the regex. The final question mark is the quantifier that makes the previous token optional. This quantifier cannot appear after an opening parenthesis, because there is nothing to be made optional at the start of a group. Therefore, there is no ambiguity between the question mark as an operator to make a token optional and the question mark as part of the syntax for non-capturing groups.
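The following sketch illustrates the point in Python, where a question mark directly after an opening parenthesis is always read as the start of special group syntax and never as a quantifier. The helper compiles is a hypothetical name introduced for this example; error behavior in other regex flavors is not shown and may differ.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# "(?:" starts a non-capturing group; the trailing "?" is the quantifier.
print(compiles(r"Set(?:Value)?"))  # True

# "(?V" is not valid group syntax, and "?" is not treated as a quantifier here.
print(compiles(r"Set(?Value)"))    # False
```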

2670 questions
7
votes
5 answers

Find out which group matches in Java regex without linear search?

I have a huge, programmatically assembled regex, like this: (A)|(B)|(C)|... Each sub-pattern is in its own capturing group. When I get a match, how do I figure out which group matched without linearly testing each group(i) to see it returns a non-null…
Fortepianissimo
  • 3,317
  • 5
  • 21
  • 15
6
votes
1 answer

What is the position of an unmatched group in C++?

Let m be of type std::smatch . Suppose there is an unmatched group i. What is m.position(i) ? For that matter, what is m[i]? For example, consider std::regex re {"^(a+)|(b+)"}; string target="aa"; std::smatch…
kdog
  • 1,583
  • 16
  • 28
6
votes
3 answers

replace number greater than 5 digits in a text

a <- c("this is a number 9999333333 and i got 12344") How could I replace any number longer than 5 digits, with the extra digits being "X"? Expected output: "this is a number 99993XXXXX and i got 12344" Code I tried: gsub("(.{5}).*", "X", a)
prog
  • 1,073
  • 5
  • 17
6
votes
2 answers

Remove spaces between single character in string

I was trying to remove duplicate words from a string in scala. I wrote a udf(code below) to remove duplicate words from string: val de_duplicate: UserDefinedFunction = udf ((value: String) => { if(value == "" | value == null){""} else…
Vaibhav
  • 338
  • 2
  • 13
6
votes
2 answers

Regex and proper capture using .matches .Concat in C#

I have the following regex: @"{thing:(?:((\w)\2*)([^}]*?))+}" I'm using it to find matches within a string: MatchCollection matches = regex.Matches(string); IEnumerable formatTokens = matches[0].Groups[3].Captures …
6
votes
1 answer

Why does .NET's regex engine behave so bizarrely when I omit the "else" from a conditional group?

Code: Match match = Regex.Match("abc", "(?(x)bx)"); Console.WriteLine("Success: {0}", match.Success); Console.WriteLine("Value: \"{0}\"", match.Value); Console.WriteLine("Index: {0}", match.Index); Output: Success: True Value: "" Index: 1 It seems…
Kendall Frey
  • 43,130
  • 20
  • 110
  • 148
6
votes
1 answer

Regex parsing from delimited string with sequential groups

I'm trying to parse out words from a delimited string, and have the capture groups in sequential order. for example dog.cat.chicken.horse.whale I know of ([^.]+) which can parse out each word but this puts every string in capture group 1. Match…
6
votes
3 answers

Extract groups matched regex to array in scala

I got this problem. I have a val line:String = "PE018201804527901" that matches with this regex : (.{2})(.{4})(.{9})(.{2}) I need to extract each group from the regex to an Array. The result would be: Array["PE", "0182","018045279","01"] I…
Will
  • 145
  • 2
  • 8
6
votes
2 answers

Regex not capturing matching in expected groups

I have been working on a requirement and I need to create a regex for the following string: startDate:[2016-10-12T12:23:23Z:2016-10-12T12:23:23Z] There can be many variations of this string as…
Vishal
  • 666
  • 1
  • 8
  • 30
6
votes
2 answers

Why sed doesn't print an optional group?

I have two strings, say foo_bar and foo_abc_bar. I would like to match both of them, and if the first one is matched I would like to emphasize it with = sign. So, my guess was: echo 'foo_abc_bar' | sed -r 's/(foo).*(abc)?.*(bar)/\1=\2=\3/g' >…
static
  • 8,126
  • 15
  • 63
  • 89
5
votes
4 answers

Regex with ? for a set of words

I want to have a regex for NAME;NAME;NAME and also for NAME;NAME;NAME;NAME where the fourth occurrence of NAME is optional. I have one regex as (.+);(.+);(.+) which matched the first pattern but not the second. I tried playing with ? but its not…
Sonali Gupta
  • 494
  • 1
  • 5
  • 20
5
votes
3 answers

Why does perl regex with /mg modifier match past end-of-line?

This is related to perl multiline regex to separate comments within paragraphs, but focuses exclusively on a single question of regex syntax. According to perlre: Modifiers, the /m regex modifier means Treat the string being matched against as…
Jacob Wegelin
  • 1,304
  • 11
  • 16
5
votes
1 answer

Regex Captures in Java like in C#

I have to rewrite a part of an existing C#/.NET program using Java. I'm not that fluent in Java and am missing something when handling regular expressions, and just wanted to know if I'm missing something or if Java just doesn't provide such a feature. I…
signpainter
  • 720
  • 1
  • 7
  • 22
5
votes
5 answers

Regular expressions in swift

I'm a bit confused by NSRegularExpression in Swift; can anyone help me? Task 1: given ("name","john","name of john"), I should get ["name","john","name of john"]. Here I should avoid the brackets. Task 2: given ("name"," john","name of…
Damodar
  • 707
  • 2
  • 10
  • 23
5
votes
1 answer

RegEx for matching dates (Month Day, Year OR m/d/yy)

I'm trying to write a regular expression that can be used to find dates in a string that may be preceded (or followed) by spaces, numbers, text, end-of-line, etc. The expression should handle US date formats that are either 1) Month Name Day, Year -…
TedS
  • 53
  • 1
  • 1
  • 5