
I'm wondering why `captures` can't resolve the capture groups once the regex gets too complex.

I have created the following regex, and it only matches the first group.

(type)(((( |\t))|((\/\*(([^\*\/])+)?\*\/))|((\/\/(([^\n])+)?\n))))+(\&)?(\$|\_|[a-zA-Z])((\$|\_|[a-zA-Z0-9])+)?((((( |\t))|((\/\*(([^\*\/])+)?\*\/))|((\/\/(([^\n])+)?\n))))+)?(=)

When I change it to the following regex, each capture group gets the appropriate scope in `captures`:

(type)(\s+)([a-zA-Z0-9]+)(\s+)(=)

Is this happening because it is a simplified regex, or because something is wrong with my capture groups?
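One thing that can be checked independently of the editor is how the groups are numbered. Groups are numbered by the position of their opening parenthesis, so every extra pair of parentheses shifts the numbers of the groups that follow. The sketch below uses Python's `re` module, which is not the engine the editor uses, purely to count and list groups; the helper name `numbered_groups` and the sample input are assumptions made for illustration.

```python
import re

# The two patterns from the question, unchanged.
SIMPLE = re.compile(r"(type)(\s+)([a-zA-Z0-9]+)(\s+)(=)")
COMPLEX = re.compile(
    r"(type)(((( |\t))|((\/\*(([^\*\/])+)?\*\/))|((\/\/(([^\n])+)?\n))))+"
    r"(\&)?(\$|\_|[a-zA-Z])((\$|\_|[a-zA-Z0-9])+)?"
    r"((((( |\t))|((\/\*(([^\*\/])+)?\*\/))|((\/\/(([^\n])+)?\n))))+)?(=)"
)


def numbered_groups(pattern, text):
    """Return {group number: matched text} for the groups that took part in the match."""
    match = pattern.match(text)
    if match is None:
        return {}
    return {
        number: match.group(number)
        for number in range(1, pattern.groups + 1)
        if match.group(number) is not None
    }


sample = "type foo ="  # hypothetical input line
simple_groups = numbered_groups(SIMPLE, sample)
complex_groups = numbered_groups(COMPLEX, sample)

print(SIMPLE.groups, simple_groups)    # five groups, "=" is group 5
print(COMPLEX.groups, complex_groups)  # many more groups, "=" is the last one
```

With the complex pattern, group 5 is the innermost whitespace group rather than the equals sign, so a scope keyed to `5` would no longer land on `=`. Whether this is the whole explanation for the editor's behaviour is not confirmed here.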

Breakdown of regex

1: Find the type keyword

(type)

2: Whitespace or a comment

(((( |\t))|((\/\*(([^\*\/])+)?\*\/))|((\/\/(([^\n])+)?\n))))+

3: Variable name

(\&)?(\$|\_|[a-zA-Z])((\$|\_|[a-zA-Z0-9])+)?

4: Whitespace or a comment

(((( |\t))|((\/\*(([^\*\/])+)?\*\/))|((\/\/(([^\n])+)?\n))))+

5: Equal sign

(=)
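The five parts of the breakdown can be assembled so that each part is exactly one numbered group, with all inner grouping done by non-capturing `(?:...)` groups. This is a minimal sketch in Python's `re` module, not the editor's engine; the constant names are hypothetical, and the second whitespace part is treated as required, as in the breakdown.

```python
import re

# Part 2 and 4: one or more of space, tab, block comment, or line comment.
# (?:...) groups the alternatives without creating a numbered group.
WS_OR_COMMENT = r"(?:[ \t]|/\*[^*/]*\*/|//[^\n]*\n)+"

# Part 3: optional "&", then a name that must not start with a digit.
VARIABLE_NAME = r"&?[a-zA-Z$_][a-zA-Z$_0-9]*"

# Each breakdown part is wrapped in exactly one capturing group.
TYPE_ASSIGNMENT = re.compile(
    "(type)"
    "(" + WS_OR_COMMENT + ")"
    "(" + VARIABLE_NAME + ")"
    "(" + WS_OR_COMMENT + ")"
    "(=)"
)

match = TYPE_ASSIGNMENT.match("type /* c */ &foo\t= 1")
print(TYPE_ASSIGNMENT.groups)  # 5
print(match.groups())          # ('type', ' /* c */ ', '&foo', '\t', '=')
```

Here the group numbers 1 to 5 line up with the five numbered parts of the breakdown, the same numbering as in the simplified regex.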

EDIT

I noticed my regex is a bit messy, so here's a better one that also doesn't work:

/(type)((\/\*([^\*\/]+)?\*\/)|(\/\/([^\n]+)?\n)| |\t)+((\$|\_|[a-zA-Z])((\$|\_|[a-zA-Z0-9])+)?)((\/\*([^\*\/]+)?\*\/)|(\/\/([^\n]+)?\n)| |\t)+(=)/
Stan Hurks
  • Your regex can be simplified a lot. Make every group you don't want to capture a `(?:)` non-capturing group. Matching is done on a single line string, so `\n` will never occur. – rioV8 Mar 21 '22 at 18:59
  • @rioV8 Thank you, I changed the comment parts to `(?:)` non capture groups. I also discovered that `([a-zA-Z$_][a-zA-Z$_0-9]*)` works, but `([a-zA-Z$_]([a-zA-Z$_0-9]+)?)` doesn't. Is there a way to use parentheses in the capture groups to nest those groups? – Stan Hurks Mar 21 '22 at 19:34
  • Why do you need a capture group on the 2nd-to-end characters? – rioV8 Mar 21 '22 at 20:05
  • @rioV8 The pattern is for variable names, which cannot have a number as the first character but can have numbers in the following characters, which are optional. – Stan Hurks Mar 21 '22 at 20:42
  • Yes, I know, but why make it a capture group: **2nd to end character**? – rioV8 Mar 22 '22 at 00:58
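The difference between the two identifier patterns discussed in the comments can be seen by listing their groups. The sketch below again uses Python's `re` module as a stand-in, so it only illustrates numbering and says nothing definitive about the editor's engine; the names `FLAT`, `NESTED`, `NESTED_NON_CAPTURING`, and `describe` are made up for the example.

```python
import re

# The pattern reported as working: a single group, no nesting.
FLAT = re.compile(r"([a-zA-Z$_][a-zA-Z$_0-9]*)")

# The pattern reported as not working: the optional tail is a second, nested group.
NESTED = re.compile(r"([a-zA-Z$_]([a-zA-Z$_0-9]+)?)")

# Same structure, but the tail is grouped without capturing.
NESTED_NON_CAPTURING = re.compile(r"([a-zA-Z$_](?:[a-zA-Z$_0-9]+)?)")


def describe(pattern, text):
    """Return (number of groups, tuple of captured texts) for a full match."""
    match = pattern.fullmatch(text)
    return pattern.groups, (match.groups() if match else None)


print(describe(FLAT, "foo1"))                  # (1, ('foo1',))
print(describe(NESTED, "foo1"))                # (2, ('foo1', 'oo1'))
print(describe(NESTED_NON_CAPTURING, "foo1"))  # (1, ('foo1',))
print(describe(NESTED, "x"))                   # (2, ('x', None))
```

All three accept the same names. The nested variant adds a group 2 that overlaps group 1 and shifts the number of every group after it, while the `(?:...)` variant keeps the parentheses for grouping and leaves the numbering the same as the flat pattern.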
