groovy regular expression difficulty

Question

I have a string like this: 1R12 or 2EURO16.

First character is 1 or 2 (numeric)
Middle is a letter or a word (R,L,X,Y,B or EURO)
End is 10,12,14,16 (numeric)

What I tried is this:

(^1|2)(R|L|X|Y|B|EURO)(10|12|14|16$)

But this does not give the result I expect. What would be a correct or possible regex?

Wiktor Stribiżew · Answer 1 · 2019-08-23T08:00:40.627


In (^1|2), the ^ anchor belongs only to the first alternative, so the group matches 1 at the start of the string or 2 anywhere in the string. Similarly, (10|12|14|16$) matches 10, 12 or 14 anywhere in the string, but 16 only at the end of the string.
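
To make the effect of the misplaced anchors concrete, here is a minimal sketch. The test inputs are made up for illustration, and the pattern is written as a single-quoted string rather than a slashy string purely to keep the example simple. find() is used because it looks for a match anywhere in the input:

```groovy
// The pattern from the question, with the anchors inside the alternation groups
def original = '(^1|2)(R|L|X|Y|B|EURO)(10|12|14|16$)'

// =~ creates a java.util.regex.Matcher; find() searches anywhere in the input
assert ('x2R10y' =~ original).find()    // neither 2 nor 10 is anchored
assert ('1R12abc' =~ original).find()   // 12 is not tied to the end of the string
assert !('1R16abc' =~ original).find()  // only the 16 alternative carries $
```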

You need to rearrange the anchors:

/^[12](?:[RLXYB]|EURO)(?:10|12|14|16)$/

Details

^ - start of string
[12] - 1 or 2
(?:[RLXYB]|EURO) - R, L, X, Y, B or EURO
(?:10|12|14|16) - 10, 12, 14 or 16
$ - end of string
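
As a quick check of the breakdown above, the following sketch filters a list of candidate strings through the anchored pattern. The sample list is an assumption made for illustration, not data from the question:

```groovy
// Anchored pattern from the answer, written as a single-quoted string
def strict = '^[12](?:[RLXYB]|EURO)(?:10|12|14|16)$'

// Hypothetical inputs: two valid codes and several near misses
def samples = ['1R12', '2EURO16', 'x2R10y', '1R12abc', '3R12', '1Z10', '1EURO19']

// Keep only the strings in which the anchored pattern is found
def accepted = samples.findAll { (it =~ strict).find() }

assert accepted == ['1R12', '2EURO16']
```

Because both anchors now sit outside the groups, find() can only succeed when the whole string has the required shape.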

NOTE: If you use the ==~ operator in Groovy, you do not need the anchors at all, because ==~ requires a full string match:

println("1EURO16" ==~ /[12](?:[RLXYB]|EURO)(?:10|12|14|16)/) // => true
println("1EURO19" ==~ /[12](?:[RLXYB]|EURO)(?:10|12|14|16)/) // => false


edited Aug 23 '19 at 08:00

answered Aug 23 '19 at 06:50


Hi, this is purely inquisitive but may I ask why you chose to use non-capturing groups? According to regex101 it takes the same number of steps to perform the regex with or without those. Is it some sort of memory optimization? – MonkeyZeus Aug 23 '19 at 12:47

@MonkeyZeus See [Are non-capturing groups redundant?](https://stackoverflow.com/a/31500517/3832970). Actually, the rule of thumb is to use non-capturing groups when the texts they capture are not going to be retrieved, or used for in-pattern backreferencing. Just best practice. – Wiktor Stribiżew Aug 23 '19 at 12:54
Thank you for the link, I will try to regex more responsibly moving forward :-) – MonkeyZeus Aug 23 '19 at 13:05
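
The difference discussed in the comments above can be seen on the Matcher itself. This is only a sketch: the capturing variant is written here for comparison and does not appear in the answer:

```groovy
// Capturing variant (hypothetical, for comparison) and the answer's non-capturing pattern
def capturing    = ('2EURO16' =~ '^([12])([RLXYB]|EURO)(10|12|14|16)$')
def nonCapturing = ('2EURO16' =~ '^[12](?:[RLXYB]|EURO)(?:10|12|14|16)$')

// Both patterns accept the same input
assert capturing.matches()
assert nonCapturing.matches()

// Only the capturing variant records sub-matches that can be read back later
assert capturing.groupCount() == 3
assert capturing.group(2) == 'EURO'
assert nonCapturing.groupCount() == 0
```

If the individual parts are needed later, capturing groups are the right tool. Otherwise the non-capturing form states more clearly that the groups exist only for alternation.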
