26

Here are the cases. I'm looking for the following pattern in a log file.

All strings are in the form of AB_N or CDE_N. AB and CDE are fixed letters, followed by an underscore. N can be either 2 or 3 numbers.

I tried (AB|CDE)_\d{2,3} but that returns a group. I can't do \w{2,3}\d{2,3} because it has to be either AB or CDE and not AC or FEG. Omitting the parentheses breaks too. I am sure the solution is simple but I'm new to python and regex and can't figure this out.

ekad
  • 14,436
  • 26
  • 44
  • 46
pedram
  • 2,931
  • 3
  • 27
  • 43
  • You can wrap the entire thing in a group: `((AB|CDE)_\d{2,3})`, and the first group is `AB_123` and the second is just `AB`. – Andy Lester Dec 20 '12 at 04:07

2 Answers2

53

A ?: inside a parenthesis in a regex makes it non-capturing. Like so: (?:AB|CDE)_\d{2,3}

See docs here: http://docs.python.org/3/library/re.html About a third of the way through it goes over the non-capturing syntax.

mattmc3
  • 17,595
  • 7
  • 83
  • 103
4

The non-capturing group syntax is (?:...). So do (?:AB|CDE)_\d{2,3}. This is documented along with everything else.

BrenBarn
  • 242,874
  • 37
  • 412
  • 384