This regex works in pythex, but not in Python 3.6, and I am not sure why:

Pythex link (click)

Code in Python:

import re
test = '105297  003  002394  o  0000  20891  0.00  1'
pattern = r"(?P<pun1>\d{3})\s+(?P<pun2>\d{6})(\s+(?P<pun3>[01oO])(\s+(?P<pun4>\d{4}))?)?\s.*\s(?P<amt>\d+\.\d\d)\s"
match = re.match(pattern, test, re.IGNORECASE)
print(match is None)  # True

I haven't been able to figure out why it works in pythex but not in the Python interpreter.

Dnaiel

3 Answers

You are probably looking for re.search(), not re.match(). The latter only matches at the start of the string (that is, it behaves as if the pattern began with an anchor such as ^):

match = re.search(pattern, test, re.IGNORECASE)
#            ^^^
if match:
    # change the world here
    print(match.group("amt"))

See a demo on regex101.com.
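
As a quick check of the point above, here is a minimal sketch that runs both calls on the string and pattern from the question. The names m_match and m_search are only illustrative. It prints the offset at which search() finds the pattern and what the named groups capture:

```python
import re

test = '105297  003  002394  o  0000  20891  0.00  1'
pattern = r"(?P<pun1>\d{3})\s+(?P<pun2>\d{6})(\s+(?P<pun3>[01oO])(\s+(?P<pun4>\d{4}))?)?\s.*\s(?P<amt>\d+\.\d\d)\s"

# match() only tries the pattern at index 0, where "105" is followed by a digit, not whitespace
m_match = re.match(pattern, test, re.IGNORECASE)

# search() scans forward and tries the pattern at every position
m_search = re.search(pattern, test, re.IGNORECASE)

print(m_match)  # None
if m_search:
    print(m_search.start())      # offset where the match begins
    print(m_search.groupdict())  # named groups only
```

The match begins at the "003" field rather than at the start of the string, which is why match() cannot find it.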

Jan

I suspect your problem comes from calling re.match instead of re.search. The re.search function tries to find the regex anywhere in the given string, while re.match requires the regex to match at the beginning of the string.

Change this:

match = re.match(pattern, test, re.IGNORECASE)

to this:

match = re.search(pattern, test, re.IGNORECASE)
aghast

The problem is that match() only matches at the beginning of a string, not anywhere in it. From the Python docs for match():

"If zero or more characters at the beginning of string match this regular expression, return a corresponding match object."

You should use search() instead: "If you want to locate a match anywhere in string, use search() instead."

See also search() vs. match() in the docs.

This part:

match = re.match(pattern, test, re.IGNORECASE)

has to be:

match = re.search(pattern, test, re.IGNORECASE)
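
To illustrate the search() vs. match() distinction on a smaller input, here is a rough sketch with a made-up two-line string (the text and the variable names are hypothetical, not from the question). It also shows that re.MULTILINE does not change where match() looks, and that search() with a leading \A behaves like match():

```python
import re

text = "first line\nsecond line"

# match() only tries the very start of the string
plain_match = re.match(r"second", text)

# search() finds the pattern anywhere
plain_search = re.search(r"second", text)

# With MULTILINE, ^ can match after a newline, but match() still starts at index 0
multiline_match = re.match(r"^second", text, re.MULTILINE)
multiline_search = re.search(r"^second", text, re.MULTILINE)

# \A anchors to the start of the string, so this search() acts like match()
anchored_search = re.search(r"\Asecond", text)

print(plain_match, multiline_match, anchored_search)
print(plain_search.start(), multiline_search.start())
```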