regex in python not working - works in pythex but not in python 3.6

Question

This regex works in pythex, but not in python 3.6. I am not sure why:

Code in python:

import re
test = '105297  003  002394  o  0000  20891  0.00  1'
pattern = r"(?P<pun1>\d{3})\s+(?P<pun2>\d{6})(\s+(?P<pun3>[01oO])(\s+(?P<pun4>\d{4}))?)?\s.*\s(?P<amt>\d+\.\d\d)\s"
match = re.match(pattern, test, re.IGNORECASE)
print(match is None)  # prints True

I haven't been able to figure out why it works in pythex but not in the Python interpreter.

What is it supposed to do? – Igor Dragushhak, Feb 13 '19 at 19:57

score 3 · Accepted Answer · answered Feb 13 '19 at 20:01

You might be looking for re.search(), not re.match(). The latter only matches at the start of the string (that is, it implies an anchor ^ at the beginning of the pattern):

match = re.search(pattern, test, re.IGNORECASE)
#            ^^^
if match:
    # change the world here
    print(match.group("amt"))

See a demo on regex101.com.
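To see the difference directly, here is a minimal sketch that runs both calls against the string and pattern from the question. The names `anchored` and `found` are only illustrative and are not from the original code.

```python
import re

test = '105297  003  002394  o  0000  20891  0.00  1'
pattern = r"(?P<pun1>\d{3})\s+(?P<pun2>\d{6})(\s+(?P<pun3>[01oO])(\s+(?P<pun4>\d{4}))?)?\s.*\s(?P<amt>\d+\.\d\d)\s"

# re.match tries the pattern only at index 0; there "105" is followed
# by another digit instead of whitespace, so the attempt fails.
anchored = re.match(pattern, test, re.IGNORECASE)

# re.search tries successive start positions until one succeeds.
found = re.search(pattern, test, re.IGNORECASE)

print(anchored)                  # None
if found:
    print(found.start())         # 8
    print(found.group("pun1"))   # 003
    print(found.group("amt"))    # 0.00
```

The match begins at index 8, after the leading "105297" field, which is why an attempt anchored at the start of the string cannot succeed.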

score 1 · Answer 2 · answered Feb 13 '19 at 20:06

I suspect your problem comes from calling re.match instead of re.search. The re.search function tries to find the regex anywhere in the given string, while re.match requires the regex to match at the beginning of the string.

Change this:

match = re.match(pattern, test, re.IGNORECASE)

to this:

match = re.search(pattern, test, re.IGNORECASE)

score 1 · Answer 3 · answered Feb 13 '19 at 20:09

The problem is that match() only matches at the beginning of a string, not anywhere in it. From the Python docs for match():

"If zero or more characters at the beginning of string match this regular expression, return a corresponding match object."

You should use search() instead: "If you want to locate a match anywhere in string, use search() instead."
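A small sketch of the same behaviour on a made-up two-line string (`text` and the variable names are assumptions for illustration). It also shows that, per the `re` documentation, match() stays anchored to the start of the string even when re.MULTILINE is set.

```python
import re

text = "header\n42 items"  # hypothetical sample input

# match() is anchored at the very start of the string.
m_plain = re.match(r"\d+", text)
# re.MULTILINE does not make match() try the start of later lines.
m_multi = re.match(r"\d+", text, re.MULTILINE)

# search() scans forward and finds the first occurrence anywhere.
s_any = re.search(r"\d+", text)
# With re.MULTILINE, ^ under search() matches at the start of each line.
s_line = re.search(r"^\d+", text, re.MULTILINE)
# Without re.MULTILINE, ^ only matches at the start of the string.
s_start = re.search(r"^\d+", text)

print(m_plain, m_multi)   # None None
print(s_any.group())      # 42
print(s_line.group())     # 42
print(s_start)            # None
```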
