I want to replace the text of matched re patterns in a string and can do this using re.sub()
. If I pass it a function as the repl
argument in the call it works as desired, as illustrated below:
from __future__ import print_function
import re
pattern = r'(?P<text>.*?)(?:<(?P<tag>\w+)>(?P<content>.*)</(?P=tag)>|$)'
my_str = "Here's some <first>sample stuff</first> in the " \
"<second>middle</second> of some other text."
def replace(m):
return ''.join(map(lambda v: v if v else '',
map(m.group, ('text', 'content'))))
cleaned = re.sub(pattern, replace, my_str)
print('cleaned: {!r}'.format(cleaned))
Output:
cleaned: "Here's some sample stuff in the middle of some other text."
However from the documentation it sounds like I should be able to get the same results by just passing it a replacement string with references to the named groups in it. However my attempt to do that didn't work because sometimes a group is unmatched and the value returned for it is None
(rather than an empty string ''
).
cleaned = re.sub(pattern, r'\g<text>\g<content>', my_str)
print('cleaned: {!r}'.format(cleaned))
Output:
Traceback (most recent call last):
File "test_resub.py", line 21, in <module>
cleaned = re.sub(pattern, r'\g<text>\g<content>', my_str)
File "C:\Python\lib\re.py", line 151, in sub
return _compile(pattern, flags).sub(repl, string, count)
File "C:\Python\lib\re.py", line 278, in filter
return sre_parse.expand_template(template, match)
File "C:\Python\lib\sre_parse.py", line 802, in expand_template
raise error, "unmatched group"
sre_constants.error: unmatched group
What am I doing wrong or not understanding?