Hi, I have a script that goes through a Notepad text file of numbers using a series of regexes. The regexes work, except for a few values that are not returned in full. For instance, values such as 11111-C00 or 22222-X01 are returned only as 11111 and 22222, without the "-" and what follows it. I also have a few cases that end in the format: number, letter number. These two regexes aren't giving me my desired outcome: \d{4,5}-\w{1}\d{2} and \d{4}-\w\d{1}\w
Full Code:
import re
filename = 'Text.txt'
pattern = '\d{4,5}-\d{2,3}|\d{4,9}|\w{3}\d-\d{2}|\d{4,5}-\w{1}\d{2}|\b|\d{4}-\w\d{1}\w'
new_file = []
with open('Text.txt', 'r') as f:
    lines = f.readlines()
    for line in lines:
        match = re.search(pattern, line)
        if match:
            new_line = match.group() + '\n'
            print new_line
            new_file.append(new_line)
with open('NewText.txt', 'w') as f:
    f.seek(0)
    f.writelines(new_file)
So all of my regexes work fine except the last two (\d{4,5}-\w{1}\d{2} and \d{4}-\w\d{1}\w). For values such as XXXXX-LXX and XXXXX-LXL, where X is a number and L is a letter, only XXXX or XXXXX is returned. Where am I going wrong?
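A smaller reproduction may make the behaviour easier to see. In Python's re module, alternatives separated by | are tried left to right, and the first one that succeeds at a position is used, even if a later alternative would match more text. The sketch below applies the question's alternation to three sample values and compares it with one possible reordering. The names ORIGINAL, REORDERED, and first_match are illustrative, the sample value 33333-A1B is made up, and widening the last alternative from \d{4} to \d{4,5} is an assumption based on the XXXXX-LXL format.

```python
import re

# The alternation from the question, written as a raw string.
# The |\b| alternative is left out: in a non-raw string literal,
# Python turns \b into a backspace character.
ORIGINAL = r'\d{4,5}-\d{2,3}|\d{4,9}|\w{3}\d-\d{2}|\d{4,5}-\w{1}\d{2}|\d{4}-\w\d{1}\w'

# Hypothetical reordering: the more specific alternatives come first and
# the bare-digits alternative \d{4,9} comes last.
# Assumption: the last hyphenated form may also have 5 leading digits.
REORDERED = r'\d{4,5}-\d{2,3}|\d{4,5}-\w\d{2}|\d{4,5}-\w\d\w|\w{3}\d-\d{2}|\d{4,9}'


def first_match(pattern, text):
    """Return the text of the first match of pattern in text, or None."""
    m = re.search(pattern, text)
    return m.group() if m else None


for sample in ['11111-C00', '22222-X01', '33333-A1B']:
    print('%s: original=%s reordered=%s' % (
        sample,
        first_match(ORIGINAL, sample),
        first_match(REORDERED, sample),
    ))
```

With the original order, \d{4,9} succeeds at the start of each sample before the later hyphenated alternatives are ever tried, which would explain why only the leading digits come back.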