To answer your question directly: you can build two regexps, one for your "Day Month Year" format and one for "Month Day Year", then check each separately.
import datetime
# Build the month names with list comprehensions (names follow the current locale)
months_shrt = [datetime.date(1, m, 1).strftime('%b') for m in range(1, 13)]
months_long = [datetime.date(1, m, 1).strftime('%B') for m in range(1, 13)]
# Join together into one alternation group
months = months_shrt + months_long
months_or = f'({"|".join(months)})'
# Raw strings keep \d from being treated as a string escape; a day has at most two digits
expr_dmy = r'\d{1,2},? ' + months_or + r',? \d{4}'
expr_mdy = months_or + r',? \d{1,2},? \d{4}'
You can try both and see which one matches. However, you'll still need to inspect the match and convert it to your favourite flavour of date format.
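As a rough sketch of what "try both and inspect" could look like: the patterns below are modelled on the ones above, with capture groups added around the day and year so the pieces can be pulled out. The names `re_dmy`, `re_mdy` and `split_date` are made up for illustration, and the month names assume an English locale.

```python
import datetime
import re

# Rebuild the month alternation (short names first, then long names).
months = [datetime.date(1, m, 1).strftime(f)
          for f in ('%b', '%B') for m in range(1, 13)]
months_or = f'({"|".join(months)})'

# Capture groups around day and year are an addition for this sketch.
re_dmy = re.compile(r'(\d{1,2}),? ' + months_or + r',? (\d{4})')
re_mdy = re.compile(months_or + r',? (\d{1,2}),? (\d{4})')

def split_date(s):
    """Hypothetical helper: return (day, month_name, year) as strings, or None."""
    m = re_dmy.fullmatch(s)
    if m:
        day, month, year = m.groups()
        return day, month, year
    m = re_mdy.fullmatch(s)
    if m:
        month, day, year = m.groups()
        return day, month, year
    return None
```

Note that this only gives you strings back; you would still have to map the month name to a number yourself.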
Instead, I would advise not using regexps at all, and simply trying different date formats with strptime.
# Character placed after the first and second field: index 0 is a space (no comma), index 1 is a comma
str_a = ' ,'
str_b = ' ,'
base_fmts = [('%d', '%b', '%Y'),
             ('%d', '%B', '%Y'),
             ('%b', '%d', '%Y'),
             ('%B', '%d', '%Y')]

def my_formatter(s):
    for o in base_fmts:
        for i in range(2):
            for j in range(2):
                # Concatenate
                fmt = f'{o[0]}{str_a[i]} '
                fmt += f'{o[1]}{str_b[j]} '
                fmt += f'{o[2]}'
                try:
                    d = datetime.datetime.strptime(s, fmt)
                except ValueError:
                    continue
                else:
                    return d
    # No format matched
    return None
The function above takes a string and returns a datetime.datetime object, or None if none of the formats match. You can use the standard datetime.datetime attributes to get your day, month and year back.
>>> d = my_formatter('Jan 15, 2009')
>>> (d.month, d.day, d.year)
(1, 15, 2009)
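Once you have the datetime object, converting it to whatever output format you prefer is a single call. A small sketch, using a hand-built datetime as a stand-in for the parsed result; the output formats chosen here are just examples:

```python
import datetime

# Stand-in for the object a successful parse of 'Jan 15, 2009' would give you.
d = datetime.datetime(2009, 1, 15)

iso = d.date().isoformat()        # '2009-01-15'
slashed = d.strftime('%d/%m/%Y')  # '15/01/2009'
```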