import re, datetime
input_text = "hhhh ((44_-_44)) ggj ((2022_-_02_-_18 20:00 pm)) ((((2022_-_02_-_18 20:00 pm))) (2022_-_02_-_18 00:00 am)"
identify_dates_regex_00 = r"(?P<year>\d*)_-_(?P<month>\d{2})_-_(?P<startDay>\d{2})"
identify_time_regex = r"(?P<hh>\d{2}):(?P<mm>\d{2})[\s|]*(?P<am_or_pm>(?:am|pm))"
restructuring_structure_00 = "(" + r"\g<year>_-_\g<month>_-_\g<startDay>" + r" \g<hh>:\g<mm> \g<am_or_pm>" + ")"
input_text = re.sub(r"\(" + identify_dates_regex_00 + " " + identify_time_regex + r"\)", restructuring_structure_00, input_text)
print(repr(input_text)) # --> output
This is the wrong output that I get:
'hhhh ((44_-_44)) ggj ((2022_-_02_-_18 20:00 pm)) ((((2022_-_02_-_18 20:00 pm))) (2022_-_02_-_18 00:00 am)'
This is the correct output, without the extra parentheses, that I need:
'hhhh ((44_-_44)) ggj (2022_-_02_-_18 20:00 pm) (2022_-_02_-_18 20:00 pm) (2022_-_02_-_18 00:00 am)'
I need to remove the redundant parentheses when they enclose the structure year_-_month_-_day hour:minute am/pm, leaving exactly one pair around it.
With named capture groups, that structure can be written as "(?P<year>\d*)_-_(?P<month>\d{2})_-_(?P<startDay>\d{2})" followed by "(?P<hh>\d{2}):(?P<mm>\d{2})[\s|]*(?P<am_or_pm>(?:am|pm))".
Without capture groups it could be written as a simpler regex (although we would lose the ability to capture the data): "\d*_-_\d{2}_-_\d{2} \d{2}:\d{2}[\s|]*[ap]m"
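
For reference, one possible way to get the needed output is to match a run of one or more opening parentheses before the date and a run of one or more closing parentheses after the time, then write back a single pair. This is only a sketch: the function name collapse_parentheses is hypothetical, and it assumes that any run of parentheses on either side should collapse to one, even when the counts do not balance (as in the third date of the sample input).

```python
import re

# Same patterns as in the question.
identify_dates_regex_00 = r"(?P<year>\d*)_-_(?P<month>\d{2})_-_(?P<startDay>\d{2})"
identify_time_regex = r"(?P<hh>\d{2}):(?P<mm>\d{2})[\s|]*(?P<am_or_pm>(?:am|pm))"

# Rebuild the match with exactly one pair of parentheses around it.
restructuring_structure_00 = r"(\g<year>_-_\g<month>_-_\g<startDay> \g<hh>:\g<mm> \g<am_or_pm>)"


def collapse_parentheses(text):
    """Hypothetical helper: collapse repeated parentheses around a date-time.

    Assumption: "\\(+" and "\\)+" each consume the whole run of parentheses,
    so unbalanced runs are also reduced to a single pair.
    """
    pattern = r"\(+" + identify_dates_regex_00 + " " + identify_time_regex + r"\)+"
    return re.sub(pattern, restructuring_structure_00, text)


input_text = "hhhh ((44_-_44)) ggj ((2022_-_02_-_18 20:00 pm)) ((((2022_-_02_-_18 20:00 pm))) (2022_-_02_-_18 00:00 am)"
print(repr(collapse_parentheses(input_text)))
# 'hhhh ((44_-_44)) ggj (2022_-_02_-_18 20:00 pm) (2022_-_02_-_18 20:00 pm) (2022_-_02_-_18 00:00 am)'
```

The original pattern matches only one "(" and one ")", and its replacement writes that same pair back, so the surrounding extra parentheses are never consumed and the string comes out unchanged. "((44_-_44))" is left alone here because it does not contain the full date and time structure.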