I thought I knew a bit of regex, but I seem to have found a case where my knowledge is at its end. I tried the approach from "Regex Replace function: in cases of no match, $1 returns full line instead of null". The main difference is that I want to not only replace the input with the match but also insert some characters between the captured groups. Simply put, I want to standardize the input to a certain pattern. The regex should match and capture specific parts of the input, but not everything:

^[\D]*(?P<from_day>(0?[1-9])|([12][0-9])|3[01])[\.\-\s,■]+(?P<from_month>(0?[1-9])|(1[0-2]))[\.\-\s,■]*(?P<until_day>(0?[1-9])|[12][0-9]|3[01])[\.\-\s,■]+(?P<until_month>(0?[1-9])|1[012])[\D]*$

The replacement string:

\g<from_day>.\g<from_month>-\g<until_day>.\g<until_month>

Input:

28.11 16.12
"13.01 23,09"
01.08.-31.12
"01.01,-51.12"
"01,01.-31,12."
01083112
1.02 - 4.3

Current output:

28.11-16.12.-.
13.01-23.09.-.
01.08-31.12.-.
.-..-.
01.01-31.12.-.
.-..-.
1.02-4.3.-.

Expected/desired:

28.11-16.12
13.01-23.09
01.08-31.12

01.01-31.12

1.02-4.3

https://regex101.com/r/M3arvW/1

Tollpatsch
  • This question is tagged python, so please show us the Python code. – John Gordon Jun 07 '23 at 17:40
  • So basically the regex is inside a YAML file, gets loaded, and is handed to the call function of a class: new_text = re.sub(self.regex, self.replacement, self.text) – Tollpatsch Jun 07 '23 at 17:43
  • Please [edit] your question to include the relevant code. It's always best to provide a [mre] – JRiggles Jun 07 '23 at 17:51

1 Answer

You should change your regex to this:

^\D*(?P<from_day>[12]\d|3[01]|0?[1-9])[.\s,■-]+(?P<from_month>1[0-2]|0?[1-9])[.\s,■-]*(?P<until_day>[12]\d|3[01]|0?[1-9])[.\s,■-]+(?P<until_month>1[012]|0?[1-9]).*|.+

Updated RegEx Demo

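This takes care of all the issues except the lines that do not match. The trailing |.+ alternative consumes any line the date pattern rejects, leaving the named groups unset. With a plain replacement string such a line would still produce the literal separators (.-.), so pass a lambda function to re.sub as the replacement and return an empty string when from_day did not match.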

Python Code:

>>> import re
>>> arr = ['"01,01.-31,12."', '01083112', '1.02 - 4.3', '"01.01,-51.12"']
>>> rx = re.compile(r'^\D*(?P<from_day>[12]\d|3[01]|0?[1-9])[.\s,■-]+(?P<from_month>1[0-2]|0?[1-9])[.\s,■-]*(?P<until_day>[12]\d|3[01]|0?[1-9])[.\s,■-]+(?P<until_month>1[012]|0?[1-9]).*|.+')
>>> for i in arr: print (rx.sub(lambda m: m.group('from_day') + '.' + m.group('from_month') + '-' + m.group('until_day') + '.' + m.group('until_month') if m.group('from_day') else '', i))
...
01.01-31.12

1.02-4.3
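
The same idea can be wrapped in a reusable function. This is only a minimal sketch: the names RANGE_RX, TEMPLATE and normalize_range are made up for illustration, and it assumes Python 3. It reuses the replacement string from the question through Match.expand, so the template can still live in a config file, and it falls back to an empty string when the |.+ alternative matched instead of the date pattern.

```python
import re

# Same pattern as above, split across lines for readability.
# The trailing `|.+` alternative swallows any line that is not a date range.
RANGE_RX = re.compile(
    r'^\D*(?P<from_day>[12]\d|3[01]|0?[1-9])[.\s,■-]+'
    r'(?P<from_month>1[0-2]|0?[1-9])[.\s,■-]*'
    r'(?P<until_day>[12]\d|3[01]|0?[1-9])[.\s,■-]+'
    r'(?P<until_month>1[012]|0?[1-9]).*|.+'
)

# Replacement template taken from the question.
TEMPLATE = r'\g<from_day>.\g<from_month>-\g<until_day>.\g<until_month>'


def normalize_range(text):
    """Return the standardized range, or '' if the line is not a valid range."""
    def repl(m):
        # from_day is None when only the `.+` fallback matched.
        return m.expand(TEMPLATE) if m.group('from_day') else ''
    return RANGE_RX.sub(repl, text)


if __name__ == '__main__':
    for line in ['28.11 16.12', '"01.01,-51.12"', '1.02 - 4.3']:
        print(repr(normalize_range(line)))
```

Keeping the template as a string rather than concatenating groups by hand means only the function wrapper has to change in the calling class; the regex and replacement can stay in the YAML file.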

anubhava