I need some help with regular expressions. I want to match Roman numerals and replace them with Arabic numerals.

First of all, I use (IX|IV|V?I{0,3}) to match Roman numerals (from 1 to 9). Then I add some logic so that the numeral is surrounded either by whitespace (with some text before or after) or by nothing (beginning/end of string): (?:^|\s) before it and (?:\s|$) after it.

So finally I have (?:^|\s)(IX|IV|V?I{0,3})(?:\s|$)

It matches all these variants:

  1. some text VI
  2. IX here we are
  3. another III text

If I define a dict with a Roman-to-Arabic map, {'iii': 3, 'IX': 9}, how do I replace the matches with values from the dict? Also, it matches only the first occurrence, i.e. in some V then III I get only V

Digital God

1 Answer

Also, it matches only the first occurrence, i.e. in some V then III I get only V

I assume that you are using re.match or re.search, each of which gives you only one result. We will use re.sub to solve your main question, so this won't be an issue: re.sub replaces every non-overlapping match, and its replacement argument can be a callable. We replace each match with the corresponding value from your dictionary. Use

re.sub(your_regex, lambda m: your_dict[m.group(1)], your_string)

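This assumes that any possible match is in your dict and that the dict values are strings, since the callable passed to re.sub must return a string. If some matches may be missing from the dict, use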

re.sub(your_regex, lambda m: your_dict[m.group(1)] if m.group(1) in your_dict else m.group(1), your_string)
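
To make the above concrete, here is a minimal runnable sketch. The names ROMAN_TO_ARABIC, ORIGINAL, STANDALONE and replace_roman are made up for illustration, and the dict uses string values because the callable must return a string. ORIGINAL is the pattern from the question, unchanged. STANDALONE is an assumed variant, not something from the question: it swaps the surrounding groups for lookarounds so that the whitespace is checked but not consumed. Treat it as one possible approach rather than a tested solution for every input.

```python
import re

# Hypothetical mapping for 1 to 9. Values are strings because the callable
# given to re.sub must return a string.
ROMAN_TO_ARABIC = {
    "I": "1", "II": "2", "III": "3", "IV": "4", "V": "5",
    "VI": "6", "VII": "7", "VIII": "8", "IX": "9",
}

# Pattern from the question: the surrounding whitespace is part of the match,
# so it disappears together with the numeral when the match is replaced.
ORIGINAL = re.compile(r"(?:^|\s)(IX|IV|V?I{0,3})(?:\s|$)")

# Assumed variant: lookarounds require "no non-space character" on either
# side without consuming anything, so the spaces stay in the string.
STANDALONE = re.compile(r"(?<!\S)(IX|IV|V?I{0,3})(?!\S)")


def replace_roman(text, pattern=STANDALONE):
    """Replace every numeral matched by `pattern` with its Arabic value."""
    def to_arabic(match):
        # Fall back to the whole matched text when group 1 is not in the
        # dict (for example when the group matched an empty string).
        return ROMAN_TO_ARABIC.get(match.group(1), match.group(0))
    return pattern.sub(to_arabic, text)


print(replace_roman("some V then III"))            # some 5 then 3
print(replace_roman("some V then III", ORIGINAL))  # some5then3
```

Both numerals are replaced in either case, because re.sub processes every match. The second call shows what happens to the spaces when they are part of the match.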
timgeb
  • It works, but I have a mistake in the regex. The spaces around the Roman numerals are removed too: `star wars IV` is converted to `star wars4`, and `some III text` to `some3text` – Digital God Jan 05 '16 at 16:59
  • @DigitalGod add optional spaces in your matching group 1, example: https://regex101.com/r/zA7lX5/1 – timgeb Jan 05 '16 at 17:00
  • I do not know much about regular expressions, but thanks for the help :) – Digital God Jan 05 '16 at 17:05
  • Your example doesn't work as expected. It fetches both `_III_` and `III`. – Digital God Jan 05 '16 at 17:20
  • Sorry my example was a bit rushed and I'm in a hurry. I answered assuming you had the regex figured out and wanted to know how to do the replacing. Maybe you should open a new question asking for the regex needed itself, or if you want to I will delete my answer. – timgeb Jan 05 '16 at 17:22
  • No problem. I'll create another question :) – Digital God Jan 05 '16 at 17:28