
I have the following strings:

4563_1_some_data

The general pattern is

r'\d{1,5}_[1-4]_some_data'

Note that the number before the first underscore may be the same for different some_data values.
So the question is: how can I get all possible variants of the replacement using regex?
Desired output:

[4563_1_some_data, 4563_2_some_data, 4563_3_some_data, 4563_4_some_data]

My attempt:

[re.sub(r'_\d_', r'_[1-4]_', row) for row in data]

But it has result:

4563_[1-4]_some_data

This looks like an ordinary literal replacement. How can I make the replacement expand the pattern and get the list?

2 Answers

You need to iterate over a range object to create your list.

  1. Create that range object. To do so, you need a pattern that extracts the [1-4] part from your original pattern.
  2. Use a second pattern to replace the number in the actual data with each value from the range object.
import re

text = "4563_1_some_data"
original_pattern = r"\d{1,5}_[1-4]_some_data"

# A regex to get the `[1-4]` part.
find_numbers_pattern = r"(?<=_)\[(\d)\-(\d)\](?=_)"

# Get `1` and `4` to create a range object
a, b = map(int, re.search(find_numbers_pattern, original_pattern).groups())

# Use `count=` to only change the first match.
lst = [re.sub(r"(?<=_)\d(?=_)", str(i), text, count=1) for i in range(a, b + 1)]

print(lst)

output:

['4563_1_some_data', '4563_2_some_data', '4563_3_some_data', '4563_4_some_data']
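
Explanation of (?<=_)\[(\d)\-(\d)\](?=_):

(?<=_): A positive lookbehind assertion. It requires a _ immediately before the match without making it part of the match.

\[(\d)\-(\d)\]: Matches a literal bracketed range such as [1-4] and captures its two digits.

(?=_): A positive lookahead assertion. It requires a _ immediately after the match without making it part of the match.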
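
To show that the lookarounds only check for the underscores and do not consume them, and to apply the same approach to several rows at once, here is a minimal sketch. The helper name expand_variants and the second sample row are assumptions made for illustration; they are not part of the original question.

```python
import re

# Same patterns as in the answer above.
RANGE_PATTERN = r"(?<=_)\[(\d)-(\d)\](?=_)"
DIGIT_PATTERN = r"(?<=_)\d(?=_)"

def expand_variants(row, pattern):
    """Return one copy of `row` per digit in the `_[a-b]_` range of `pattern`."""
    bounds = re.search(RANGE_PATTERN, pattern)
    if bounds is None:
        # No bracketed range found, so there is nothing to expand.
        return [row]
    low, high = map(int, bounds.groups())
    # count=1 replaces only the first digit that sits between underscores.
    return [re.sub(DIGIT_PATTERN, str(i), row, count=1) for i in range(low, high + 1)]

original_pattern = r"\d{1,5}_[1-4]_some_data"

# The surrounding underscores are asserted but not included in the match.
matched = re.search(RANGE_PATTERN, original_pattern).group()
print(matched)  # [1-4]

data = ["4563_1_some_data", "77_3_some_data"]  # second row is a made-up sample
variants = [expand_variants(row, original_pattern) for row in data]
print(variants)
```

Each inner list holds the four variants for one input row. If a flat list is needed, the inner lists can be concatenated afterwards.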

S.B

Regex replacement may not be the most appropriate tool for your requirement. Instead, consider splitting the string around the middle number and rebuilding it with f-strings, as follows:

import re

def repl(inp, nums):
    # Split once around the middle number, e.g. '_1_'.
    parts = re.split(r'_\d+_', inp, maxsplit=1)
    # Rebuild the string once per replacement number.
    return [f'{parts[0]}_{x}_{parts[1]}' for x in nums]

nums = [1, 2, 3, 4]
inp = '4563_1_some_data'
output = repl(inp, nums)
print(output)

# ['4563_1_some_data', '4563_2_some_data', '4563_3_some_data', '4563_4_some_data']
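
If every row has the fixed layout number, underscore, digit, underscore, rest, the same idea may work without the re module at all. This is a minimal sketch under that assumption; expand_with_split is a hypothetical helper name, not something from the answer above.

```python
def expand_with_split(row, nums):
    """Swap the field between the first two underscores for each value in `nums`."""
    # maxsplit=2 keeps any later underscores (as in 'some_data') inside the suffix.
    prefix, _old, suffix = row.split("_", 2)
    return [f"{prefix}_{n}_{suffix}" for n in nums]

result = expand_with_split("4563_1_some_data", range(1, 5))
print(result)
# ['4563_1_some_data', '4563_2_some_data', '4563_3_some_data', '4563_4_some_data']
```

Unlike the regex version, this raises a ValueError when a row contains fewer than two underscores, which may be useful as an early signal of malformed input.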
Tim Biegeleisen