I'm trying to achieve the following replacement in Python: replace every HTML tag with {n} and create a hash of [tag, {n}] pairs.
Original string -> "<h> This is a string. </H><P> This is another part. </P>"
Replaced text -> "{0} This is a string. {1}{2} This is another part. {3}"
Here's my code. I've started on the replacement, but I'm stuck on the logic: I can't figure out the best way to replace each occurrence consecutively, i.e. with {0}, {1}, and so on:
import re
text = "<h> This is a string. </H><p> This is another part. </P>"
num_mat = re.findall(r"(?:<(\/*)[a-zA-Z0-9]+>)",text)
print(str(len(num_mat)))
reg = re.compile(r"(?:<(\/*)[a-zA-Z0-9]+>)",re.VERBOSE)
phctr = 0
#for phctr in num_mat:
# phtxt = "{" + str(phctr) + "}"
phtxt = "{" + str(phctr) + "}"
newtext = re.sub(reg,phtxt,text)
print(newtext)
Can someone help with a better way of achieving this? Thank you!
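One possible direction, as a minimal sketch rather than a definitive answer: `re.sub` accepts a function as the replacement, and that function is called once per match, so it can hand out a fresh {n} each time and record the tag as it goes. The names `TAG_RE`, `replace_tags` and `_sub` are hypothetical, and the pattern only covers simple tags like the ones in the example (no attributes, no self-closing tags).

```python
import re

# Assumed pattern: "<", an optional "/", alphanumerics, ">".
# It matches the simple tags in the example only.
TAG_RE = re.compile(r"</?[a-zA-Z0-9]+>")


def replace_tags(text):
    """Replace each tag with {n}; return (new_text, mapping)."""
    mapping = {}  # placeholder -> original tag

    def _sub(match):
        # Called once per match, left to right, so the current size of
        # the mapping is the next placeholder number.
        placeholder = "{%d}" % len(mapping)
        mapping[placeholder] = match.group(0)
        return placeholder

    return TAG_RE.sub(_sub, text), mapping


text = "<h> This is a string. </H><p> This is another part. </P>"
newtext, tags = replace_tags(text)
print(newtext)  # {0} This is a string. {1}{2} This is another part. {3}
print(tags)     # {'{0}': '<h>', '{1}': '</H>', '{2}': '<p>', '{3}': '</P>'}
```

The mapping is keyed by placeholder rather than by tag because the same tag can appear more than once in a string, while each {n} is unique.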