import re

re_for_identificate_1 = r""

with open("data_path/filename_1.txt","r+") as file:
    for line in file:
        #replace with a substring adding a space in the middle
        line = re.sub(re_for_identificate_1, " milesimo", line)

        #replace in txt with the fixed line

Example filename_1.txt :

unmilesimo primero
1001°

dosmilesimos quinto
2005°

tresmilesimos
3000°

nuevemilesimos doceavo
9012°

The correct output file that I need to obtain is this:

Rewritten input filename_1.txt

un milesimo primero
1001°

dos milesimos quinto
2005°

tres milesimos
3000°

nueve milesimos doceavo
9012°

What regex do I need, and what is the best way to write the fixed lines back to their original positions in the input file?

1 Answer


You can use file.seek(0) to go back to the beginning of the file, then write the data and truncate the file. Like this:

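import re

# Match "milesimo" anywhere except at the very start of the line
re_for_identificate_1 = "(?<!^)milesimo"

tmp = ""
with open("data.txt", "r+") as file:
    # Build the corrected contents in memory, line by line
    for line in file:
        line = re.sub(re_for_identificate_1, " milesimo", line)
        tmp += line
    # Rewind, overwrite the file, and drop any leftover bytes
    file.seek(0)
    file.write(tmp)
    file.truncate()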

The regex you want is "(?<!^)milesimo": it replaces every instance of "milesimo" with " milesimo", except when "milesimo" is at the beginning of a line.
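One caveat: "(?<!^)milesimo" also matches when "milesimo" already has a space in front of it, so a line that is already correct, such as "un milesimo primero", would end up with two spaces. A minimal sketch of a stricter variant, assuming the glued-on prefix always ends in a word character, uses the lookbehind `(?<=\w)` instead. The names `PATTERN` and `split_milesimo` are hypothetical, chosen only for this illustration:

```python
import re

# Match "milesimo" only when a word character sits directly before it,
# so line starts and already-separated words are left untouched.
PATTERN = re.compile(r"(?<=\w)milesimo")


def split_milesimo(line: str) -> str:
    """Insert a space before a 'milesimo' that is glued to the previous word."""
    return PATTERN.sub(" milesimo", line)


print(split_milesimo("dosmilesimos quinto"))  # dos milesimos quinto
print(split_milesimo("milesimo"))             # milesimo
print(split_milesimo("un milesimo primero"))  # un milesimo primero
```

This function could stand in for the `re.sub` call inside the loop; the seek/write/truncate part stays the same.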

Michael M.
  • I think the regex was blank on purpose -- the question was **What is the regex that I need** – Barmar Sep 22 '22 at 23:01
  • @Barmar Sorry, didn't catch that. – Michael M. Sep 22 '22 at 23:01
  • This doesn't even need to be a regexp, you can just use `line = line.replace('milesimo', ' milesimo')` – Barmar Sep 22 '22 at 23:02
  • @MichaelM I found a problem: when the line is just `"milesimo"`, this code produces `" milesimo"`, which is wrong because the space is not between two words, as it is in `"dos milesimos"`. That is why the regex is necessary: the regex is the replacement condition – Santiago Nahuel Rodriguez Sep 22 '22 at 23:10
  • @Barmar The problem with using a simple replacement is that you can't add the condition that the replacement happens only when "milesimo" comes after another word. – Santiago Nahuel Rodriguez Sep 22 '22 at 23:12
  • @SantiagoNahuelRodriguez I've updated my answer to fix that. Now the regex will only match if it is not at the beginning of the line. – Michael M. Sep 22 '22 at 23:15
  • It might be more appropriate to replace `(un|dos|tres|...)milesimos` with `$1 milesimos`. But then you need to enumerate all the possible numbers. – Barmar Sep 22 '22 at 23:23
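
The idea in the last comment can be sketched in Python roughly as follows. Note that Python's `re.sub` writes a backreference as `\1` (or `\g<1>`) rather than `$1`. The prefix list is an assumption that covers only the numbers in the question's sample file, and `PREFIXES`, `PATTERN` and `fix_line` are hypothetical names; a real file would need the full list of number words:

```python
import re

# Assumed prefix list: only the number words from the sample file.
PREFIXES = ["un", "dos", "tres", "nueve"]

# Group 1 is the number word, group 2 is "milesimo" or "milesimos".
PATTERN = re.compile(r"\b(" + "|".join(PREFIXES) + r")(milesimos?)\b")


def fix_line(line: str) -> str:
    """Put a space between a known number word and 'milesimo(s)'."""
    return PATTERN.sub(r"\1 \2", line)


for sample in ["unmilesimo primero", "nuevemilesimos doceavo", "milesimo"]:
    print(fix_line(sample))
```

Compared with a lookbehind, this only touches words whose prefix is in the list, at the cost of having to keep that list complete.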