1

I have a txt file with 40000 lines. Each line contains comma-separated numbers. I want to remove a specific number, for example 233, from lines 36000 to 39000. But I don't want to remove the digits 233 from a longer number such as 23341.

Here is my code so far:

with open("example.txt", "r") as file:
    newline = []
    i = 0
    for l in file.readlines():
        if i >= 36000 and i <= 39000:
            newline.append(l.replace("233", ""))
        else:
            newline.append(l.replace("233", "233"))
        i = i + 1

with open("example.txt", "w") as file:
    for line in newline:
        file.writelines(line)

Is there a more elegant way to solve this problem?

  • Maybe you can use a regex for it?

        import re
        with open("example.txt", "r") as file:
            newline = []
            i = 0
            for l in file.readlines():
                if i >= 36000 and i <= 39000:
                    newline.append(re.sub(r"\b233\b", "", l))
                else:
                    newline.append(l)
                i += 1
        with open("example.txt", "w") as file:
            for line in newline:
                file.write(line)

    – Vagner Jan 11 '23 at 16:15
  • If your numbers are comma-separated, is it possible to replace "233," with the comma included? – thomask Jan 11 '23 at 16:16

2 Answers

1

You may use a regex replacement here:

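import re

newline = []
for i, line in enumerate(file):
    if 36000 <= i <= 39000:
        # Replace the standalone value 233 (and adjacent commas) with one comma,
        # then trim commas left at either end; the newline is set aside first.
        line = re.sub(r',?\b233\b,?', ',', line.rstrip('\n')).strip(',') + '\n'
    newline.append(line)  # keep every line, edited or not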

The above regex logic targets specifically the value 233 as a CSV value. The pattern and replacement ensure that the resulting CSV has no empty values or dangling commas.
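As a quick check of that behaviour, the sketch below applies the same pattern to a few made-up sample rows. The helper name `drop_233` and the sample data are illustrative assumptions, not part of the original answer; the line ending is set aside before trimming so that a trailing comma is also removed.

```python
import re

# Same pattern as in the answer: the standalone value 233 plus any adjacent commas.
PATTERN = re.compile(r',?\b233\b,?')

def drop_233(line):
    """Remove standalone 233 values from one comma-separated line (hypothetical helper)."""
    body = line.rstrip('\n')
    ending = line[len(body):]  # preserve the original line ending, if any
    cleaned = PATTERN.sub(',', body).strip(',')
    return cleaned + ending

# Made-up sample rows: 233 in the middle, at the start, at the end, and only as a substring.
samples = ["1,233,23341,5\n", "233,7,8\n", "7,8,233\n", "23341,2330,1233\n"]
for s in samples:
    print(repr(s), '->', repr(drop_233(s)))
```

One caveat: with this pattern, two adjacent 233 values, as in 1,233,233,2, would still leave an empty field (1,,2), so data with repeated values may need a second pass or a field-based approach.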

Tim Biegeleisen
1

Reading a large text file line by line into a list, only to overwrite the whole file afterwards, is an inefficient approach. Use the fileinput module to edit the file in place, together with a regex pattern precompiled with re.compile:

import fileinput, re

pat = re.compile(r'\b233\b')  # matches 233 only as a whole number, not inside 23341

# inplace=True redirects print() output back into example.txt
with fileinput.input('example.txt', inplace=True, encoding='utf-8') as f:
    for i, line in enumerate(f):
        if 36000 <= i <= 39000:
            line = pat.sub('', line)
        print(line, end='')
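For a self-contained trial of the same in-place approach, the sketch below writes a small temporary file and edits a line range with `fileinput`. The helper name `remove_value_in_range`, its parameters, and the sample data are assumptions for illustration. Note that it swaps the regex for a comparison of whole comma-separated fields, which also avoids leaving empty fields behind.

```python
import fileinput
import os
import tempfile

def remove_value_in_range(path, value, first, last):
    """Drop fields equal to `value` on lines first..last (0-based, inclusive), in place."""
    with fileinput.input(path, inplace=True) as f:
        for i, line in enumerate(f):
            if first <= i <= last:
                body = line.rstrip('\n')
                ending = line[len(body):]  # preserve the line ending, if any
                # Compare whole fields, so 23341 never matches 233.
                fields = [x for x in body.split(',') if x.strip() != value]
                line = ','.join(fields) + ending
            print(line, end='')  # stdout is redirected into the file

# Small demonstration on a throwaway file (sample data is made up).
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("233,1\n1,233,23341\n233,233,5\n233,9\n")
    demo_path = tmp.name

remove_value_in_range(demo_path, "233", 1, 2)  # edit only the two middle lines
with open(demo_path) as fh:
    result = fh.read()
os.remove(demo_path)
print(result)
```

Calling `remove_value_in_range('example.txt', '233', 36000, 39000)` would mirror the range used above, with the same 0-based line counting as `enumerate`.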
RomanPerekhrest