1

I have this text file:

test.html

<html>
<body>
<table>
  <tr>
      <td id="A">A</td>
      <td id="B">B</td>
 </tr>
 <tr>
    <td id="C">C</td>
    <td id="D">D</td>
 </tr>
</table>
</html>
</body>

Python file:

f = open('test.html')
ans = "A"
line = f.readline()
while line:  # readline() returns '' at end of file
    print(line)
    if ans in line:
        pass  # change the row A to a dash: <td>-</td>
    line = f.readline()
f.close()

What I want to do is scan through the HTML file and, when I find the column A, change it into a dash and save the file. I am a beginner in Python and don't know much about handling file input and output. Please note: without libraries.

c5564200
4 Answers

3

Using Python without any libraries, you can read the file line by line, use the built-in string method replace() on the line that contains A, and write the result to a new file. The line containing A ends up as:

<td id="A">-</td>\n

Code:

ans = "A"
lines = []

# Read the file line by line
with open('test.html', mode='r') as f:
    for line in f:
        # Match the cell by its id so other lines that happen to contain "A" are left alone
        if 'id="' + ans + '"' in line:
            # Swap only the cell text; the indentation and the id attribute are preserved
            line = line.replace('>' + ans + '<', '>-<')
        lines.append(line)

# Write the result to a new file
with open('myfile.html', mode='w') as new_f:
    new_f.writelines(lines)

myfile.html content:

 <html>
     <body>
         <table>
             <tr>
                 <td id="A">-</td>
                 <td id="B">B</td>
             </tr>
             <tr>
                 <td id="C">C</td>
                 <td id="D">D</td>
             </tr>
         </table>
    </html>
</body>
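
Since the question asks to save the change back to the file, the same idea can be wrapped in a small function that overwrites the original. This is only a sketch: the name replace_cell_text and its parameters are made up for illustration, and it assumes each cell sits on a single line, as in test.html.

```python
def replace_cell_text(path, cell_id, new_text="-"):
    """Replace the text of the tag carrying id="cell_id" and save the file in place.

    Hypothetical helper; assumes the opening and closing tags are on one line.
    Returns the number of lines that were changed.
    """
    marker = 'id="' + cell_id + '"'
    with open(path) as f:
        lines = f.readlines()

    changed = 0
    for i, line in enumerate(lines):
        pos = line.find(marker)
        if pos == -1:
            continue
        start = line.find(">", pos) + 1   # first character after the opening tag
        end = line.find("<", start)       # start of the closing tag
        if start == 0 or end == -1:
            continue                      # tag is split across lines: leave it alone
        lines[i] = line[:start] + new_text + line[end:]
        changed += 1

    # Overwrite the original file with the modified lines
    with open(path, "w") as f:
        f.writelines(lines)
    return changed
```

Calling replace_cell_text('test.html', 'A') would then rewrite test.html itself instead of creating myfile.html.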
marc_s
n1tk
2

Try using BeautifulSoup:

from bs4 import BeautifulSoup

# Open test.html for reading
with open('test.html') as html_file:
    soup = BeautifulSoup(html_file.read(), features='html.parser')

    # Go through each 'A' tag and replace text with '-'
    for tag in soup.find_all(id='A'):
        tag.string.replace_with('-')

    # Store prettified version of modified html
    new_text = soup.prettify()

# Write new contents to test.html
with open('test.html', mode='w') as new_html_file:
    new_html_file.write(new_text)

Which gives the following test.html:

<html>
 <body>
  <table>
   <tr>
    <td id="A">
     -
    </td>
    <td id="B">
     B
    </td>
   </tr>
   <tr>
    <td id="C">
     C
    </td>
    <td id="D">
     D
    </td>
   </tr>
  </table>
 </body>
</html>
RoadRunner
1

As others have suggested, BeautifulSoup is certainly a very nice option, but given that you are a beginner, I would suggest this regex approach, which needs only the standard-library re module:

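import re

# Read the whole file into memory
with open('test.html') as fh:
    content = fh.read()

# Replace the text of the cell whose id is "A"; if nothing matches, content is unchanged
content = re.sub(r'<td id="A">A</td>', '<td id="A">--</td>', content)

# Write the modified content back to the same file
with open('test.html', 'w') as fh:
    fh.write(content)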

Or, if you want code that uses less memory and you are comfortable with file handling in Python, you can look at this approach, which rewrites the matching line in place:

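import re

pattern = r'<td id="A">A</td>'

with open('test.html', 'r+') as fh:
    while True:
        currpos = fh.tell()   # remember where this line starts
        line = fh.readline()
        if line == '':        # end of file
            break
        if re.search(pattern, line):
            # Writing in place only works when the new line is exactly as long as
            # the old one; a longer replacement such as "--" would overwrite the
            # start of the next line, so a single dash is used here.
            line = re.sub(pattern, '<td id="A">-</td>', line)
            fh.seek(currpos)
            fh.write(line)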
paradocslover
  • This is a good solution. Just one query: what if I want this line to use variables? For example: ` var_1 = 'A' ` `content = content.replace(re.findall("var_1 ",content)[0],"var_1 ")` How would I go about doing that? – c5564200 Dec 13 '18 at 08:05
  • You can use string concatenation: `content = content.replace(re.findall(""+var_1+" ",content)[0],""+var_1 +"")` – paradocslover Dec 13 '18 at 09:30
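
The snippets in the comments above appear to have lost their HTML tags, so here is a rough sketch of how a variable could be built into the pattern. It assumes the intent was to match a cell of the form <td id="...">...</td>; the function name replace_cell is made up for illustration.

```python
import re

def replace_cell(content, cell_id, new_text):
    """Replace the text of <td id="cell_id">...</td> in content (hypothetical helper)."""
    # re.escape protects against ids that contain regex metacharacters
    pattern = r'(<td id="' + re.escape(cell_id) + r'">)[^<]*(</td>)'
    # A function as the replacement keeps backslashes in new_text literal
    return re.sub(pattern, lambda m: m.group(1) + new_text + m.group(2), content)

var_1 = "A"
html = '<td id="A">A</td><td id="B">B</td>'
print(replace_cell(html, var_1, "--"))  # <td id="A">--</td><td id="B">B</td>
```

Plain concatenation also works for simple values like "A"; re.escape only matters once the variable can contain characters such as "." or "+".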
0

You can use the BeautifulSoup or HTMLParser libraries. BeautifulSoup is a lot easier to use, though. You can read how to use it here: https://www.pythonforbeginners.com/beautifulsoup/python-beautifulsoup-basic
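
For the HTMLParser route, which ships with Python in the html.parser module, a minimal sketch might look like the following. The class and function names are made up for illustration, and the sketch assumes the target element contains only plain text; comments, doctype declarations and character references are not re-emitted faithfully.

```python
from html.parser import HTMLParser

class CellReplacer(HTMLParser):
    """Re-emit the parsed HTML, swapping the text inside one element (hypothetical helper)."""

    def __init__(self, target_id, new_text):
        super().__init__()
        self.target_id = target_id
        self.new_text = new_text
        self.in_target = False
        self.out = []

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() returns the tag exactly as written in the source
        self.out.append(self.get_starttag_text())
        self.in_target = dict(attrs).get("id") == self.target_id

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.out.append("</" + tag + ">")
        self.in_target = False

    def handle_data(self, data):
        # Text inside the target element is replaced; all other text is kept
        self.out.append(self.new_text if self.in_target else data)

def replace_by_id(html, target_id, new_text="-"):
    parser = CellReplacer(target_id, new_text)
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.out)
```

Reading test.html, passing its contents to replace_by_id(text, "A"), and writing the returned string back would complete the task without installing anything.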

heniotierra