I am fairly new to Python, and this is the most complex loop I have written so far.
I have a batch of files in the same directory that I want to clean with BeautifulSoup and re-save in place. The code below runs without raising an error, but it seems to get stuck on the first file and never makes any progress. Here is my code.
import os
from bs4 import BeautifulSoup

folder = "E:/2013 10Ks/10K"

# the code is running but getting stuck/not responsive
for filename in os.listdir(folder):
    f = open(filename, "r", encoding='utf-8')
    soup = BeautifulSoup(f, "xml")
    output = soup.get_text()
    file = open(filename, "w", encoding='utf-8')
    file.write(output)
    file.close()
EDIT
I believe what is happening is that the loop is applying these functions to the bare filenames, not the actual files. When I list a few of the files explicitly with their full paths and loop over that list instead, the code appears to work:
import os
from bs4 import BeautifulSoup

x = ["D:/2013 10Ks/10K/3.txt",
     "D:/2013 10Ks/10K/4.txt"]

for filename in x:
    f = open(filename, "r", encoding='utf-8')
    soup = BeautifulSoup(f, "xml")
    output = soup.get_text()
    file = open(filename, "w", encoding='utf-8')
    file.write(output)
    file.close()
I thought my original code did the same thing, just building the list of files from the directory itself. Thanks for taking a look.
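If that diagnosis is right, the fix should just be to join the folder path onto each name that `os.listdir` returns before opening it. Here is a minimal sketch of what I mean, wrapped in a function for convenience; I've used Python's built-in `html.parser` here so the sketch has no lxml dependency, but the `"xml"` parser from my original code should drop in the same way:

```python
import os
from bs4 import BeautifulSoup

def strip_tags_in_place(folder):
    """Replace each file in `folder` with just its extracted text."""
    for filename in os.listdir(folder):
        # os.listdir gives bare names like "3.txt"; build the full path
        path = os.path.join(folder, filename)
        if not os.path.isfile(path):
            continue  # skip subdirectories
        with open(path, "r", encoding="utf-8") as f:
            # "html.parser" is the stdlib parser; swap in "xml" if lxml is installed
            soup = BeautifulSoup(f, "html.parser")
        with open(path, "w", encoding="utf-8") as f:
            f.write(soup.get_text())
```

Calling `strip_tags_in_place("E:/2013 10Ks/10K")` would then process every file in the folder, the way my first loop was meant to.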