
I am pretty new to Python and this is the most complex loop I have written so far... yeah.

Anyway, I have a batch of files in the same directory that I want to clean using BeautifulSoup and re-save. The code below runs without raising an error, but it seems to get stuck on the first file and does not do anything. Thanks for taking a look.

import os
import errno
import urllib
from bs4 import BeautifulSoup

os.listdir("E:/2013 10Ks/10K")

folder = "E:/2013 10Ks/10K"

#the code is running but getting stuck/not responsive

for filename in os.listdir(folder):
    f = open(filename,"r",encoding='utf-8')
    soup = BeautifulSoup(f,"xml")
    output=soup.get_text()
    file = open(filename, "w", encoding='utf-8')
    file.write(output)
    file.close()

EDIT

I believe what is happening is that the loop is applying these functions to the bare filenames and not to the actual files. I tried listing the full paths of some of the files in the directory and then looping over that list, and this code appears to work.

import os
import errno
import urllib
from bs4 import BeautifulSoup

x = ["D:/2013 10Ks/10K/3.txt",
"D:/2013 10Ks/10K/4.txt"]


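for filename in x:
    # Parse the file, closing the read handle before reopening it for writing
    with open(filename, "r", encoding='utf-8') as f:
        soup = BeautifulSoup(f, "xml")
    output = soup.get_text()
    # Overwrite the same file with the extracted text
    with open(filename, "w", encoding='utf-8') as file:
        file.write(output)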

I thought what I did before did the same thing, just using a list of the files in the directory. Thanks for taking a look.
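For anyone hitting the same thing, here is a minimal sketch of what seems to be the difference between the two loops. `os.listdir()` returns bare names such as `3.txt` rather than full paths, so `open(filename)` looks in the current working directory instead of in `folder`. Joining each name with the folder gives the same kind of path that the hand-written list contains. The helper name `clean_folder` and its `clean` parameter are made up for illustration, and the cleaning step is passed in as a function so the sketch runs without any third-party packages.

```python
import os


def clean_folder(folder, clean):
    """Rewrite every regular file in `folder` with clean(text).

    `clean_folder` and `clean` are hypothetical names for this sketch;
    `clean` is any function that takes a string and returns a string.
    """
    processed = []
    for name in sorted(os.listdir(folder)):
        # os.listdir() yields bare names ("3.txt"), not full paths,
        # so join each one with the folder before opening it.
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories and anything that is not a file
        # Read the whole file and close it before reopening it for writing.
        with open(path, "r", encoding="utf-8") as src:
            text = src.read()
        with open(path, "w", encoding="utf-8") as dst:
            dst.write(clean(text))
        processed.append(path)
    return processed
```

With BeautifulSoup installed, the cleaning step from the question could be plugged in as `clean_folder("E:/2013 10Ks/10K", lambda text: BeautifulSoup(text, "xml").get_text())`. This overwrites the files in place, so it is probably worth trying on a copy of the folder first.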

  • It doesn't look like anything in the code you've shown would cause it to hang. Is there any other code in your program? A while loop could cause it to hang, for example. – mechanical_meat Mar 27 '17 at 00:21
  • That is all the code. I think that what is happening is the loop is applying to the filename and not the actual file. If I list out some of the file names in the directory and then run the same loop it works (putting that revision in the edit). Thank you –  Mar 27 '17 at 13:52
  • You can post an answer to your own question and accept it. Good on you for working it out. – mechanical_meat Mar 27 '17 at 16:08

0 Answers