
(Disclaimer: I'm a newbie, I'm sorry if this problem is really obvious)

Hello,

I built a small script that first finds certain parts of the HTML markup in a local file and then displays that information without the HTML tags.

I used bs4 and find_all / get_text for this. Take a look:

from bs4 import BeautifulSoup
with open("/Users/user1/Desktop/testdatapython.html") as fp:
    soup = BeautifulSoup(fp, "lxml")

titleResults = soup.find_all('span', attrs={'class':'caption-subject'})

firstResult = titleResults[0]

firstStripped = firstResult.get_text()

print(firstStripped)

This works so far. However, I want to do this for every element of titleResults, not only the first one, and I can't call get_text directly on the whole result of find_all.

Which way would be best to accomplish this? The number of values for titleResults is always changing since the local html file is only a sample.

Thank you in advance!

P.S. I already looked at this related thread, but sadly it was not enough for me to understand or solve the problem:

BeautifulSoup get_text from find_all

Pascal L.

1 Answer


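find_all returns a list (a ResultSet of Tag objects), not a single tag, so loop over it and call get_text() on each element: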

for result in titleResults:
    stripped = result.get_text()
    print(stripped)
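
If you would rather collect the texts than print them one by one, a list comprehension does the same thing in one line. This is only a minimal sketch: the inline `html` string is made-up sample markup standing in for your local file, and I use the built-in "html.parser" instead of "lxml" so it runs without extra dependencies.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the local HTML file.
html = """
<div>
  <span class="caption-subject">First title</span>
  <span class="caption-subject"> Second title </span>
  <span class="other">Ignored</span>
</div>
"""

# "html.parser" ships with Python; swap in "lxml" if you have it installed.
soup = BeautifulSoup(html, "html.parser")

# find_all returns every matching tag, however many there are.
title_results = soup.find_all("span", attrs={"class": "caption-subject"})

# Call get_text on each tag; strip=True trims surrounding whitespace.
titles = [tag.get_text(strip=True) for tag in title_results]

print(titles)
```

This works for any number of matches, including zero (you just get an empty list), so it should not matter that the real file has a different number of spans than the sample.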
Bitto