9

This is my first attempt at web scraping. So far I can navigate to and find the part of the HTML I want, and I can print it as well. The problem is printing only the text, which does not work. When I try, I get the following error: AttributeError: 'ResultSet' object has no attribute 'get_text'

Here is my code:

    from bs4 import BeautifulSoup
    import urllib

    page = urllib.urlopen('some url')

    soup = BeautifulSoup(page)
    zeug = soup.find_all('div', attrs={'class': 'fm_linkeSpalte'}).get_text()

    print zeug

Stedy
Krytos

4 Answers

28

find_all() returns a ResultSet, a list-like collection of elements, which has no get_text() method of its own. Loop over the elements, pick the one you need, and then call get_text() on it.

Update: for example:

    for el in soup.find_all('div', attrs={'class': 'fm_linkeSpalte'}):
        print(el.get_text())

Note that there may be more than one matching element, in which case this prints the text of each one.
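
To make the difference between a ResultSet and a single element concrete, here is a minimal sketch. The inline HTML is made up to stand in for the real page, and "html.parser" is chosen only so that no extra parser needs to be installed; neither comes from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page (an assumption).
html = """
<div class="fm_linkeSpalte">First <b>block</b></div>
<div class="fm_linkeSpalte">Second block</div>
<div class="other">Ignored</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() returns a ResultSet (a list subclass), not a single Tag.
divs = soup.find_all("div", attrs={"class": "fm_linkeSpalte"})

# Calling get_text() on the ResultSet itself reproduces the error.
try:
    divs.get_text()
    raised = False
except AttributeError:
    raised = True

# Calling get_text() on each Tag inside the ResultSet works.
texts = [div.get_text() for div in divs]

# find() returns only the first match (or None), so get_text() can be
# called on the result directly when a single element is wanted.
first = soup.find("div", attrs={"class": "fm_linkeSpalte"})
first_text = first.get_text() if first is not None else None

print(raised)
print(texts)
print(first_text)
```

If only one div is expected on the page, find() avoids the loop entirely, but it is worth checking for None in case the element is missing.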

amaslenn
3

Use a list comprehension to collect the text of each element, like this:

    zeug = [x.get_text() for x in soup.find_all('div', attrs={'class': 'fm_linkeSpalte'})]

Pang
Muhtadi
0

I would close this question as a duplicate and link to another one I found that answers it, but I don't think I have the reputation needed to moderate, so I am posting the answer here instead.

Original Answer

Code for this:

    for el in soup.findAll('div', attrs={'class': 'fm_linkeSpalte'}):
        print(''.join(el.findAll(text=True)))

If a moderator wants to close this question, that would be helpful.

Noah Gary
0
    zeug = BeautifulSoup(str(soup.find_all('div', attrs={'class': 'fm_linkeSpalte'})), features="lxml").get_text()

    # Explanation:
    type(soup)  # the type of "soup" is bs4.BeautifulSoup
    zh1 = soup.find_all('div', attrs={'class': 'fm_linkeSpalte'})
    type(zh1)  # the type of "zh1" is bs4.element.ResultSet
    # get_text() needs a bs4.BeautifulSoup object, not a ResultSet.
    # Converting the ResultSet directly is not possible:
    # zh2 = BeautifulSoup(zh1)  # this line does not work
    # So first convert "zh1" to a string:
    zh2 = str(zh1)
    # Note: this string is the list's printed form, so it keeps the
    # surrounding [ ] and the ", " separators between elements.
    # Next, parse the string back into a bs4.BeautifulSoup object:
    zh3 = BeautifulSoup(zh2, features="lxml")
    # The last step is:
    zeug = zh3.get_text()
    # All of these steps can be combined into one line, as in the first line.

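
As a rough comparison of the string round trip with calling get_text() on each element, consider the sketch below. The markup is invented for illustration, and "html.parser" is swapped in for "lxml" only to avoid an extra dependency; both are assumptions, not part of the answer above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with two matching divs (an assumption).
html = (
    '<div class="fm_linkeSpalte">one</div>'
    '<div class="fm_linkeSpalte">two</div>'
)
soup = BeautifulSoup(html, "html.parser")
divs = soup.find_all("div", attrs={"class": "fm_linkeSpalte"})

# Round trip: str(divs) is the list's printed form, so the brackets and
# the ", " separator become part of the re-parsed text.
round_trip = BeautifulSoup(str(divs), "html.parser").get_text()

# Direct approach: take the text of each Tag and join it explicitly,
# which leaves the choice of separator to the caller.
direct = "\n".join(div.get_text() for div in divs)

print(repr(round_trip))
print(repr(direct))
```

The round trip does yield a single string, but it carries list punctuation along with the text, so joining the per-element text is usually the cleaner choice.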
corund