BeautifulSoup get_text from find_all

Question

This is my first attempt at web scraping. So far I can navigate to the part of the HTML I want and print it. The problem is printing only the text, which does not work. When I try, I get the following error: AttributeError: 'ResultSet' object has no attribute 'get_text'

Here is my code:

    from bs4 import BeautifulSoup
    import urllib

    page = urllib.urlopen('some url')

    soup = BeautifulSoup(page)
    zeug = soup.find_all('div', attrs={'class': 'fm_linkeSpalte'}).get_text()

    print zeug

amaslenn · Accepted Answer · 2014-02-25T04:45:25.990

28

find_all() returns a ResultSet, which is a list-like collection of elements, so it has no get_text() of its own. Loop over the elements, select the one you need, and then call get_text() on it.

UPD
For example:

    for el in soup.find_all('div', attrs={'class': 'fm_linkeSpalte'}):
        print el.get_text()

But note that you may have more than one element.
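
For readers who want to try this without a live page, here is a minimal Python 3 sketch of the same idea. The HTML snippet is made up for illustration; only the class name `fm_linkeSpalte` comes from the question, and `html.parser` is used here just to avoid depending on an extra parser.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
html = """
<div class="fm_linkeSpalte"><p>First block</p></div>
<div class="other">Ignored</div>
<div class="fm_linkeSpalte"><p>Second <b>block</b></p></div>
"""

soup = BeautifulSoup(html, "html.parser")
divs = soup.find_all("div", attrs={"class": "fm_linkeSpalte"})

# get_text() belongs to each element, not to the collection find_all() returns.
for el in divs:
    print(el.get_text().strip())

# If only the first match is wanted, find() returns a single element (or None).
first = soup.find("div", attrs={"class": "fm_linkeSpalte"})
first_text = first.get_text().strip() if first is not None else ""
```

If exactly one matching div is expected, `find()` avoids the loop entirely, at the cost of having to handle the `None` case.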

edited Feb 25 '14 at 04:45

answered Feb 24 '14 at 20:02


what is "el" and how should I define it? – Rho Oct 29 '21 at 23:20
@user193938 it is defined in the for loop: `find_all` returns a list-like collection of elements, so `el` holds one of them on each iteration. – amaslenn Oct 31 '21 at 16:34

score 3 · Answer 2 · edited Dec 05 '20 at 07:02


Use a list comprehension to collect the text of each element, like this:

    zeug = [x.get_text() for x in soup.find_all('div', attrs={'class': 'fm_linkeSpalte'})]
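
As a rough illustration of what the comprehension produces, the sketch below runs it on a made-up HTML string (the markup and the names `texts`, `same` and `combined` are assumptions for the example) and then joins the pieces into one string. It also shows the `class_` keyword, which Beautifulsoup accepts as a shorthand for filtering by CSS class.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; only the class name comes from the question.
html = """
<div class="fm_linkeSpalte">Alpha</div>
<div class="fm_linkeSpalte">Beta</div>
"""
soup = BeautifulSoup(html, "html.parser")

# One string per matching element.
texts = [x.get_text() for x in soup.find_all("div", attrs={"class": "fm_linkeSpalte"})]

# Equivalent filter using the class_ keyword argument.
same = [x.get_text() for x in soup.find_all("div", class_="fm_linkeSpalte")]

# Join the pieces if a single string is wanted.
combined = "\n".join(texts)
print(combined)
```

The result is a plain list of strings, so any separator can be chosen when joining.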


answered Dec 05 '20 at 06:10

Muhtadi


score 0 · Answer 3 · answered Jul 25 '19 at 16:14

I would close this question as a duplicate and link to another one I found that answers it, but I don't think I have the reputation needed to moderate. So...

Original Answer

Code for this:

    for el in soup.findAll('div', attrs={'class': 'fm_linkeSpalte'}):
        print ''.join(el.findAll(text=True))

If a mod wants to close this question that would be helpful.
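
For comparison, here is a Python 3 sketch of the same approach using the current method and argument names: `find_all` is the bs4 spelling of `findAll`, and `string=True` is the newer name for `text=True`. The HTML string is invented for the example. Joining all the text nodes of an element this way gives the same result as calling `get_text()` on it.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; only the class name comes from the question.
html = '<div class="fm_linkeSpalte">Hello <b>big</b> world</div>'
soup = BeautifulSoup(html, "html.parser")

for el in soup.find_all("div", attrs={"class": "fm_linkeSpalte"}):
    # find_all(string=True) yields every text node under the element.
    joined = "".join(el.find_all(string=True))
    print(joined)
```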

corund · Answer 4 · 2023-03-23T13:11:25.623

0

    zeug = BeautifulSoup(str(soup.find_all('div', attrs={'class': 'fm_linkeSpalte'})), features="lxml").get_text()

    # Explanation:
    type(soup)  # "soup" is a bs4.BeautifulSoup
    zh1 = soup.find_all('div', attrs={'class': 'fm_linkeSpalte'})
    type(zh1)  # "zh1" is a bs4.element.ResultSet
    # We need "zh1" as a bs4.BeautifulSoup object.
    # Converting it directly is not possible:
    # zh2 = BeautifulSoup(zh1)  # this line does not work
    # So first convert "zh1" to a string:
    zh2 = str(zh1)
    # Then parse that string into a bs4.BeautifulSoup:
    zh3 = BeautifulSoup(zh2, features="lxml")
    # The last step is:
    zeug = zh3.get_text()
    # All of these steps can be combined into one line; see the first line.
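
One caveat with this round trip is worth checking: a ResultSet is a list, so `str()` on it produces the list's printed form, and the surrounding brackets and comma separators end up in the extracted text. The sketch below compares the round trip with joining each element's text directly. The HTML is made up, and `html.parser` is used instead of `lxml` only so the example needs no extra parser.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; only the class name comes from the question.
html = '<div class="fm_linkeSpalte">A</div><div class="fm_linkeSpalte">B</div>'
soup = BeautifulSoup(html, "html.parser")
result = soup.find_all("div", attrs={"class": "fm_linkeSpalte"})

# Round trip: stringify the whole ResultSet, re-parse it, then extract text.
round_trip = BeautifulSoup(str(result), "html.parser").get_text()

# Direct: ask each element for its text and join the pieces.
direct = "".join(el.get_text() for el in result)

print(repr(round_trip))
print(repr(direct))
```

If the extra punctuation is unwanted, the direct version avoids it and also skips the second parse.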

edited Mar 23 '23 at 13:11

answered Mar 23 '23 at 10:42


