Finding multiple attributes within the span tag in Python

Question

There are two values that I want to scrape from a website. They are present in the following tags:

<span class="sp starBig">4.1</span>
<span class="sp starGryB">2.9</span>

I need the values inside the spans with the classes sp starBig and sp starGryB.

The findAll expression that I am using is:

soup.findAll('span', {'class': ['sp starGryB', 'sp starBig']})

The code gets executed without any errors yet no results get displayed.

@skyline75489 Sorry, I am not sure which version it is. How do I find out? I am a newbie. – RDPD Apr 26 '15 at 13:54
@skyline75489 Yes, I need the values of sp starBig and sp starGryB. I am able to get those values when I use either sp starBig or sp starGryB alone, but not when I use both. – RDPD Apr 26 '15 at 13:56

famousgarkin · Accepted Answer · 2015-04-26T14:42:40.553

10

As per the docs, assuming Beautiful Soup 4, matching for multiple CSS classes with strings like 'sp starGryB' is brittle and should not be done:

soup.find_all('span', {'class': 'sp starGryB'})
# [<span class="sp starGryB">2.9</span>]
soup.find_all('span', {'class': 'starGryB sp'})
# []

CSS selectors should be used instead, like so:

soup.select('span.sp.starGryB')
# [<span class="sp starGryB">2.9</span>]
soup.select('span.starGryB.sp')
# [<span class="sp starGryB">2.9</span>]

In your case:

items = soup.select('span.sp.starGryB') + soup.select('span.sp.starBig')

or something more sophisticated like:

items = [i for s in ['span.sp.starGryB', 'span.sp.starBig'] for i in soup.select(s)]
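To get the actual values rather than the tags, the text of each matched span can be read afterwards. Below is a minimal, self-contained sketch, assuming Beautiful Soup 4 with the built-in 'html.parser'; the sample markup is taken from the question, and the extra third span and the variable names are only there for illustration:

```python
from bs4 import BeautifulSoup

# Sample markup from the question, plus one unrelated span (illustrative only)
html = """
<span class="sp starBig">4.1</span>
<span class="sp starGryB">2.9</span>
<span class="other">ignored</span>
"""

soup = BeautifulSoup(html, "html.parser")

# Run each CSS selector in turn and collect every matching tag
selectors = ["span.sp.starBig", "span.sp.starGryB"]
items = [tag for sel in selectors for tag in soup.select(sel)]

# Pull the text out of each tag, stripping surrounding whitespace
values = [tag.get_text(strip=True) for tag in items]
print(values)  # ['4.1', '2.9']
```

Note that the results come back grouped by selector, in the order the selectors are listed, not necessarily in document order.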

edited Apr 26 '15 at 14:42

answered Apr 26 '15 at 13:23

famousgarkin


items = [i for s in ['span.sp.starGryB', 'span.sp.starBig'] for i in soup.select(s): try: print(i.string) except KeyError: pass – RDPD Apr 26 '15 at 14:00
items = soup.select('span.sp.starGryB') + soup.select('span.sp.starBig') is working. – RDPD Apr 26 '15 at 14:07
@Dixon The second option is just using a [list comprehension](https://docs.python.org/2/tutorial/datastructures.html#list-comprehensions), the expression inside and including `[]`, not a standard for loop. Removed the line split to hopefully improve clarity. – famousgarkin Apr 26 '15 at 14:42

score 2 · Answer 2 · answered Apr 26 '15 at 13:14

There is probably a better way, but it is eluding me at present. It can be done with CSS selectors like this:

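import bs4

html = '''<span class="sp starBig">4.1</span>
          <span class="sp starGryB">2.9</span>
          <span class="sp starBig">22</span>'''

# Name the parser explicitly to avoid bs4's "no parser specified" warning
soup = bs4.BeautifulSoup(html, 'html.parser')

# Run each selector and collect all matching tags
selectors = ['span.sp.starBig', 'span.sp.starGryB']
result = []
for s in selectors:
    result.extend(soup.select(s))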

score 0 · Answer 3 · answered Oct 08 '21 at 02:17

0

soup.findAll('span', {'class': ['sp starGryB', 'sp starBig']}) is helpful; it works very well for me.

answered Oct 08 '21 at 02:17

eslam kadry


Hi, how is this different from the code in the question? – no ai please Oct 08 '21 at 02:19
As it’s currently written, your answer is unclear. Please [edit] to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Oct 08 '21 at 03:35
