2

I have been trying to access a nested div and its contents, but without success. I want to select the divs with class 'box coursebox'. (The HTML source of the relevant section of the page was posted as an image.)

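from bs4 import BeautifulSoup

# res is the already-open HTTP response for the page (not shown in the question)
response = res.read()
soup = BeautifulSoup(response, "html.parser")
div = soup.find_all('div', attrs={'class':'box coursebox'})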

The code above returns an empty list, when there should be 8 matching elements. find_all calls before this line work as expected.

Thanks for helping!

Nils

2 Answers

1

In the case of attributes having more than one value, Beautiful Soup puts all the values into a list. In your code, you need to take this into account when you're doing your lookup.

Perhaps something like this?

div = soup.find_all('div', class_="box coursebox")

Refer to this section of Beautiful Soup's documentation for more information on multi-valued attributes, and this section for details on looking elements up by class.
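To illustrate how multi-valued class lookups behave, here is a minimal sketch. The HTML is made up for the example and is not the page from the question. It assumes that a class_ string containing a space is compared against the whole class attribute, while a CSS selector matches any element carrying both classes, whatever their order and whatever extra classes are present.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the page in the question.
html = """
<div class="box coursebox">A</div>
<div class="coursebox box">B</div>
<div class="box coursebox clearfix">C</div>
<div class="box">D</div>
"""
soup = BeautifulSoup(html, "html.parser")

# A string with a space is matched against the full class attribute value,
# so a different order or an extra class prevents a match.
exact = soup.find_all("div", class_="box coursebox")

# A CSS selector asks for divs that carry both classes, in any order.
both = soup.select("div.box.coursebox")

exact_text = [d.get_text() for d in exact]
both_text = [d.get_text() for d in both]
print(exact_text, both_text)
```

If the divs on the real page carry additional classes, the select() form is probably the one to try first.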

Also, please don't post source code as an image.

Erik
  • Sorry for the image, I won't do it next time. Also, I have done exactly what you suggested in the source code above, please have a look at it. The image is of the required HTML of the web page, while the code written for scraping is below it. – Vaibhav Kulshrestha Feb 08 '17 at 15:26
0

change:

soup = BeautifulSoup(response, "html.parser")   

to:

soup = BeautifulSoup(response, "lxml")

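html.parser can be less tolerant of malformed HTML than lxml, so switching parsers may help. Note that lxml is a separate package and has to be installed first (for example with pip install lxml).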
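If lxml is not installed, Beautiful Soup raises FeatureNotFound when asked for it. A minimal sketch of a fallback follows. make_soup is a hypothetical helper name and the sample markup is invented for the example.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup):
    """Parse with lxml when it is available, otherwise use html.parser."""
    try:
        return BeautifulSoup(markup, "lxml")
    except FeatureNotFound:
        # lxml is not installed; fall back to the built-in parser.
        return BeautifulSoup(markup, "html.parser")


# Invented sample markup, only to show the helper in use.
soup = make_soup('<div class="box coursebox">A</div>')
found = soup.find_all("div", class_="coursebox")
print(len(found))
```

This only hides the missing dependency. If the built-in parser is what mangles the page, installing lxml is still the actual fix.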

宏杰李
  • FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library? – Newbielp Aug 08 '23 at 08:52