I have a web page that is set up like this:

//a bunch of container divs....

            <a class="food cat2 isotope-item" href="#" style="position: absolute; left: 45px; top: 0px;">
              <div class="background"></div>
              <div class="image">
                <img src="/assets/score-images/cereal2.png" alt="">
              </div>
              <div class="score">1148</div>
              <div class="name">Cereal with Banana</div>
            </a>

            <a class="food cat1 isotope-item" href="#" style="position: absolute; left: 215px; top: 0px;">
              <div class="background"></div>
              <div class="image">
                <img src="/assets/score-images/burrito-all.png" alt="">
              </div>
              <div class="score">2257</div>
              <div class="name">Beef &amp; Cheese Burrito</div>
            </a>

   //hundreds more a tags....

          </div>

I'm running this code to extract the name and score from each "a" tag.

    import requests
    from bs4 import BeautifulSoup

    # Download the page and parse it
    page = requests.get('http://www.eatlowcarbon.org/food-scores')
    soup = BeautifulSoup(page.content, 'html.parser')

    print('HEllO')
    foodDict = {}
    aTag = soup.findAll('a')  # every <a> tag on the page

    for tag in aTag:
        print('HELLO 2')
        name = tag.find("div", {"class": "name"}).text
        score = tag.find("div", {"class": "score"}).text
        foodDict[name] = score
        print('hello')

The first two print statements execute, so I know the for loop is entered at least once. However, I get this error:

File "scrapeRecipe.py", line 40, in <module>
    name = tag.find("div", {"class": "name"}).text
AttributeError: 'NoneType' object has no attribute 'text'

From this post, I'm assuming that my code doesn't find any div with a class of "name" (or "score", for that matter). I'm completely new to Python. Does anyone have any advice?

maddie

1 Answer

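The problem is not with your tag.find('div', ...) but with your soup.findAll('a'). You are pulling every a tag on the page, including those that lack the child div tags you are trying to read. For those tags, find returns None, and accessing .text on None raises the AttributeError.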
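
To see the failure in isolation, here is a small sketch that runs against a hypothetical snippet of HTML instead of the live page. The markup and the names html, results and name_div are made up for illustration; the first link stands in for any a tag on the real page that has no "name" div.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: one plain link and one food item.
html = """
<a href="/about">About</a>
<a class="food cat2" href="#">
  <div class="score">1148</div>
  <div class="name">Cereal with Banana</div>
</a>
"""
soup = BeautifulSoup(html, "html.parser")

results = []
for tag in soup.find_all("a"):
    name_div = tag.find("div", {"class": "name"})
    # find() returns None when the tag has no matching child,
    # so check before touching .text
    results.append(None if name_div is None else name_div.text)

print(results)
```

The first entry comes back as None, which is the value your loop was calling .text on.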

Judging by what you need, you should filter by class in your findAll call as well:

    aTag = soup.findAll('a', {'class': 'food'})
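
Putting the class filter together with the loop from the question might look roughly like the sketch below. It parses a hypothetical inline sample modeled on the markup in the question rather than fetching the real page, and the None check and the int conversion are my additions, not something the question asked for. It uses find_all, the current spelling of findAll in bs4.

```python
from bs4 import BeautifulSoup

# Hypothetical sample modeled on the markup in the question.
html = """
<div>
  <a href="/home">Home</a>
  <a class="food cat2 isotope-item" href="#">
    <div class="score">1148</div>
    <div class="name">Cereal with Banana</div>
  </a>
  <a class="food cat1 isotope-item" href="#">
    <div class="score">2257</div>
    <div class="name">Beef &amp; Cheese Burrito</div>
  </a>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

food_dict = {}
# Only <a> tags whose class list includes "food" are returned.
for tag in soup.find_all("a", {"class": "food"}):
    name = tag.find("div", {"class": "name"})
    score = tag.find("div", {"class": "score"})
    if name is None or score is None:
        continue  # skip any food tag missing one of the two divs
    food_dict[name.text] = int(score.text)

print(food_dict)
```

The plain "Home" link is never visited, so the loop only sees tags that have the divs you want.
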
Wondercricket