I'm trying to collect the plain-text business title from the following HTML:

<div class = "business-detail-text> <h1 class = "business-title" style="position:relative;" itemprop="name">H&H Construction Co.</h1>

What is the best way to do this? The style and itemprop attributes are where I get stuck. I know I can use soup.select, but I've had no luck so far.

Here is my code so far:

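import requests
from bs4 import BeautifulSoup

def bbb_profiles(profile_urls):
    # Download the profile page and parse it
    sauce_code = requests.get(profile_urls)
    plain_text = sauce_code.text
    soup = BeautifulSoup(plain_text, "html.parser")
    # Print the text of every <h1 class="business-title"> element
    for profile_info in soup.find_all("h1", {"class": "business-title"}):
        print(profile_info.string)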
n0de
  • Does this answer your question? [jquery-like HTML parsing in Python?](https://stackoverflow.com/questions/3051295/jquery-like-html-parsing-in-python) – imbr Jun 17 '20 at 18:02

1 Answer

Is this what you need?

>>> from bs4 import BeautifulSoup
>>> txt='''<div class = "business-detail-text">
           <h1 class = "business-title" style="position:relative;" itemprop="name">H&H Construction Co.</h1></div>'''
>>> soup = BeautifulSoup(txt, "html.parser")
>>> soup.find_all('h1', 'business-title')
[<h1 class="business-title" itemprop="name" style="position:relative;">H&amp;H; Construction Co.</h1>]
>>> soup.find_all('h1', 'business-title')[0].text
u'H&H; Construction Co.'

I see your HTML is missing the closing " after "business-detail-text and the closing </div> at the very end.
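
Since the question mentions soup.select, here is a minimal sketch of the same lookup with a CSS selector. It assumes well-formed markup (closing quote and </div> present, and the ampersand escaped as &amp;); the helper name business_titles and the sample string are made up for illustration. The style attribute can simply be ignored, and itemprop can be matched with an attribute selector if you want to be more specific than the class alone.

```python
from bs4 import BeautifulSoup

# Sample markup (assumption: a cleaned-up version of the snippet in the question)
html = '''<div class="business-detail-text">
<h1 class="business-title" style="position:relative;" itemprop="name">H&amp;H Construction Co.</h1></div>'''

def business_titles(markup):
    """Return the text of every <h1 class="business-title" itemprop="name">."""
    soup = BeautifulSoup(markup, "html.parser")
    # 'h1.business-title' matches on the class; [itemprop="name"] narrows it
    # further. The style attribute does not need to appear in the selector.
    matches = soup.select('h1.business-title[itemprop="name"]')
    return [h1.get_text(strip=True) for h1 in matches]

titles = business_titles(html)
print(titles)  # ['H&H Construction Co.']
```

If only one title is expected per page, soup.select_one with the same selector returns the first match (or None) instead of a list.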

Paweł Kordowski