Supposing I have an html string like this:
<html>
<div id="d1">
Text 1
</div>
<div id="d2">
Text 2
<a href="http://my.url/">a url</a>
Text 2 continue
</div>
<div id="d3">
Text 3
</div>
</html>
I want to extract the content of d2
that is NOT wrapped by other tags, skipping a url
. In other words I want to get such result:
Text 2
Text 2 continue
Is there a way to do it with BeautifulSoup?
I tried this, but it is not correct:
soup = BeautifulSoup(html_doc, 'html.parser')
s = soup.find(id='d2').text
print(s)