I've tried different ways to scrape Answer1 and Answer2 from a website using BeautifulSoup, urllib and Selenium, but without success. Here's the simplified version:
<div class="div1">
<p class="p1"></p>
<p class="p2">
<span>Question1</span>
<strong>Answer1</strong>
<br>
<span>Question2</span>
<strong>Answer2</strong>
<br>
In Selenium, I try to find Question1, then go to its parent and scrape Answer1. Below is the code I use, although it's not correct.
browser.find_elements_by_xpath("//span[contains(text(), 'Question1')]/parent::p/following::strong")
I believe BeautifulSoup is more efficient than Selenium in this case. How would you do this in BeautifulSoup? Thanks!
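For reference, here is a minimal sketch of what I'm after in BeautifulSoup. It assumes the real markup matches the simplified snippet above (I added the closing tags so it parses cleanly) and that each answer is the next strong sibling of its question span. The helper name find_answer is just something I made up for illustration.

```python
from bs4 import BeautifulSoup

# Simplified markup from the question, with closing tags added (assumption).
html = """
<div class="div1">
<p class="p1"></p>
<p class="p2">
<span>Question1</span>
<strong>Answer1</strong>
<br>
<span>Question2</span>
<strong>Answer2</strong>
<br>
</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def find_answer(soup, question):
    """Hypothetical helper: return the text of the <strong> that follows
    the <span> whose text is exactly `question`, or None if not found."""
    span = soup.find("span", string=question)
    if span is None:
        return None
    # The answer is a sibling of the span, not a child of it.
    strong = span.find_next_sibling("strong")
    return strong.get_text(strip=True) if strong else None


answers = {q: find_answer(soup, q) for q in ("Question1", "Question2")}
print(answers)  # {'Question1': 'Answer1', 'Question2': 'Answer2'}
```

As far as I can tell, my XPath fails because the following:: axis skips descendants of the context node, so after stepping up to the parent p, the strong elements inside that p are never matched. Something like //span[contains(text(), 'Question1')]/following-sibling::strong[1] would probably be the Selenium equivalent of the sketch above.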
Edit: @Juan's solution is perfect for my example. However, I realized it's inapplicable to the website https://finance.yahoo.com/quote/AAPL?p=AAPL . Can anyone shed some light on parsing Consumer Goods and Electronic Equipment from there? And would it be better to use urllib.request instead? Thank you.