Based on the above answer, I have the following code:
from selenium.webdriver.support.ui import WebDriverWait as wait
from selenium import webdriver

chrome_path = r"C:\scrape\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
driver.get("https://website.com/")

# Get the divs currently on the page
messages = driver.find_elements_by_class_name('div_i_am_targeting')

# Print all existing messages
for message in messages:
    print(message.text)

while True:
    try:
        # Wait up to a minute for a new message to appear
        wait(driver, 60).until(lambda driver: driver.find_elements_by_class_name('div_i_am_targeting') != messages)
        # Print only the new messages
        for message in [m.text for m in driver.find_elements_by_class_name('div_i_am_targeting') if m not in messages]:
            print(message)
        # Update the list of messages
        messages = driver.find_elements_by_class_name('div_i_am_targeting')
    except:
        # Break the loop if no new messages appeared within a minute
        print('No new messages')
        break
This works fine and captures every div matching the class div_i_am_targeting as it appears on the page.
The divs on this HTML page are generated dynamically and one div appears about once every second.
The actual structure on the page is like this:
<div class="div_i_am_targeting">
...
...
</div>
<div class="div_i_am_targeting">
...
...
</div>
<div class="div_i_am_targeting">
...
...
</div>
<div class="some_other_div">
...
...
</div>
<div class="div_i_am_targeting">
...
...
</div>
<div class="yet_another_div">
...
...
</div>
<div class="div_i_am_targeting">
...
...
</div>
So, in the dynamically created content, other divs appear between the divs I am currently targeting.
The frequency of divs on the page is variable.
I couldn't find any related questions here, or examples in the documentation.
How can I modify the above code so that it scrapes the value of more than one div, e.g. if I want to scrape all instances of div_i_am_targeting and some_other_div in the above example?
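To make the goal concrete, here is a rough sketch of the direction I have in mind. It is untested against the real page and rests on assumptions: the helper names build_class_selector and find_by_classes are made up for illustration, and I am assuming a CSS selector group (classes joined with commas) passed to find_elements_by_css_selector matches elements of either class, returned in page order as far as I understand.

```python
def build_class_selector(class_names):
    """Join class names into a CSS selector group, e.g. '.a, .b'."""
    return ", ".join("." + name for name in class_names)


def find_by_classes(driver, class_names):
    """Hypothetical helper: return elements matching any of the given classes.

    Assumes `driver` exposes the Selenium 3 style
    find_elements_by_css_selector method, like the driver in my code above.
    """
    return driver.find_elements_by_css_selector(build_class_selector(class_names))


# The two classes from the example structure above
TARGET_CLASSES = ["div_i_am_targeting", "some_other_div"]
print(build_class_selector(TARGET_CLASSES))
# prints: .div_i_am_targeting, .some_other_div
```

If this is a valid approach, I would presumably replace each find_elements_by_class_name('div_i_am_targeting') call in my loop with find_by_classes(driver, TARGET_CLASSES).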