5

I have a page whose content refreshes itself (via WebSocket), like this one. While the content changes constantly, my Firefox WebDriver only sees the initial content. I could get the fresh content by refreshing the page with

   driver.navigate.refresh()

but this causes unnecessary traffic, and besides, the new content already appears in the Firefox window.

My question is: can I get the fresh HTML, as I can see it in the Firefox window, without reloading the whole page?

ImportanceOfBeingErnest
user92020

1 Answer

4

If the page content changes over time, one option is to check the page source every n seconds. A simple way is to import time, call time.sleep(5) to wait 5 seconds, and then read the page source. You can also put this in a loop: if the page content changes during one of the 5-second intervals, Selenium should return the updated content the next time you check. I haven't tested this, but feel free to check whether it works for you.

EDIT: Added sample code. Make sure that Marionette is properly installed and configured. If you are an Ubuntu user, see my answer here (https://stackoverflow.com/a/39536091/6284629).

# this code would print the source of a page every second
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
import time

# side note, how to get marionette working for firefox:
# https://stackoverflow.com/a/39536091/6284629

# copy the defaults so the shared DesiredCapabilities.FIREFOX dict is not mutated
capabilities = DesiredCapabilities.FIREFOX.copy()
capabilities["marionette"] = True
browser = webdriver.Firefox(capabilities=capabilities)

# load the page
browser.get("http://url-to-the-site.xyz")

while True:
    # print the page source
    print(browser.page_source)
    # wait for one second before looping to print the source again
    time.sleep(1)
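
If you only care about the moments when the content actually changes, the loop above can be reshaped into a small polling helper. This is a rough sketch, not something I have run against a real WebSocket page: `wait_for_change` and its `timeout` and `interval` parameters are names made up for this example, and the helper takes any callable that returns the current source, so with Selenium you would pass something like `lambda: browser.page_source`.

```python
import time
from typing import Callable, Optional


def wait_for_change(
    get_source: Callable[[], str],
    previous: str,
    timeout: float = 10.0,
    interval: float = 1.0,
) -> Optional[str]:
    """Poll get_source() until it differs from `previous`.

    Returns the new source, or None if nothing changed before the timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        # Re-read the source on every pass instead of reusing an old string.
        current = get_source()
        if current != previous:
            return current
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


# Hypothetical usage with the `browser` object created above:
#
#     last = browser.page_source
#     while True:
#         new = wait_for_change(lambda: browser.page_source, last)
#         if new is not None:
#             print(new)
#             last = new
```

Passing a callable rather than the driver keeps the helper independent of Selenium, which also makes it easy to try out with a fake source.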
B B
  • Alright, that is exactly what I want to do. The thing is, when I call driver.page_source or inspect a certain element the content would not change. I.e. the driver saves the initial html once and does not update. So the point is how to get the updated source? – user92020 Dec 11 '16 at 18:19
  • You're probably reusing the variable you stored the page_source in; that's why it's showing the same value. Reassign the page_source to the same variable after waiting, or just call `browser.page_source` again to get the updated source for the page. I've edited my answer to show a working example. – B B Dec 12 '16 at 01:22
  • Great, I've added the "marionette" option and now it works just as intended! Thanks a lot! – user92020 Dec 12 '16 at 14:32
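
To illustrate the point in the comments about reusing a stored variable: a string saved from `page_source` is a one-time snapshot and never updates on its own. The sketch below uses `FakeBrowser`, a hypothetical stand-in invented for this example (it is not a Selenium class), whose `page_source` property returns different text on each read to mimic a page that changed between reads.

```python
class FakeBrowser:
    """Hypothetical stand-in for a driver whose page keeps changing."""

    def __init__(self) -> None:
        self._version = 0

    @property
    def page_source(self) -> str:
        # Each access returns a new string, mimicking a page that changed
        # since the previous read.
        self._version += 1
        return f"<html><body>update {self._version}</body></html>"


browser = FakeBrowser()

source = browser.page_source  # snapshot taken once
stale = source                # reusing the variable gives the old text
fresh = browser.page_source   # reading the property again gives new text

print(stale == fresh)
```

With a real driver the same pattern applies: read `browser.page_source` again after waiting instead of printing the variable saved earlier.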