
I'm using Capybara for web crawling, and I have the following challenge: after I interact with some DOM elements (e.g. click a button), I want to know (or make a good guess) if a new page is loading and if any AJAX requests are taking place. Because I'm crawling sites I don't necessarily control, I don't have access to the server-side state or know what to expect (i.e. it's not a matter of waiting for the page to load, it's a matter of knowing if it's happening at all).

The best case scenario would be if I could query a list of recent/ongoing/completed HTTP requests and get data from them.

Alternatively it would be nice if I could at least find out if the page is reloading/has reloaded since my last interaction.

At the very least I could check whether the URL of the page I'm on matches the URL I was on before, but this misses AJAX requests and same-URL refreshes, and it doesn't wait for a page load to finish. I'm looking for something better than this.
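
For concreteness, the naive URL check I have in mind looks roughly like this (the '#some-button' selector is just a placeholder):

url_before = page.current_url
find('#some-button').click   # placeholder interaction
url_after = page.current_url

# Only detects navigation to a different URL; misses AJAX and same-URL
# reloads, and doesn't wait for any new page to finish loading.
navigated = (url_before != url_after)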

I'm looking for something that works with Selenium. For the non-AJAX case I would like it to work with the webkit driver too. Any suggestions?

bchurchill
  • possible duplicate of [Cucumber: Wait for ajax:success](http://stackoverflow.com/questions/7286254/cucumber-wait-for-ajaxsuccess) – Brad Werth May 10 '13 at 03:27
  • @BradWerth it doesn't sound like a dup to me – Andrei Botalov May 10 '13 at 22:23
  • See [Is there any way to log http requests/responses using Selenium Webdriver (firefox)?](http://stackoverflow.com/q/12034013/841064) You can try [BrowserMob-proxy](https://github.com/webmetrics/browsermob-proxy) – Andrei Botalov May 10 '13 at 22:30
  • OK, so it sounds like the "best case scenario" isn't possible without external help. What about just checking to see if it's busy loading the page? I know it's possible to ask it to wait until a page reloads, or until some UI element changes, but that doesn't seem helpful unless you know what to expect. – bchurchill May 11 '13 at 01:17

1 Answer


Selenium doesn't provide an API to monitor HTTP traffic or to check whether a page is loading. If you need to log HTTP requests, you should use a proxy like BrowserMob-proxy.
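
If you go the proxy route, here's a rough sketch using the browsermob-proxy Ruby gem together with Capybara's Selenium driver (the binary path, the :selenium_proxied driver name and the HAR name are placeholders, and I haven't verified this end to end):

require 'capybara'
require 'selenium-webdriver'
require 'browsermob/proxy'

# Start a local BrowserMob Proxy instance (path is a placeholder)
server = BrowserMob::Proxy::Server.new('/path/to/browsermob-proxy/bin/browsermob-proxy')
server.start
proxy = server.create_proxy

# Register a Selenium driver whose Firefox profile routes traffic through the proxy
Capybara.register_driver :selenium_proxied do |app|
  profile = Selenium::WebDriver::Firefox::Profile.new
  profile.proxy = proxy.selenium_proxy
  Capybara::Selenium::Driver.new(app, :browser => :firefox, :profile => profile)
end
Capybara.default_driver = :selenium_proxied

proxy.new_har('interaction')          # start recording a HAR
# ... visit pages and click things with Capybara here ...
proxy.har.entries.each do |entry|     # every HTTP request the browser made
  puts entry.request.url
end
proxy.close

After each interaction you should be able to fetch proxy.har again and diff the entries to see whether the click triggered any requests (AJAX or full page loads).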

It may help to know that Selenium tries to block while a page is loading, but it doesn't do so in all circumstances (it's best to test whether Selenium blocks in yours).

If Selenium blocks in your circumstances, you can measure the time it takes to click a link. If it takes more than, say, 0.1 seconds, the page was probably being loaded after the click.

require 'benchmark'
time = Benchmark.realtime { click_link 'Some link' }
if time > 0.1
  # Looks like page was being loaded after click
end
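
If you want to reuse that heuristic for arbitrary interactions, you could wrap it in a small helper (the name and threshold below are arbitrary):

require 'benchmark'

# Hypothetical helper: runs the given interaction and reports whether a
# page load appeared to follow, based on the same timing heuristic.
def looks_like_page_load?(threshold = 0.1)
  Benchmark.realtime { yield } > threshold
end

if looks_like_page_load? { click_link 'Some link' }
  # Looks like page was being loaded after click
end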

I don't know whether Poltergeist blocks or not.

Andrei Botalov