I am using Selenium WebDriver to crawl a web site that has infinite scroll (this site is only an example; I will be crawling other web sites too).
Problem statement:
Using Selenium WebDriver, scroll down an infinite-scroll page until the content stops loading.
My approach: this is what I currently do.
Step 1: Scroll to the page bottom
JavascriptExecutor js = (JavascriptExecutor) driver;
// Scroll to the largest known document height, i.e. the current bottom of the page.
js.executeScript("window.scrollTo(0, Math.max(document.documentElement.scrollHeight," +
"document.body.scrollHeight, document.documentElement.clientHeight));");
Then I wait for some time to let the Ajax request complete, like this:
Step 2: Explicitly wait for the Ajax request to finish
Thread.sleep(1000);
Then I run another script to check whether the page can still be scrolled.
Step 3: Check if the page is scrollable
// document.height is obsolete, so document.body.clientHeight is used instead;
// refer to https://developer.mozilla.org/en-US/docs/DOM/document.height
// executeScript may return a Long or a Double, so the result is read as a Number.
if (((Number) js.executeScript("return " +
"(document.body.clientHeight - (window.pageYOffset + window.innerHeight))")).doubleValue() > 0)
If the above condition is true, I repeat Steps 1 to 3 until the condition in Step 3 is false.
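For reference, the three steps above can be put together as one loop. This is only a minimal sketch: the class name `InfiniteScroller`, the method `scrollToEnd`, and the parameters `runScript`, `pauseMillis` and `maxRounds` are made up for illustration. `runScript` stands in for `JavascriptExecutor.executeScript`, so the loop itself does not depend on Selenium, and `maxRounds` is an added safety cap that is not part of the approach described above.

```java
import java.util.function.Function;

/** Sketch of the Step 1-3 loop; runScript stands in for JavascriptExecutor.executeScript. */
final class InfiniteScroller {

    // Step 1 script: jump to the current bottom of the page.
    static final String SCROLL_TO_BOTTOM =
        "window.scrollTo(0, Math.max(document.documentElement.scrollHeight,"
        + " document.body.scrollHeight, document.documentElement.clientHeight));";

    // Step 3 script: number of pixels still below the viewport.
    static final String REMAINING_PIXELS =
        "return document.body.clientHeight - (window.pageYOffset + window.innerHeight);";

    /**
     * Repeats Steps 1-3 until nothing is left to scroll or maxRounds is reached.
     * Returns the number of rounds that were executed.
     */
    static int scrollToEnd(Function<String, Object> runScript, long pauseMillis, int maxRounds) {
        int rounds = 0;
        while (rounds < maxRounds) {
            runScript.apply(SCROLL_TO_BOTTOM);                     // Step 1
            try {
                Thread.sleep(pauseMillis);                         // Step 2 (fixed wait)
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();                // preserve the interrupt flag
                break;
            }
            rounds++;
            Object remaining = runScript.apply(REMAINING_PIXELS);  // Step 3
            // The script result may arrive as a Long or a Double, so read it as a Number.
            if (!(remaining instanceof Number) || ((Number) remaining).doubleValue() <= 0) {
                break;
            }
        }
        return rounds;
    }
}
```

With a real driver this could be called as `InfiniteScroller.scrollToEnd(script -> js.executeScript(script), 1000, 50)`, where 50 is an arbitrary cap.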
The Problem:
I do not want to use Thread.sleep(1000); in Step 2. Instead, I would like to check with JavaScript whether the background Ajax request has finished, and then scroll down further if the condition in Step 3 is true.
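One way to picture what would replace the fixed sleep is a small polling helper that re-checks a condition until it holds or a timeout expires. This is a rough sketch only: `Poller` and `waitUntil` are hypothetical names, and the condition that decides whether the Ajax request is over is exactly the part that is still missing. Selenium's own `WebDriverWait` provides a similar poll-with-timeout pattern.

```java
import java.util.function.BooleanSupplier;

/** Sketch of a poll-with-timeout helper that could stand in for a fixed Thread.sleep. */
final class Poller {

    /**
     * Checks the condition every pollMillis until it is true or timeoutMillis has elapsed.
     * Returns true if the condition became true, false on timeout or interruption.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;                        // timed out
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // preserve the interrupt flag
                return false;
            }
        }
    }
}
```

The condition could be any check that runs through `executeScript`, so the wait ends as soon as the check passes instead of always taking a full second.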
PS: I am not the developer of the page, so I do not have access to the code running on it; I can only inject JavaScript (as in Steps 1 and 3) into the web page. Also, I have to write generic logic that works for any web site that makes Ajax requests during infinite scroll.
I would be grateful if someone could spare some time here!
EDIT: OK, after struggling for 2 days, I have figured out that the pages I am crawling with Selenium WebDriver can use any of these JavaScript libraries, and I will have to poll differently depending on the library. For example, for a web application that uses the jQuery API, I can wait for
(Long)((JavascriptExecutor)driver).executeScript("return jQuery.active")
to return a zero.
Likewise, if the web application uses the Prototype JavaScript library, I will have to wait for
(Long)((JavascriptExecutor)driver).executeScript("return Ajax.activeRequestCount")
to return a zero.
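The two checks above can be folded into one injected script that only reads a counter when the corresponding library is present on the page. This is a sketch under assumptions: `AjaxActivity`, `PENDING_REQUESTS` and `isIdle` are hypothetical names, and `runScript` again stands in for `JavascriptExecutor.executeScript`. A page that uses neither library, or that makes requests outside of them, would always be reported as idle, so this does not solve the generic case.

```java
import java.util.function.Function;

/** Sketch: one script covering the jQuery and Prototype counters mentioned above. */
final class AjaxActivity {

    // Sums the active-request counters of whichever of the two libraries is loaded.
    static final String PENDING_REQUESTS =
        "var n = 0;"
        + "if (window.jQuery && typeof jQuery.active === 'number') { n += jQuery.active; }"
        + "if (window.Ajax && typeof Ajax.activeRequestCount === 'number') {"
        + " n += Ajax.activeRequestCount; }"
        + "return n;";

    /** Returns true only when the script reports zero pending requests. */
    static boolean isIdle(Function<String, Object> runScript) {
        Object pending = runScript.apply(PENDING_REQUESTS);
        // A non-numeric result (for example null) is treated as "not known to be idle".
        return pending instanceof Number && ((Number) pending).longValue() == 0;
    }
}
```

With a real driver the call would look like `AjaxActivity.isIdle(script -> js.executeScript(script))`.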
Now, the problem is how to write generic code that can handle most of the available JavaScript libraries.
Problems I am facing in implementing this:
1. How do I find out which JavaScript library the web application uses (with Selenium WebDriver in Java), so that I can write the corresponding wait methods? Currently, I am using this
2. This way I would have to write as many as 77 methods, one for each JavaScript library, so I also need a better way to handle this scenario.
In short, I need to find out, through Selenium WebDriver's Java implementation, whether the browser is making any call (Ajax or plain), with or without a JavaScript library.
PS: There are add-ons that detect the JavaScript library being used: Chrome's JavaScript Lib detector and Firefox's JavaScript Library detector.