3

I am trying to get this code running to scroll a web page that loads more results as you scroll (lazy pagination). It works like a charm with the Firefox driver, but when I use PhantomJS it doesn't work and goes into an infinite loop.

public class Drivers {

public WebDriver phJS()
{
    File phantomjs = Phanbedder.unpack(); //Phanbedder to the rescue!

    String[] phantomArgs = new  String[] {
        "--webdriver-loglevel=NONE"
    };

    DesiredCapabilities dcaps = new DesiredCapabilities();
    dcaps.setCapability(PhantomJSDriverService.PHANTOMJS_EXECUTABLE_PATH_PROPERTY, phantomjs.getAbsolutePath());

    dcaps.setCapability( "phantomjs.cli.args", phantomArgs);
    WebDriver driver = new PhantomJSDriver(dcaps);
    phantomjs.delete();
    return driver;

}
public static void main(String args[]) throws IOException
{

    WebDriver wd=new FirefoxDriver();// Does Not work with new Drivers().phJS()

    wd.get("http://www.snapdeal.com/products/mobiles-mobile-phones/filters/Form_s~Smartphones#plrty|Brand:HTC|Ram_s:1%20GB^ 2%20GB^ 3%20GB^ 512%20MB%20and%20Below|Form_s:Smartphones|");
    wd= new PageScroll().scrollToBottom(wd);
    List<WebElement> wele = wd.findElements(By.xpath("//*[@class=' product-image ']/a"));
    for(WebElement we:wele)
    {
         System.out.println(we.getAttribute("href"));
    }
     wd.quit();
}

}

Here is the code that does the scrolling:

public class PageScroll {
WebDriver driver;
 public WebDriver scrollToBottom(WebDriver driver) {
     String oldpage="";
     String newpage="";
     do{
         oldpage=driver.getPageSource();
        ((JavascriptExecutor) driver)
                .executeScript("window.scrollTo(0, document.body.scrollHeight)");

         newpage=driver.getPageSource();
        System.out.println(oldpage.equals(newpage));
     }while(!oldpage.equals(newpage));
        return driver;
    }

}

When I use PhantomJS, it goes into an infinite loop in the do-while, and I do not understand why. Is it because the AJAX script is not executed? If so, it should exit the loop. And if it is scrolling, why doesn't it stop as it does with the Firefox driver?

Afroz Shaikh
  • What PhantomJS version do you use? Since this is an https site, the issue may be related to the poodle vulnerability. Have you taken a screenshot or looked at the page source to make sure that something is there? – Artjom B. May 26 '15 at 13:04
  • phantomjsdriver-1.0.1.jar,phanbedder-1.9.8-1.0.0.jar and selenium-java-2.45.0.jar – Afroz Shaikh May 27 '15 at 09:12

2 Answers

1

Got the answer: I added an explicit wait (a pause via wait(time)) after each scroll, and it works fine.

public synchronized WebDriver scrollToBottom(WebDriver driver, WebElement element,int time) throws InterruptedException {
     String oldpage="";
     String newpage="";


     do{
         oldpage=driver.getPageSource();
         ((JavascriptExecutor) driver)
                .executeScript("window.scrollTo(0, (document.body.scrollHeight))");
         this.wait(time);
         newpage=driver.getPageSource();
    }while(!oldpage.equals(newpage));
        return driver;
    }
Afroz Shaikh
  • While this may seem to work, it's brittle. If the web site changes via a JavaScript timer, then your code will break again. Usually, it's better to identify an element or a change that must happen after scrolling and detect this specific change instead of blindly hoping that the DOM is still the same. – Aaron Digulla May 27 '15 at 09:57
  • I am trying to develop a generic crawler for eCommerce websites. I have handled click pagination, since all I have to do is click the element as long as it is a link or is present. Here the problem is the div tags, which change from none to displayed to none, and that is getting harder to handle if it is meant to be a generic crawler. – Afroz Shaikh May 27 '15 at 10:03
  • You will need a set of scripts to be able to have special rules per site. That way, you can quickly change the script without having to recompile your code all the time. Also note that crawlers aren't welcome on many commercial sites. Make sure you obey the rules or they will eventually block you. – Aaron Digulla May 27 '15 at 10:07
  • {"errorMessage":"Refused to evaluate a string as JavaScript because 'unsafe-eval' is not an allowed source of script in the following Content Security Policy directive – Marek Bernád Nov 20 '17 at 11:33
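The suggestion in the comments above, to detect a specific change after scrolling instead of comparing whole page sources, could be sketched roughly as follows. This is only an illustration under stated assumptions: the class ScrollUntilStable, its method, and all parameter names are hypothetical and not part of Selenium or of the poster's code. The scroll action and the item counter are passed in as callbacks so that the loop itself has no driver dependency and always terminates.

```java
import java.util.function.IntSupplier;

/**
 * Hypothetical helper: scroll until a specific, observable quantity
 * (for example the number of product links) stops growing.
 */
final class ScrollUntilStable {

    /**
     * @param scroll    action that scrolls the page once
     * @param itemCount returns the current number of items of interest
     * @param maxRounds hard upper bound on scroll attempts, so the loop always ends
     * @param timeoutMs how long to wait for new items after each scroll
     * @param pollMs    pause between two checks of itemCount
     * @return the last item count that was observed
     */
    static int scrollUntilStable(Runnable scroll, IntSupplier itemCount,
                                 int maxRounds, long timeoutMs, long pollMs) {
        int known = itemCount.getAsInt();
        for (int round = 0; round < maxRounds; round++) {
            scroll.run();
            int seen = waitForGrowth(itemCount, known, timeoutMs, pollMs);
            if (seen <= known) {
                return seen; // nothing new arrived within the timeout: assume the bottom was reached
            }
            known = seen;
        }
        return known; // gave up after maxRounds scrolls
    }

    /** Polls itemCount until it exceeds known or the timeout expires, then returns the latest count. */
    private static int waitForGrowth(IntSupplier itemCount, int known,
                                     long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            int current = itemCount.getAsInt();
            if (current > known || System.nanoTime() >= deadline) {
                return current;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
                return current;
            }
        }
    }
}
```

With the driver from the question, one might pass () -> ((JavascriptExecutor) driver).executeScript("window.scrollTo(0, document.body.scrollHeight)") as the scroll action and () -> driver.findElements(By.xpath("//*[@class=' product-image ']/a")).size() as the counter. The timeout and round limit are guesses that would need tuning per site.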
0

LinkedIn is changing the page when you scroll to the bottom, requesting more data. That means you never get the same result after the scrolling.

I'm not sure why you don't see this in Firefox; maybe it handles the scroll event after you call getPageSource(), or getPageSource() returns stale data.

Aaron Digulla
  • Ignore LinkedIn; the URL I am trying is Snapdeal's, which uses lazy pagination. I am getting the proper output in Firefox but not in PhantomJS. – Afroz Shaikh May 27 '15 at 09:46
  • In that case, you will have to run the code and examine the two strings yourself to see what has changed. All I can say from here is that scrolling the page does change the DOM on PhantomJS. This is unexpected but not impossible. – Aaron Digulla May 27 '15 at 09:47
  • Got the answer by calling wait. – Afroz Shaikh May 27 '15 at 09:55
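For the advice above to examine the two strings and see what has changed, a small helper along these lines might be enough to start with. It is a sketch, not part of the original thread: SourceDiff and its methods are hypothetical names, and it only reports the first position where the two page sources diverge, plus a little surrounding text.

```java
/** Hypothetical helper for locating where two page-source strings start to differ. */
final class SourceDiff {

    /** Returns the index of the first differing character, or -1 if the strings are equal. */
    static int firstDifference(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        for (int i = 0; i < limit; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return i;
            }
        }
        // Equal up to the shorter length: they differ only if one string is longer.
        return a.length() == b.length() ? -1 : limit;
    }

    /** Returns up to radius characters on each side of index, clamped to the string bounds. */
    static String context(String s, int index, int radius) {
        int from = Math.min(s.length(), Math.max(0, index - radius));
        int to = Math.max(from, Math.min(s.length(), index + radius));
        return s.substring(from, to);
    }
}
```

Inside the do-while loop from the question, calling firstDifference(oldpage, newpage) and printing context(newpage, index, 80) whenever the result is not -1 would show which part of the DOM keeps changing under PhantomJS.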