I am trying to scrape this page: https://plus.google.com/115016587855962294424/about. Everything works fine, but when I try to click "Show more" to load more reviews, nothing happens. Here is my code:
final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_24);
page = webClient.getPage("https://plus.google.com/115016587855962294424/about");
assertEquals(200,page.getWebResponse().getStatusCode());
assertEquals("OK",page.getWebResponse().getStatusMessage());
System.out.println(page.getWebResponse().getStatusCode());
This is where I click "Show more":
HtmlSpan advancedSearchAn = (HtmlSpan) page.getFirstByXPath("//*[@id=\"115016587855962294424-about-page\"]/div/div[1]/div/div/div[2]/div[3]/span[1]");
page = advancedSearchAn.click();
But nothing happens. I even tried the following:
// webClient.waitForBackgroundJavaScript(10 * 1000);
// webClient.setAjaxController(new NicelyResynchronizingAjaxController());
// webClient.setAjaxController(new AjaxController(){
// @Override
// public boolean processSynchron(HtmlPage page, WebRequest request, boolean async)
// {
// return true;
// }
// });
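What I am really after is "click, then wait until the extra reviews have actually shown up". A minimal sketch of that wait, written without any HtmlUnit dependency so it can be tried in isolation; `PollingWait` and `waitUntil` are hypothetical names of mine, not part of HtmlUnit:

```java
import java.util.function.BooleanSupplier;

public class PollingWait {
    /**
     * Re-checks the condition every intervalMillis until it holds
     * or timeoutMillis has elapsed.
     * Returns true if the condition became true, false on timeout
     * or if the waiting thread was interrupted.
     */
    public static boolean waitUntil(BooleanSupplier condition,
                                    long timeoutMillis,
                                    long intervalMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

In the scraper, the condition would presumably compare the number of review nodes returned by page.getByXPath(...) before and after the click. I have not worked out a reliable XPath for a single review, so that part is left out.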
Any suggestions?
UPDATE:
*I was advised to modify the incoming JavaScript code by subclassing HttpWebConnection and overriding getResponse(), like this:*
new WebConnectionWrapper(webClient) {
public WebResponse getResponse(WebRequest request) throws IOException {
// System.out.println("content");
WebResponse response = super.getResponse(request);
if (request.getUrl().toExternalForm().contains("https://plus.google.com/115016587855962294424/about")) {
String content = response.getContentAsString("UTF-8");
// change content here -- what needs to be changed?
System.out.println("content "+content);
WebResponseData data = new WebResponseData(content.getBytes("UTF-8"),
response.getStatusCode(), response.getStatusMessage(), response.getResponseHeaders());
response = new WebResponse(data, request, response.getLoadTime());
}
System.out.println("content "+response.getContentAsString());
return response;
}
};
Any suggestions on how exactly this can be done and what needs to be modified? So far I have tried the following APIs: HtmlUnit, jsoup, Web-Harvest, and Selenium.