I am trying to scrape this website: http://weirwood-net.com/generalinfo.
The page has a table with 1874 rows, but my scrape only returns the first 10. That is because the table's lengthMenu defaults to 10 rows per page; the other options are 25, 50, 100, and All. I want to scrape all the data.

I used this code with RSelenium:

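<code>library(RSelenium)
library(XML)  # provides htmlParse() and readHTMLTable()

appURL <- "http://weirwood-net.com/generalinfo"

# Start PhantomJS and give it a few seconds to come up
pJS <- phantom()
Sys.sleep(5)

# Connect to the browser and load the page
remDr <- remoteDriver()
remDr$open()
remDr$navigate(appURL)
webElem <- remDr$findElement("css selector", "#tablepress-8")

# Parse the rendered page source and read its tables
dd <- remDr$getPageSource()[[1]]
doc <- htmlParse(dd)
readHTMLTable(doc)

# Clean up
remDr$close()
pJS$stop()</code>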

I know the problem comes from this part of the page's HTML:

<code>
jQuery(document).ready(function($){
$('#tablepress-8').dataTable({"order":[],"orderClasses":false,"stripeClasses":['even','odd'],"pagingType":"simple","columnDefs": [ { "type": "formatted-num", "targets": [ 7, 8 ] } ],"lengthMenu":[[10,25,50,100,-1],[10,25,50,100,"All"]]}).columnFilter();
});</code>

Could you help me write RSelenium code that either selects the "All" option in the length menu or loops through every page of the table (by clicking "Next" below the table)?

Dave2e
  • You can click the All option using `remDr$findElement("xpath", "//option[text() = 'All']")$clickElement()` – jdharrison Aug 19 '16 at 16:34
  • I added it between these two lines: `remDr$navigate(appURL)` and `dd<-remDr$getPageSource()[[1]]`, and it works. Thank you. I am closing the question. – Quentin Mouton Aug 19 '16 at 17:02

1 Answer

You can use the findElement method with an appropriate selector to pick the option you wish. You can then use the clickElement method to click it:

<code>remDr$findElement("xpath", "//option[text() = 'All']")$clickElement()</code>
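
For context, here is a rough sketch of how that click could fit into the whole scrape. It assumes a WebDriver server (PhantomJS or Selenium) is already listening on the default `localhost:4444`, as in the question's code. The helper names `parse_table` and `scrape_all_rows` are made up for illustration, and the two-second pause is only a guess at how long the table needs to redraw.

```r
library(RSelenium)
library(XML)

# Hypothetical helper: pull one table, identified by its id attribute,
# out of a page-source string and return it as a data frame.
parse_table <- function(page_source, table_id = "tablepress-8") {
  doc <- htmlParse(page_source, asText = TRUE)
  nodes <- getNodeSet(doc, sprintf("//table[@id='%s']", table_id))
  if (length(nodes) == 0) stop("No table with id '", table_id, "' found")
  readHTMLTable(nodes[[1]], stringsAsFactors = FALSE)
}

# Hypothetical helper: open the page, switch the length menu to "All",
# then parse the fully expanded table.
scrape_all_rows <- function(url, table_id = "tablepress-8") {
  remDr <- remoteDriver()            # assumes a server on localhost:4444
  remDr$open(silent = TRUE)
  on.exit(remDr$close(), add = TRUE) # close the session even on error
  remDr$navigate(url)
  # The click has to happen after navigate() and before getPageSource()
  remDr$findElement("xpath", "//option[text() = 'All']")$clickElement()
  Sys.sleep(2)                       # assumed pause for the table to redraw
  parse_table(remDr$getPageSource()[[1]], table_id)
}
```

Calling `scrape_all_rows("http://weirwood-net.com/generalinfo")` should then return every row rather than the first 10. Selecting the table node by id, instead of calling `readHTMLTable` on the whole document, avoids picking up any other tables on the page.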
jdharrison