RSelenium gets stuck on this URL, but it works fine when I switch to some other webpage. As the code below shows, getCurrentUrl and getPageSource are the most basic operations.

url = "https://sycm.taobao.com/custom/login.htm?_target=http://sycm.taobao.com/"


# Build up the environment -----------------------------------------------------
library(RSelenium)
library(wdman)
pDrv <- phantomjs(port = 4567L)
remDr <- remoteDriver(browserName = "firefox", port = 4567L)
remDr$open()
remDr$navigate(url)


#show the page -----------------------------------------------------------------

remDr$maxWindowSize()
remDr$screenshot(display = TRUE)

# basic operations with RSelenium
remDr$getCurrentUrl()
remDr$getPageSource()[[1]]

1 Answer

This works with Google Chrome:

appUrl <- "https://sycm.taobao.com/custom/login.htm?_target=http://sycm.taobao.com/"


# Build up the environment -----------------------------------------------------
library(RSelenium)

rD <- rsDriver()
remDr <- rD$client
remDr$navigate(appUrl)


#show the page -----------------------------------------------------------------

remDr$screenshot(display = TRUE)

# basic operations with RSelenium
remDr$getCurrentUrl()
remDr$getPageSource()[[1]]

rm(rD)
gc()

And for Firefox:

appUrl <- "https://sycm.taobao.com/custom/login.htm?_target=http://sycm.taobao.com/"
# Build up the environment -----------------------------------------------------
library(RSelenium)

rD <- rsDriver(browser = "firefox")
remDr <- rD$client
remDr$navigate(appUrl)

#show the page -----------------------------------------------------------------

remDr$maxWindowSize()
remDr$screenshot(display = TRUE)

# basic operations with RSelenium
remDr$getCurrentUrl()
remDr$getPageSource()[[1]]
rm(rD)
gc()
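
If the calls still hang, one thing worth trying is a page-load timeout, so that a stalled page surfaces as an R error instead of blocking the session. The following is only a sketch under assumptions: the 30-second limit is an arbitrary choice, and whether the driver honours the timeout for this particular page is untested.

```r
library(RSelenium)

appUrl <- "https://sycm.taobao.com/custom/login.htm?_target=http://sycm.taobao.com/"

rD <- rsDriver(browser = "firefox")
remDr <- rD$client

# Assumption: 30 s is an arbitrary limit chosen for illustration.
remDr$setTimeout(type = "page load", milliseconds = 30000)

# Catch a timeout (or any other driver error) instead of letting it stop the script.
currentUrl <- tryCatch({
  remDr$navigate(appUrl)
  remDr$getCurrentUrl()[[1]]
}, error = function(e) {
  message("Navigation or URL lookup failed: ", conditionMessage(e))
  NULL
})

# Close the browser session and stop the Selenium server explicitly.
remDr$close()
rD$server$stop()
```

Here currentUrl is NULL when the page did not load in time, which makes it easy to tell a hang apart from a successful lookup.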
jdharrison
  • If I do it your way, the error below appears again. I tried many times to fix it and failed, so I switched to another way of using RSelenium, which I also found here. Please help if you have experience with this. Running rD <- rsDriver() prints: checking Selenium Server versions: BEGIN: PREDOWNLOAD Error in open.connection(con, "rb") : Couldn't connect to server – jeremyparty Jun 25 '17 at 04:54
  • Hey. It still gets stuck at remDr$getCurrentUrl() and remDr$getPageSource()[[1]]. You only changed the browser. – jeremyparty Jul 13 '17 at 08:44
  • Seems to be a problem with PhantomJS. Raise an issue on the project page https://github.com/ariya/phantomjs/issues – jdharrison Jul 13 '17 at 09:07
  • Could this be a defense on the webpage's side? The page belongs to Alibaba, so they may be doing something to detect automated actions. But it is a really simple operation: I just want to get the current URL (once that works I can do more). If I change the URL I want to scrape, everything works. – jeremyparty Jul 19 '17 at 10:11