
There seems to be a problem with getURL when running under Windows 10. I have searched the internet for a long time and cannot find any answer other than "use [alternative] instead". However, I'm currently taking a class that uses pre-coded algorithms, and whenever I replace RCurl with, say, curl, everything breaks. I am a complete beginner and don't really know R, so I would like to keep using getURL rather than something else, because I can't adapt the rest of the algorithm to work with a replacement.

For instance, running this piece of code

library(RCurl)
theurl <- getURL("https://en.wikipedia.org/wiki/Opinion_polling_for_the_French_presidential_election,_2017",
                 .opts = list(ssl.verifypeer = FALSE))

returns the error

Error in function (type, msg, asError = TRUE) : error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version
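
From the error message I gather that the server insists on a newer TLS version than the libcurl behind my RCurl offers. If I understand the libcurl documentation correctly, something like the following should request TLS 1.2 explicitly (the value 6 is libcurl's code for TLS 1.2), although I'm not at all sure it helps if the bundled library is simply too old:

library(RCurl)
# Sketch only: ask libcurl for TLS 1.2 explicitly (CURL_SSLVERSION_TLSv1_2 = 6).
# This can only work if the libcurl that RCurl was built against supports TLS 1.2.
theurl <- getURL("https://en.wikipedia.org/wiki/Opinion_polling_for_the_French_presidential_election,_2017",
                 .opts = list(ssl.verifypeer = FALSE, sslversion = 6))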

The next few lines of code I would like to execute are

library(RCurl)
library(XML)    # readHTMLTable comes from the XML package
theurl <- getURL("https://en.wikipedia.org/wiki/Opinion_polling_for_the_French_presidential_election,_2017",
                 .opts = list(ssl.verifypeer = FALSE))
Data <- readHTMLTable(theurl, stringsAsFactors = FALSE, which = 1)
Data
#regexpr(pattern = "26_January_to_16_March_2017", text = theurl)

If I use any of the alternatives suggested in, e.g., this question, then either I change only the first line (dropping getURL and using curl instead of RCurl), in which case readHTMLTable no longer works, or I replace both the first and the second line, but then readHTMLTable and readLines apparently don't do the same thing, so the rest of the algorithm fails or misbehaves; I try to illustrate what I mean below. I can barely code in R, and the algorithms I'm running were written by the professor teaching the course, so I can't easily fix this myself.
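
To illustrate the readLines issue: as far as I can tell, readLines returns a character vector with one element per line, while getURL returns one long string, so a readLines-based replacement would (I think) need an extra step like the following, which I'm not confident is correct:

# Sketch of the adjustment I believe a readLines-based replacement would need
# (hypothetical; url(...) opens a plain base-R connection, no RCurl involved)
con <- url("https://en.wikipedia.org/wiki/Opinion_polling_for_the_French_presidential_election,_2017")
page_lines <- readLines(con)                    # character vector, one element per line
close(con)
theurl <- paste(page_lines, collapse = "\n")    # collapse into the single string getURL would have returned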

So... Is there a way to get this getURL thing to work, so that I don't fail my semester? Thank you in advance for your help.

Azerty
  • txt = getURL("https://www.google.com") worked OK for me, but when I replaced google with the site I actually wanted, I got the same error that you got. When I try wexists = url.exists("google.com") it returns TRUE; however, when I replace the URL with the one I want, I get FALSE, even though the website is easily accessible in a browser. It seems to be an SSL problem with the target site, but I've not progressed further. – jacanterbury Apr 28 '21 at 10:44

1 Answer


Apologies in advance for my English. Instead of the function getURL, try the function GET; don't forget to load the httr library.

library(httr)
library(XML)    # readHTMLTable comes from the XML package
url <- "url of website"
Data <- GET(url)
Data <- readHTMLTable(rawToChar(tabs$content), stringsAsFactors = F)

I also had a problem with the function getURL.
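
For example, with the page from the question it could look something like this (you also need the XML package for readHTMLTable; which = 1 keeps only the first table, as in the question):

library(httr)
library(XML)

url <- "https://en.wikipedia.org/wiki/Opinion_polling_for_the_French_presidential_election,_2017"
resp <- GET(url)                                  # fetch the page over HTTPS with httr
Data <- readHTMLTable(rawToChar(resp$content),    # turn the raw body into text, then parse the tables
                      stringsAsFactors = FALSE,
                      which = 1)
head(Data)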

  • Your answer solved it for me. But I do think you meant to write `Data <- readHTMLTable(rawToChar(Data$content), stringsAsFactors = F) `, or, alternatively, `tabs <- GET(url)`? – KeelyD Jul 28 '21 at 13:47