getURL not working for one link (and it worked before)

Question

I used getURL and htmlTreeParse to do webscraping with the following code:

library(XML)
library(rvest)
library(httr)
library(RCurl)
url="https://www.restaurants.mcdonalds.fr/"

page = htmlTreeParse(getURL(url), useInternalNodes = TRUE, encoding = "UTF-8")
locs = unlist(xpathApply(page, '//div[@class="department-part"]/ul/li/a',
   xmlGetAttr, "href"))

However, for some reason, this code no longer works, even though getURL(url) seems to return the whole source code.

url="https://www.restaurants.mcdonalds.fr/"
read_html(url) %>%
  html_nodes(xpath = '//div[@class="department-part"]/ul/li/a') %>%
  html_text()

I also tried rvest, and read_html doesn't seem to work either, whereas I can view the page source in Chrome, for example.

I also tested another link.

url="https://restaurant.hippopotamus.fr/"
read_html(url) # works
getURL(url) # doesn't work and it did work before

How can I try to find a solution?
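
One way to narrow this down is to separate the download step from the parsing step, so each can be checked on its own. The sketch below is only an illustration: `extract_links` is a hypothetical helper name, and the sample HTML is made up to mimic the structure the XPath expects, not copied from the real page.

```r
library(XML)

# Hypothetical helper: parse HTML that has already been downloaded
# and return the href attributes matched by the XPath expression.
extract_links <- function(html_text,
                          xpath = '//div[@class="department-part"]/ul/li/a') {
  # An empty body means the download failed, not the XPath.
  if (length(html_text) != 1 || is.na(html_text) || !nzchar(html_text)) {
    stop("empty response body: the download step failed, not the XPath")
  }
  doc <- htmlParse(html_text, asText = TRUE, encoding = "UTF-8")
  hrefs <- xpathSApply(doc, xpath, xmlGetAttr, "href")
  # xpathSApply returns an empty list when nothing matches;
  # normalise that to an empty character vector.
  as.character(unlist(hrefs))
}

# Made-up sample markup (an assumption about the page structure).
sample_html <- '<html><body><div class="department-part"><ul>
  <li><a href="/ain">Ain</a></li><li><a href="/aisne">Aisne</a></li>
</ul></div></body></html>'

links <- extract_links(sample_html)
print(links)
```

If the helper works on the sample but `extract_links(getURL(url))` stops with the empty-body error, the problem is in the download. If it returns an empty vector on the real page, the page markup has probably changed and the XPath no longer matches.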

I get that the website isn't available from my location (UK). Any other examples you can give? — Chris, Aug 25 '18 at 12:06
@Chris, too bad, you can't look for a McDonald's restaurant in France then. :P Maybe `getURL("https://restaurant.hippopotamus.fr/")` ? — John Smith, Aug 25 '18 at 12:17
And `read_html("https://restaurant.hippopotamus.fr/")` from `rvest` works fine. — John Smith, Aug 25 '18 at 12:19
_"…As such, any reproduction, representation, use, adaptation, modification, incorporation, translation, commercialization, partial or complete, without the prior written authorization of GIE McDONALD'S FORCE, are prohibited;"_ / https://www.restaurants.mcdonalds.fr/mentions-legales?restaurantId= — hrbrmstr, Nov 17 '18 at 09:58

0 Answers