I used getURL and htmlTreeParse to scrape a web page with the following code:
library(XML)
library(rvest)
library(httr)
library(RCurl)
url <- "https://www.restaurants.mcdonalds.fr/"
page <- htmlTreeParse(getURL(url), useInternal = TRUE, encoding = "UTF-8")
locs <- unlist(xpathApply(page, '//div[@class="department-part"]/ul/li/a',
                          xmlGetAttr, "href"))
However, for some reason, this code no longer works: getURL(url) no longer seems to return the whole page source.
url="https://www.restaurants.mcdonalds.fr/"
read_html(url) %>%
html_nodes(xpath='//div[@class="department-part"]/ul/li/a') %>%
html_text()
I also tried rvest, and read_html doesn't seem to work either, even though I can still view the page source in a browser, Chrome for example.
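Since httr is already loaded, one way to narrow this down is to look at the raw HTTP response rather than the parsed result; a minimal diagnostic sketch (the suspected causes, such as a blocked default user agent or a TLS negotiation problem, are assumptions to verify, not facts from the failing code):

```r
library(httr)

url <- "https://www.restaurants.mcdonalds.fr/"

# Inspect the raw response: a 403/406 status or an empty body
# suggests the server is filtering requests by their headers,
# while a connection error points at TLS/network issues instead.
resp <- GET(url)
status_code(resp)   # e.g. 200 vs 403
nchar(content(resp, as = "text", encoding = "UTF-8"))  # body length
```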
I also tested another link.
url="https://restaurant.hippopotamus.fr/"
read_html(url) # works
getURL(url) # doesn't work, although it did before
How can I diagnose what changed and find a solution?
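One common cause of this exact symptom (browser works, RCurl/rvest fail) is the server rejecting the default libcurl User-Agent. A hedged sketch of that workaround, sending a browser-like User-Agent through httr and parsing the body with rvest; the header value is an arbitrary example, and the assumption that the user agent is the culprit would need to be confirmed:

```r
library(httr)
library(rvest)

url <- "https://www.restaurants.mcdonalds.fr/"

# Identify as a browser; some sites serve different content (or none)
# to clients that announce themselves as libcurl/RCurl.
ua <- user_agent("Mozilla/5.0 (Windows NT 10.0; Win64; x64)")
resp <- GET(url, ua)
stop_for_status(resp)  # fail loudly if the server still refuses us

# Parse the fetched text and extract the same links as the original XPath.
page <- read_html(content(resp, as = "text", encoding = "UTF-8"))
locs <- page %>%
  html_nodes(xpath = '//div[@class="department-part"]/ul/li/a') %>%
  html_attr("href")
```

If this still fails, comparing the request headers a browser sends (via Chrome's developer tools, Network tab) with the ones httr sends is the next step.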