I was using rvest to scrape a website for a couple of pieces of information on each page. An example page is https://www.edsurge.com/product-reviews/mr-elmer-product/educator-reviews, and I wrote a function like this:
library(rvest)
library(purrr)

parse_review_page <- function(url) {
  page <- read_html(url)  # read the page once and reuse the parsed document

  product_name2 <- page %>%
    html_nodes(".mb0 a") %>%
    html_text()
  review <- page %>%
    html_nodes(".review-ratings__text strong") %>%
    html_text()
  usage <- page %>%
    html_nodes("p:nth-child(3)") %>%
    html_text()

  # repeat the product name so every review row carries it;
  # map_dfr() binds the per-page data frames, so no globals are needed
  data.frame(
    PRODUCT_NAME2 = rep(product_name2, length(review)),
    REVIEW = review,
    USAGE = usage,
    stringsAsFactors = FALSE
  )
}
and I used this to put the result into a data frame:
url_to_scrape <- c("https://www.edsurge.com/product-reviews/mr-elmer-product/educator-reviews")
DF6 <- url_to_scrape %>% map_dfr(parse_review_page)
But the problem I'm encountering is that there are 100+ user reviews, while the webpage only shows 30 of them at a time. What makes it more challenging is that the URL doesn't change after clicking 'Load More' at the bottom of the page, so there is essentially no 2nd or 3rd page to scrape. Can anyone suggest how to resolve this so I can scrape all the review data by running the function I created, please?
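To clarify the direction I've been considering: my guess is that 'Load More' fires a background (XHR) request that the browser's DevTools Network tab would reveal. If that endpoint turned out to accept a page-style query parameter, I imagine reusing my function over the paginated URLs. This is only a sketch under that assumption; the `?page=` parameter and the page count below are pure guesses, not something I've confirmed for this site:

```r
library(rvest)
library(purrr)

# Hypothetical: suppose DevTools shows "Load More" requesting the same
# path with a ?page= query parameter. Then each "page" could be read directly.
base_url <- "https://www.edsurge.com/product-reviews/mr-elmer-product/educator-reviews"
urls <- paste0(base_url, "?page=", 1:4)  # the number of pages is a guess

# reuse the same parser on every paginated URL and bind the rows
all_reviews <- urls %>% map_dfr(parse_review_page)
```

Does that seem like the right way to go, or is there a better approach when the button doesn't change the URL at all?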