I'm trying to scrape reviews for a product on Amazon and export the result as a CSV file. I tried putting the for loop inside the function, but it kept failing, so I separated the function from the loop to inspect the output. Now I don't know how to combine the results of the loop for pages 1 to 10.
When I run the script, it prints the reviews page by page, but when I save the result to CSV, the file contains only the reviews from page 10.
How can I combine the results from every iteration of the loop and save them all in one CSV file?
#install.packages("tidyverse")
#install.packages("rvest")
#install.packages("xml2")
library(tidyverse)
library(rvest)
library(xml2)
#Product = LG OLED77C9PUB Alexa Built-in C9 Series 77" 4K Ultra HD Smart OLED TV (2019)
#ASIN = B07PQ98L9D
scrape_amazon <- function(ASIN, page_num){
  # Build the URL of one review page and download it
  url_reviews <- paste0("https://www.amazon.com/LG-OLED77C9PUB-Alexa-Built-Ultra/product-reviews/", ASIN, "/?pageNumber=", page_num)
  doc <- read_html(url_reviews)
  #Review Date
  doc %>%
    html_nodes("[data-hook='review-date']") %>%
    html_text() -> review_date
  #Review Title
  doc %>%
    html_nodes("[class='a-size-base a-link-normal review-title a-color-base review-title-content a-text-bold']") %>%
    html_text() -> review_title
  #Review Text
  doc %>%
    html_nodes("[class='a-size-base review-text review-text-content']") %>%
    html_text() -> review_text
  #Number of Stars in Review
  doc %>%
    html_nodes("[data-hook='review-star-rating']") %>%
    html_text() -> review_star
  #Return a tibble (the last expression is the function's return value)
  tibble(review_date,
         review_title,
         review_text,
         review_star,
         page = page_num)
}
for (i in 1:10){
  review_all <- scrape_amazon(ASIN = "B07PQ98L9D", page_num = i)
  print(review_all)
}
#save in csv
write.csv(review_all, file = "C:/Users/path/review.csv", row.names = FALSE)
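
A minimal sketch of one way to keep every page's result instead of only the last one: store each page's data frame in a list, then row-bind the list once after the loop. Here `fake_scrape` is a hypothetical stand-in for `scrape_amazon` so the example runs without network access, and the output goes to a temporary file; both are assumptions for illustration, not part of the original script.

```r
# Hypothetical stand-in for scrape_amazon(): returns two made-up rows per page
# so the pattern can be run offline.
fake_scrape <- function(page_num) {
  data.frame(
    review_title = paste0("title ", page_num, "-", 1:2),
    page = page_num,
    stringsAsFactors = FALSE
  )
}

# Collect one data frame per page in a list instead of overwriting one variable.
pages <- vector("list", 10)
for (i in 1:10) {
  pages[[i]] <- fake_scrape(page_num = i)
}

# Row-bind all pages into a single data frame.
review_all <- do.call(rbind, pages)

# Write the combined result as CSV (a temporary file is used as a placeholder path).
out_file <- tempfile(fileext = ".csv")
write.csv(review_all, file = out_file, row.names = FALSE)
```

With the tidyverse loaded, `dplyr::bind_rows(pages)` should give the same combined table and is more forgiving when columns differ between pages.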