I'm trying to scrape app reviews from the Play Store and the App Store (app name, rating, full review text, username) and am running into some issues. I read this post but had a lot of difficulty with RSelenium, so I'm wondering whether there is a simpler way. Using XPath I am able to get the name of the app, but not the review text or ratings: I get "character(0)" for the User and Review data.
Another question: on the Play Store you have to click Read More to see more reviews. Will the scraping stop at what is already loaded on the page, and if so, how do I get the full set of reviews?
I had no experience with web scraping before today, so sorry if this is obvious.
library(rvest)
library(RSelenium)
library(xml2)
library(stringr)
# Download and parse the app's Play Store page
url <- 'https://play.google.com/store/apps/details?id=com.woebot&hl=en_US'
webpage <- read_html(url)
Name_data_html <- webpage %>% html_nodes(xpath='/html/body/div[1]/div[4]/c-wiz/div/div[2]/div/div[1]/div/c-wiz[1]/c-wiz[1]/div/div[2]/div/div[1]/c-wiz[1]/h1/span')
Name_data <- html_text(Name_data_html)
head(Name_data)
User_data_html <- webpage %>% html_nodes(xpath='/html/body/div[1]/div[4]/c-wiz[3]/div/div[2]/div/div[1]/div/div/div[1]/div[2]/div/div[2]/div[1]/div[1]/span')
User_data <- html_text(User_data_html)
head(User_data)
Review_data_html <- webpage %>% html_nodes(xpath='/html/body/div[1]/div[4]/c-wiz[3]/div/div[2]/div/div[1]/div/div/div[1]/div[2]/div/div[2]/div[2]')
Review_data <- html_text(Review_data_html)
head(Review_data)
product_data <- data.frame(Name = Name_data, User = User_data, Review = Review_data)
str(product_data)