
I am trying to retrieve all the reviews that users leave for an app on Google Play. I am using the code suggested in "Web scraping in R through Google playstore", but it only returns the first 40 reviews. Is there a way to get all the reviews for the app?

```
# Load the required packages
library(rvest)
library(magrittr)  # for the '%>%' pipe operator
library(RSelenium) # to retrieve the loaded HTML of the page

# URL of the page to be scraped
url <- 'https://play.google.com/store/apps/details?id=com.phonegap.rxpal&hl=en_IN&showAllReviews=true'

# Start a local RSelenium server
# (this is the only way to start RSelenium that is working for me atm)
selCommand <- wdman::selenium(jvmargs = c("-Dwebdriver.chrome.verboseLogging=true"), retcommand = TRUE)
shell(selCommand, wait = FALSE, minimized = TRUE)
remDr <- remoteDriver(port = 4567L, browserName = "firefox")
remDr$open()

# Go to the website
remDr$navigate(url)

# Get the page source and parse it as an HTML object with rvest
html_obj <- remDr$getPageSource(header = TRUE)[[1]] %>% read_html()

# 1) Name field (assuming that 'name' refers to the name of the reviewer)
names <- html_obj %>% html_nodes(".kx8XBd .X43Kjb") %>% html_text()

# 2) Star rating given by each reviewer
stars <- html_obj %>% html_nodes(".kx8XBd .nt2C1d [role='img']") %>% html_attr("aria-label")

# 3) Review text
reviews <- html_obj %>% html_nodes(".UD7Dzf") %>% html_text()

# Create the data frame with all the info
review_data <- data.frame(names = names, stars = stars, reviews = reviews, stringsAsFactors = FALSE)
```

David Perea

1 Answer


You can get all the reviews from the Google Play web store.

If you scroll through the reviews, you can see that an XHR request is sent to:

https://play.google.com/_/PlayStoreUi/data/batchexecute

With form-data:

f.req: [[["rYsCDe","[[\"com.playrix.homescapes\",7]]",null,"55"]]]
at: AK6RGVZ3iNlrXreguWd7VvQCzkyn:1572317616250

And params of:

rpcids=rYsCDe
f.sid=-3951426241423402754
bl=boq_playuiserver_20191023.08_p0
hl=en
authuser=0
soc-app=121
soc-platform=1
soc-device=1
_reqid=839222
rt=c

After playing around with different parameters, I found that many are optional, and the request can be simplified to:

form-data:

f.req: [[["UsvDTd","[null,null,[2, $sort,[$review_size,null,$page_token]],[$package_name,7]]",null,"generic"]]]

params:

hl=$review_language
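
As a rough illustration of the simplified request above, here is a minimal Python sketch that fills in the `f.req` template and the `hl` parameter. The function names (`build_review_request`, `fetch_review_page`), the `Content-Type` header, and the default language are assumptions made for this example; the answer does not document the meaning of the `$sort` values or the maximum `$review_size`, so both are left for the caller to supply. Whether the endpoint accepts a request in exactly this form is not guaranteed.

```python
import json
import urllib.request
from urllib.parse import urlencode

BATCHEXECUTE_URL = "https://play.google.com/_/PlayStoreUi/data/batchexecute"


def build_review_request(package_name, sort, review_size,
                         page_token=None, language="en"):
    """Return (url, form_body) for the simplified review request.

    Mirrors the template from the answer:
    [null,null,[2,$sort,[$review_size,null,$page_token]],[$package_name,7]]
    """
    inner = [None, None,
             [2, sort, [review_size, None, page_token]],
             [package_name, 7]]
    # The inner list is embedded as a JSON *string* inside the outer list.
    f_req = [[["UsvDTd", json.dumps(inner, separators=(",", ":")),
               None, "generic"]]]
    url = BATCHEXECUTE_URL + "?" + urlencode({"hl": language})
    body = urlencode({"f.req": json.dumps(f_req, separators=(",", ":"))})
    return url, body


def fetch_review_page(package_name, sort, review_size,
                      page_token=None, language="en"):
    """POST the request and return the raw response text (still unparsed)."""
    url, body = build_review_request(package_name, sort, review_size,
                                     page_token, language)
    request = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        # Assumed header: the browser sends the payload as form data.
        headers={"Content-Type":
                 "application/x-www-form-urlencoded;charset=UTF-8"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_review_page("com.playrix.homescapes", sort, review_size)` with values of your choice would return one page of raw response text; passing the page token found in that response as `page_token` on the next call is how further pages would be requested.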

The response is cryptic, but it is essentially JSON data with the keys stripped, similar to protobuf. I wrote a parser for the response that translates it into a regular dict object.

https://gist.github.com/xlrtx/af655f05700eb76bb29aec876493ed90
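
The general idea of turning keyless, positional JSON back into a dict can be sketched as follows. This is not the code from the gist: the helper names, the field names, and especially the index paths in `FIELD_PATHS` are hypothetical and chosen only to illustrate the technique. The real positions have to be worked out by inspecting an actual response.

```python
def get_path(data, path, default=None):
    """Follow a sequence of list indices into nested lists.

    Returns `default` if any step is missing or is not a list.
    """
    current = data
    for index in path:
        if not isinstance(current, list):
            return default
        if not -len(current) <= index < len(current):
            return default
        current = current[index]
    return current


def to_record(row, field_paths):
    """Map one positional row to a dict using name -> index-path pairs."""
    return {name: get_path(row, path) for name, path in field_paths.items()}


# HYPOTHETICAL layout, for illustration only.
FIELD_PATHS = {
    "review_id": (0,),
    "user_name": (1, 0),
    "stars": (2,),
    "text": (4,),
}

# Made-up row in the hypothetical layout above (not real review data).
sample_row = ["id-123", ["Example User"], 5, None, "Great app"]
record = to_record(sample_row, FIELD_PATHS)
```

Keeping the index paths in one table means that if the positions in the response change, only `FIELD_PATHS` needs to be updated.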

Kamoo
  • With the code you posted on GitHub (https://gist.github.com/xlrtx/af655f05700eb76bb29aec876493ed90), can I get all of an app's reviews on Google Play? I am trying to run your code, but I have a problem with the "utils" package. It returns: ModuleNotFoundError: No module named 'utils' – David Perea Nov 04 '19 at 17:49
  • @DavidPerea It is a logging utility; you can replace it with the ‘logging’ module. Make sure you are running Python 3.6 or above. The code works as of today. – Kamoo Nov 04 '19 at 18:22
  • I don't understand why the "utils" module doesn't work for me. Is it a package that I must install beforehand, or how do I make it work? I am using Python 3.7. I have searched many forums to find out why I get ModuleNotFoundError: No module named 'utils', but I can't find a solution. I hope you can help, because I would find it very useful to be able to get all the Google Play reviews. – David Perea Nov 05 '19 at 17:41
  • The ‘utils’ package is a utility module I wrote, but in this snippet I’m only using its logging function, so I didn't post it with the code. You can just use Python's default ‘logging’ module instead, or simply use the ‘print’ function. – Kamoo Nov 05 '19 at 17:59
  • Forgive my ignorance in the matter. I'm more familiar with R than with Python, and I'm having a hard time applying the code. Which of the classes in your code should I call to get all the reviews of an app, and which parameters should I pass? I would appreciate your help, since I would find it very useful to be able to get them. – David Perea Nov 06 '19 at 11:32
  • Getting the reviews is not a language-specific problem; you only need to understand how their request works. – Kamoo Nov 06 '19 at 12:04
  • Would you mind giving an example of how to call a function to get the reviews of an app, please? Thank you very much for your help. – David Perea Nov 06 '19 at 12:09
  • Sorry to bother you again. Could you tell me how many reviews you get for this app, mGol App, at this link: https://play.google.com/store/apps/details?id=mobi.caixaguissona.app&gl=ES Do you get all 158 reviews, or only the 38 that appear on the screen? – David Perea Nov 06 '19 at 16:30