I'm having some trouble using the gtrendsR package. I'm using RStudio version 1.1.463 with R version 3.5.1.

When I search for a specific keyword, the historical series of hits sometimes changes a lot between identical requests. Here is an example:

library(gtrendsR)

cr_br_prev1<-gtrends(keyword = c("Previdência"), geo = "BR", time = "2015-01-01 2018-12-26", gprop = c("web"),category = 37)
head(cr_br_prev1$interest_over_time$hits)

cr_br_prev2<-gtrends(keyword = c("Previdência"), geo = "BR", time = "2015-01-01 2018-12-26", gprop = c("web"),category = 37)
head(cr_br_prev2$interest_over_time$hits)

cr_br_prev3<-gtrends(keyword = c("Previdência"), geo = "BR", time = "2015-01-01 2018-12-26", gprop = c("web"),category = 37)
head(cr_br_prev3$interest_over_time$hits)

The output I get from this simple code is the following:

> library(gtrendsR)
> 
> cr_br_prev1<-gtrends(keyword = c("Previdência"), geo = "BR", time = "2015-01-01 2018-12-26", gprop = c("web"),category = 37)
> head(cr_br_prev1$interest_over_time$hits)
[1]  0  0 24 46 24 24
> 
> cr_br_prev2<-gtrends(keyword = c("Previdência"), geo = "BR", time = "2015-01-01 2018-12-26", gprop = c("web"),category = 37)
> head(cr_br_prev2$interest_over_time$hits)
[1]  0  0 24 46 24 24
> 
> cr_br_prev3<-gtrends(keyword = c("Previdência"), geo = "BR", time = "2015-01-01 2018-12-26", gprop = c("web"),category = 37)
> head(cr_br_prev3$interest_over_time$hits)
[1]  70  34  51 100  67  35

As you can see, the configuration of each search is exactly the same, but the hits series changes on the third call. (I am showing only the first values with the "head" function, but the changes affect the whole historical series of "hits".) This happens more or less randomly in other searches I run as well, even when I ask for a different kind of output, such as "interest_by_region$hits".
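To make the difference between runs concrete, here is a minimal sketch in base R that compares the first six values copied from the console output above. The helper `compare_runs()` is a hypothetical name introduced only for illustration; it is not part of gtrendsR.

```r
# First six "hits" values, copied from the console output above.
run1 <- c(0, 0, 24, 46, 24, 24)
run2 <- c(0, 0, 24, 46, 24, 24)
run3 <- c(70, 34, 51, 100, 67, 35)

# compare_runs() is a hypothetical helper (not a gtrendsR function).
# It reports whether two series match exactly and how far apart they are.
compare_runs <- function(a, b) {
  list(
    identical    = identical(a, b),   # exact element-wise equality
    max_abs_diff = max(abs(a - b)),   # largest gap at any position
    correlation  = cor(a, b)          # do the series move together?
  )
}

compare_runs(run1, run2)$identical     # TRUE
compare_runs(run1, run3)$identical     # FALSE
compare_runs(run1, run3)$max_abs_diff  # 70
```

The same helper could be applied to the full `interest_over_time$hits` vectors of two requests, assuming both return the same number of rows.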

I looked up on the Google Trends website how the data is built, and I understand that the historical series of "hits" can change, since "hits" reflects the relative popularity of a keyword, normalized to the range 0-100. But should the shape of the data change as it does in my example?
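To illustrate why normalization alone would not explain this, here is a rough sketch. It assumes the 0-100 scale comes from dividing by the series maximum; that is a simplification for illustration, not Google's documented procedure. The helper `scale_to_100()` and the `raw` counts are made up.

```r
# scale_to_100() is a hypothetical helper: max-based rescaling to 0-100.
# This is an assumed simplification, not Google's actual algorithm.
scale_to_100 <- function(x) round(100 * x / max(x))

raw <- c(12, 30, 60, 45)   # made-up raw counts, for illustration only

scale_to_100(raw)          # 20 50 100 75
scale_to_100(raw * 3)      # 20 50 100 75 (a uniform rescale leaves the index unchanged)

# In the output above, positions that are 0 in the first run are
# non-zero in the third run, which a pure rescaling could not produce.
run1 <- c(0, 0, 24, 46, 24, 24)
run3 <- c(70, 34, 51, 100, 67, 35)
zero_became_nonzero <- any(run1 == 0 & run3 > 0)
zero_became_nonzero        # TRUE
```

Under this assumption, a change in the scaling factor alone would leave the normalized series identical, so the difference seen in the third run has to come from the underlying data.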

Am I missing something?

I appreciate any help!

Thanks a lot!

BetoR

1 Answer

I had the same concern. There is nothing wrong with your code. It seems that Google draws a random subset of the actual search frequencies to calculate the index, so that it can respond faster. However, a well-known paper in finance (see p. 1467, footnote 4, in Da et al. (2011)) shows that this problem becomes insignificant over a longer time span. It is serious mainly when you request only a short time span. For your case, I still have no solution.

KevinJC