How to merge and normalize Google Trends datasets with R?

Question

My goal is to compare interest over time in several animal groups, using the gtrendsR package. Because I intend to include more than five groups (i.e., more than five keywords), which exceeds the five-keyword limit of a single Google Trends query, I must run several searches of five keywords each, and every search must include the keyword with the peak value. These are my R commands for the data collection (Lion has the highest peak value):

> library(gtrendsR)

> # Setting the search terms:

> keywords_1 <- c("Lion", "Butterfly", "Cockroach", "Parrot", "Ostrich")

> keywords_2 <- c("Lion", "Platypus", "Alligator", "Hyena", "Horse")

> country <- c('BR') #setting the geographic area (Brazil).

> time <- ("2011-01-01 2021-12-31") #setting the period.

> channel <- 'web' #setting the channels.
    
> # Running the queries:

> data1 <- gtrends(keywords_1, gprop = channel, geo = country, time = time, category = 0)

> data2 <- gtrends(keywords_2, gprop = channel, geo = country, time = time, category = 0)

At this point, my question is how to merge data1 and data2 into a single dataset. I understand that all data must be normalized to the peak value, which belongs to Lion, but how can I perform this normalization in R?

It is not possible to combine these two objects and then normalize them, since the denominator of the normalization changes from `data1` to `data2`. — JdeMello, Sep 02 '22 at 03:15
I read this statement in [this paper](https://www.cambridge.org/core/journals/environmental-conservation/article/google-trends-data-reveal-a-sharp-trend-teeth-and-claws-attract-more-interest-than-feathers-hooves-or-fins/9E6E9FBD2C99ED1F20E2A74069E5614C): "Therefore, it is possible to compare more than the five datasets allowed by Google Trends if the search query that contains the peak value is included in each dataset and then all data are normalized to this peak value." — Arthur Filipe da Silva, Sep 02 '22 at 03:26
It would be helpful if they had demonstrated their scaling normalization method in the appendix. It is possible to combine the results if both the peak and the minimum coincide. In other words, if each query includes "Lion" and the topic with the absolute minimum number of hits, alongside 3 other topics, then you should be able to combine the datasets. — JdeMello, Sep 02 '22 at 12:30
Yes, it would be very helpful if there were supplementary material with this information. But, apparently, the term with the highest volume of searches alone is enough to control the other terms. I posted this question on the Portuguese version of SO [here](https://pt.stackoverflow.com/questions/563586/como-unir-e-normalizar-dados-do-google-trends-no-r/563651#563651), where it got a fine solution. — Arthur Filipe da Silva, Sep 02 '22 at 20:04
My source for the scaling formula can be found [here](https://stats.stackexchange.com/a/281164). If that is the formula, then you need the minimum value across topics. If the normalization formula is something like `(x_i / x(max)) * 100`, then, yes, having the most searched topic in both queries is enough. — JdeMello, Sep 02 '22 at 20:44
When I looked at the Google Trends FAQ [here](https://support.google.com/trends/answer/4365533?hl=en#:~:text=How%20is%20Google%20Trends%20data%20normalized%3F), I think that the normalization of GT data fits well to the formula `(x_i / x(max)) * 100`. What do you think? Oh, and thanks for your help! — Arthur Filipe da Silva, Sep 03 '22 at 05:18
I honestly have no clue, it's appalling that Google does not provide a more detailed description of their methodology. — JdeMello, Sep 04 '22 at 03:09
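
For reference, if the normalization really is `(x_i / x(max)) * 100`, as the comments above tentatively conclude, the merge could be sketched roughly as follows. This is only an illustration under that assumption, not a confirmed description of Google's method. The helper names (`clean_hits`, `rescale_to_anchor`, `merge_trends`) are hypothetical, the toy numbers are invented, and mapping `"<1"` to 0.5 is an arbitrary choice. The toy data frames stand in for `data1$interest_over_time` and `data2$interest_over_time`, which are assumed to carry `date`, `hits` and `keyword` columns.

```r
# Toy stand-ins for data1$interest_over_time and data2$interest_over_time.
# All numbers are invented for illustration only.
dates <- as.Date(c("2011-01-01", "2011-02-01"))
iot1 <- data.frame(
  date    = rep(dates, times = 2),
  hits    = c("100", "80", "40", "<1"),   # character because of "<1"
  keyword = rep(c("Lion", "Butterfly"), each = 2),
  stringsAsFactors = FALSE
)
iot2 <- data.frame(
  date    = rep(dates, times = 2),
  hits    = c(100, 80, 60, 30),
  keyword = rep(c("Lion", "Horse"), each = 2),
  stringsAsFactors = FALSE
)

# Convert hits to numeric; "<1" becomes 0.5 (an assumption, pick your own rule).
clean_hits <- function(iot) {
  hits <- as.character(iot$hits)
  hits[hits == "<1"] <- "0.5"
  iot$hits <- as.numeric(hits)
  iot
}

# Put `iot` on the scale of `ref` via the keyword both queries share.
rescale_to_anchor <- function(iot, ref, anchor = "Lion") {
  scale_factor <- max(ref$hits[ref$keyword == anchor]) /
    max(iot$hits[iot$keyword == anchor])
  iot$hits <- iot$hits * scale_factor
  iot
}

merge_trends <- function(iot1, iot2, anchor = "Lion") {
  iot1 <- clean_hits(iot1)
  iot2 <- rescale_to_anchor(clean_hits(iot2), iot1, anchor)
  # Drop the duplicated anchor rows from the second set before stacking.
  merged <- rbind(iot1, iot2[iot2$keyword != anchor, ])
  # Re-express everything relative to the overall peak: (x_i / x_max) * 100.
  merged$hits <- merged$hits / max(merged$hits) * 100
  rownames(merged) <- NULL
  merged
}

merged <- merge_trends(iot1, iot2)
# With real results, presumably:
# merged <- merge_trends(data1$interest_over_time, data2$interest_over_time)
```

When Lion already peaks at 100 in both queries, the scale factor is 1 and the step changes nothing; it only matters if another keyword holds the peak in one of the sets. The sketch does not handle an anchor series that is all zeros.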
