I want to download daily Google search data for multiple keywords using the gtrendsR package in R. I need search data for 30 keywords for 2004-18. Since Google only returns daily data for windows of up to 9 months, I have to download the data 6 months at a time for each keyword. I also do some additional calculations on each 6-month chunk (see code below). After downloading the 6-month chunks, I want to combine them into one time series. Then I want to omit NAs, regress on weekday and month dummies and keep the residuals, and finally scale the time series by its own standard deviation. In the end I want to save the adjusted data as a vector named after the search term (see code below).

How do I create a loop that does the search and calculations for each search term separately and saves the adjusted data as a vector? I've tried different kinds of loops and apply functions, but I do not understand how to use them with gtrends.

library(gtrendsR)

#define the keywords
keywords=c("Charity")

#set the geographic area: GB = Great Britain
country=c('GB')

#timeframe
time=("2004-01-01 2004-06-30")
#set channels 
channel='web'
trends = gtrends(keywords, gprop =channel,geo=country, time = time )
#select only interest over time 
time_trend=trends$interest_over_time
time_trend$hits[time_trend$hits=="0"]<-1
time_trend$change <- c(NA,diff(log(time_trend$hits)))
set1=time_trend[which(weekdays(as.Date(time_trend$date, format = "%m/%d/%Y"))
                 %in% c('Monday','Tuesday', 'Wednesday', 'Thursday', 'Friday')), ]

This goes on until set30, after which:

### Combine each 6 month data set ####

set <- rbind(set1,..,set30)

#omit NAs from the set
set <- na.omit(set)

# Regress on weekday and month dummies and keep the residual
set$weekday <- weekdays(set$date) #dummy for weekdays
weekday <- set$weekday

set$month <- months(set$date) #dummy for months
month <- set$month
mod <- lm(set$change~month+weekday)

#keep the residuals after the regression
set$residuals <- residuals(mod)

# Scale each by the time-series standard deviation #
sd <- sd(set$residuals)
set$adj_residuals=((set$residuals)/(sd))
adj_svi <- set$adj_residuals

# Save the deseasonalized and standardized ln daily change in keyword search volume as a vector

charity <- adj_svi
axor93

1 Answer

You can do this with lapply and a user-defined function:

# download one time window and compute the log changes for weekdays
# (no self-referencing defaults such as channel=channel: they fail when the argument is omitted)
search6m=function(keywords,channel,country,time){
  trends = gtrends(keywords, gprop =channel,geo=country, time = time )
  #select only interest over time 
  time_trend=trends$interest_over_time
  time_trend$hits[time_trend$hits=="0"]<-1
  time_trend$change <- c(NA,diff(log(time_trend$hits)))
  set1=time_trend[which(weekdays(as.Date(time_trend$date, format = "%m/%d/%Y"))
                   %in% c('Monday','Tuesday', 'Wednesday', 'Thursday', 'Friday')), ]
  set1
}

# define time search intervals
stime="2004-01-02"
etime="2005-12-31"
times=seq.Date(as.Date(stime),as.Date(etime),by="6 months")
tims=sapply(1:(length(times)-1),function(z)paste(times[z],times[z+1],sep=" "))
# get data for each interval and use rbind to combine
set <- lapply(tims,function(zt)search6m(keywords,channel,country,time=zt))
set = do.call("rbind",set)

# do all the rest of your code
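
The two-year range above is only an example. For the full 2004-18 period from the question, the `time` strings could be generated along these lines. This is a minimal sketch: `make_windows` is a hypothetical helper, not part of gtrendsR, and it assumes windows that end the day before the next one starts, like the question's first window.

```r
# Build consecutive, non-overlapping windows of about six months each,
# formatted as "start end" strings for the `time` argument of gtrends().
make_windows <- function(start, end, step = "6 months") {
  start  <- as.Date(start)
  end    <- as.Date(end)
  starts <- seq(start, end, by = step)
  # Each window ends the day before the next one starts;
  # the last window ends at `end`, so the tail of the range is covered.
  ends <- c(starts[-1] - 1, end)
  paste(starts, ends)
}

windows <- make_windows("2004-01-01", "2018-12-31")
length(windows)  # 30 windows, one per half-year
windows[1]       # "2004-01-01 2004-06-30"
```

`windows` can then be passed to the `lapply` call above in place of `tims`. Since every window is downloaded separately, the first day of each window has an `NA` change and disappears in the later `na.omit` step.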
Robert
  • This does not work out since I have to do the rest of my code separately for each keyword as well. To put it simply, I need a way to run my whole code for 30 different keywords and have a vector "keywordname" <- adj_svi for each keyword as an output. – axor93 Nov 03 '19 at 14:53
  • Of course this is not the final answer! Have you tried testing the code with more than one keyword in keywords? Try it, and you will have the data for all keywords in set. Then you will need to iterate over set$keyword to do the rest. – Robert Nov 04 '19 at 10:26
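
To follow up on the comments: one way to run "the rest" separately for each keyword is to put it into a function and apply that function to each keyword's rows. The following is a minimal sketch, assuming the combined data frame has `date`, `keyword` and `change` columns and that `change` was computed within each keyword and window (a plain `diff` over several stacked keywords would mix them at the boundaries). `adjust_svi` is a hypothetical name, and `demo` holds made-up random numbers purely to exercise the function; it is not Google data.

```r
# Post-process the rows of ONE keyword: regress the log change on month and
# weekday dummies, keep the residuals, and scale them by their standard deviation.
adjust_svi <- function(df) {
  df <- df[!is.na(df$change), ]
  df$weekday <- factor(weekdays(as.Date(df$date)))
  df$month   <- factor(months(as.Date(df$date)))
  mod <- lm(change ~ month + weekday, data = df)
  res <- residuals(mod)
  unname(res / sd(res))
}

# Made-up stand-in for the combined `set` (weekdays of 2004, two keywords).
set.seed(1)
dates <- seq(as.Date("2004-01-01"), as.Date("2004-12-31"), by = "day")
dates <- dates[as.POSIXlt(dates)$wday %in% 1:5]  # Monday to Friday only
demo <- data.frame(
  date    = rep(dates, 2),
  keyword = rep(c("Charity", "Donation"), each = length(dates)),
  change  = c(NA, rnorm(length(dates) - 1), NA, rnorm(length(dates) - 1))
)

# One adjusted vector per keyword, collected in a named list.
adj <- lapply(split(demo, demo$keyword), adjust_svi)
```

The result is a named list, so `adj[["Charity"]]` is the adjusted vector for that keyword. With real data, `split(set, set$keyword)` would take the place of the `demo` split. If separate variables are really wanted, `list2env(adj, envir = globalenv())` should create one per keyword, though a list is usually easier to work with for 30 keywords.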