I am trying to download data for an analysis from Google Trends using gtrendsR. My keyword is the German word "Nachrichten", which means "news" in English. Most of it already works, but some of the downloads fail. I have defined a series of three-month periods for which the daily data should be downloaded. I want to save each of these periods individually as a .csv file, and also merge all the data into one big table (here: trends_Nachrichten). The problem is that individual three-month periods consistently fail to download, and this message appears instead: "Error message: NA/NaN argument". The failing period is not the same for every keyword; it changes when I use other search terms (e.g. weather instead of news). For my analysis I need all data between 2004-01-01 and 2012-06-30. Here is my code:

# define time periods for the download
time = c("2004-01-01 2004-03-31", "2004-04-01 2004-06-30", "2004-07-01 2004-09-30", "2004-10-01 2004-12-31",
         "2005-01-01 2005-03-31", "2005-04-01 2005-06-30", "2005-07-01 2005-09-30", "2005-10-01 2005-12-31",
         "2006-01-01 2006-03-31", "2006-04-01 2006-06-30", "2006-07-01 2006-09-30", "2006-10-01 2006-12-31",
         "2007-01-01 2007-03-31", "2007-04-01 2007-06-30", "2007-07-01 2007-09-30", "2007-10-01 2007-12-31",
         "2008-01-01 2008-03-31", "2008-04-01 2008-06-30", "2008-07-01 2008-09-30", "2008-10-01 2008-12-31",
         "2009-01-01 2009-03-31", "2009-04-01 2009-06-30", "2009-07-01 2009-09-30", "2009-10-01 2009-12-31",
         "2010-01-01 2010-03-31", "2010-04-01 2010-06-30", "2010-07-01 2010-09-30", "2010-10-01 2010-12-31",
         "2011-01-01 2011-03-31", "2011-04-01 2011-06-30", "2011-07-01 2011-09-30", "2011-10-01 2011-12-31",
         "2012-01-01 2012-03-31", "2012-04-01 2012-06-30")
Sys.setenv(TZ = "Europe/Berlin")  # Set the timezone to 'Europe/Berlin'

# download data for "Nachrichten"
library(gtrendsR)    # gtrends()
library(data.table)  # data.table()
trends_Nachrichten = data.table()

for (i in time) {
  
  tryCatch({
    trends <- gtrends(keyword = c("Nachrichten"), 
                      time = i,
                      geo = "DE",
                      gprop = "web",
                      category = 0,
                      hl = "de-DE")
    
    trends_data <- as.data.frame(trends$interest_over_time)
    trends_data$date <- as.Date(trends_data$date)
    
    file_name = paste0("Nachrichten", i, ".csv")
    write.csv(trends_data, 
              file = paste0('/Users/...', file_name), 
              quote = TRUE, 
              row.names = FALSE)
    
    trends_Nachrichten = rbind(trends_Nachrichten, trends_data)
  }, error = function(e) {
    cat("Error message:", conditionMessage(e), "\n")
  })
}

What could be the problem? Does anyone have a solution?

Thank you in advance!

I already searched online for reasons and solutions, but couldn't really find anything that helped me.

RPV

1 Answer

Well, it looks weird... It does seem like it could be a package problem or an API parsing problem. The code fails only for some specific date ranges.

# define time periods for the download
l_time <- c("2004-01-01 2004-03-31", 
           "2004-04-01 2004-06-30", 
           "2004-07-01 2004-09-30", 
           "2004-10-01 2004-12-31", 
           "2005-01-01 2005-03-31", 
           "2005-04-01 2005-06-30", 
           "2005-07-01 2005-09-30", 
           "2005-10-01 2005-12-31", 
           "2006-01-01 2006-03-31",
           "2006-04-01 2006-06-30",
           "2006-07-01 2006-09-30", 
           "2006-10-01 2006-12-31", 
           "2007-01-01 2007-03-31",
           "2007-04-01 2007-06-30", 
           "2007-07-01 2007-09-30",
           "2007-10-01 2007-12-31",
           "2008-01-01 2008-03-31",
           "2008-04-01 2008-06-30",
           "2008-07-01 2008-09-30",
           "2008-10-01 2008-12-31",
           "2009-01-01 2009-03-31", 
           "2009-04-01 2009-06-30", 
           "2009-07-01 2009-09-30", 
           "2009-10-01 2009-12-31",
           "2010-01-01 2010-03-31",
           "2010-04-01 2010-06-30", 
           "2010-07-01 2010-09-30", 
           # "2010-10-01 2010-12-31",  # left out: this range triggers the error
           "2011-01-01 2011-03-31", 
           "2011-04-01 2011-06-30", 
           "2011-07-01 2011-09-30",
           "2011-10-01 2011-12-31",
           "2012-01-01 2012-03-31", 
           "2012-04-01 2012-06-30")
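As an aside, the periods do not have to be typed by hand. Here is a minimal sketch that builds the same kind of "start end" strings with seq(); the names make_quarters and l_time_all are my own and purely illustrative, and note that the result includes the problematic "2010-10-01 2010-12-31" range that I commented out above.

```r
# Build consecutive three-month "start end" strings for gtrends(time = ...).
# make_quarters is a hypothetical helper, not part of gtrendsR.
make_quarters <- function(from, to) {
  starts <- seq(as.Date(from), as.Date(to), by = "3 months")
  # Each period ends the day before the next one starts; the last ends at `to`.
  ends <- c(starts[-1] - 1, as.Date(to))
  paste(starts, ends)
}

l_time_all <- make_quarters("2004-01-01", "2012-06-30")
```

If `to` does not fall on the end of a quarter, the last period is simply shorter than three months.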

I wrapped the body of your for loop in a function so that single periods are easy to test.

f_gtrendsR <- function(v_time = "2004-01-01 2004-03-31",
                       v_keyword = "Nachrichten"){
    print(v_time)
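    # Request the interest over time for one period (German web search, all categories)
    trends <- gtrendsR::gtrends(keyword = v_keyword,
                                time = v_time,
                                geo = "DE",
                                gprop = "web",
                                category = 0,
                                hl = "de-DE")

    # data.table:: prefix so the function also works without library(data.table)
    k <- data.table::as.data.table(trends$interest_over_time)
    k$date <- as.Date(k$date)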

    return(k)
}

Then call the function for every period with lapply and combine all the results with do.call and rbind.

k <- do.call(rbind, lapply(l_time, f_gtrendsR, v_keyword = "Nachrichten"))
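One caveat with the call above: a single failing period aborts the whole lapply, so nothing is returned. A minimal sketch of a more tolerant version, which keeps the periods that worked and records the ones that failed, could look like this. The name collect_periods is my own (hypothetical), and fetch_fun stands for any function that takes a period string as its first argument, such as f_gtrendsR.

```r
# Run fetch_fun on each period; keep the successes and record the failures.
# collect_periods is a hypothetical helper, not part of gtrendsR.
collect_periods <- function(periods, fetch_fun, ...) {
  results <- lapply(periods, function(p) {
    tryCatch(
      fetch_fun(p, ...),
      error = function(e) {
        message("Failed for ", p, ": ", conditionMessage(e))
        NULL  # marks this period as failed
      }
    )
  })
  ok <- !vapply(results, is.null, logical(1))
  list(data = do.call(rbind, results[ok]),  # NULL if every period failed
       failed = periods[!ok])
}
```

With the function from above this would be called as collect_periods(l_time, f_gtrendsR, v_keyword = "Nachrichten"); the failed element then tells you which ranges still need attention.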

This runs fine for me, but note that it only does so without "2010-10-01 2010-12-31"!

Strangely, if I split the problematic range into two shorter ones, it works fine as well (note that the two ranges below overlap in November):

k1 <- f_gtrendsR(v_time = "2010-10-01 2010-11-30", v_keyword = "Nachrichten")
k2 <- f_gtrendsR(v_time = "2010-11-01 2010-12-31", v_keyword = "Nachrichten")

So, it's not a proper answer, but I guess it steers you closer to one. You could always do the manual job and change the problematic ranges, but I feel unhappy with such an approach.
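The manual splitting could probably be automated. Below is a rough sketch, under my own assumptions, that retries a failing period as two non-overlapping halves and keeps halving until a piece would drop below min_days. The name fetch_with_split and the min_days default are hypothetical, and fetch_fun is any function that takes a "start end" string as its first argument (f_gtrendsR would fit). I have not verified that every failing range recovers this way. Also, as far as I know Google Trends scales the hits within each request, so values from separately fetched pieces may not be directly comparable with those from a full three-month request.

```r
# Try fetch_fun on a "start end" period; on error, split the period at its
# midpoint and try each half, as long as both halves have >= min_days days.
# fetch_with_split is a hypothetical helper, not part of gtrendsR.
fetch_with_split <- function(period, fetch_fun, min_days = 14, ...) {
  tryCatch(
    fetch_fun(period, ...),
    error = function(e) {
      dates <- as.Date(strsplit(period, " ", fixed = TRUE)[[1]])
      n_days <- as.integer(dates[2] - dates[1]) + 1L
      if (n_days < 2L * min_days) stop(e)   # too short to split: give up
      mid <- dates[1] + n_days %/% 2L - 1L  # last day of the first half
      halves <- c(paste(dates[1], mid), paste(mid + 1, dates[2]))
      do.call(rbind, lapply(halves, fetch_with_split,
                            fetch_fun = fetch_fun, min_days = min_days, ...))
    }
  )
}
```

For the range that fails for me, fetch_with_split("2010-10-01 2010-12-31", f_gtrendsR, v_keyword = "Nachrichten") would first retry "2010-10-01 2010-11-15" and "2010-11-16 2010-12-31".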

Arthur Welle
  • Hey, thanks a lot! It's exactly that time frame that doesn't work for me as well and I also thought about correcting it manually, but wanted to ask for help or an automatic solution first, as I feel unhappy with that solution too! – RPV May 15 '23 at 19:10