Here is how I solved my problem.
It was really a two-step problem:
- Figuring out how to properly encode my query so it could be inserted into the curl call
- Creating a function that makes an API call for each date in a vector and appends the results to a data frame
Here is how I did it.
library(tidyverse)
library(jsonlite)
library(urltools)
library(httr)
# Function for pulling results for one query (each query covers one date)
get_newsriver_bydate <- function(query) {
  # Being kind to the free API - shout out to Elia at Newsriver, who has been ever patient
  # pb is the progress bar created in the global environment before this function is called
  pb$tick()$print()
  Sys.sleep(sample(seq(0.5, 2.5, 0.5), 1))
  # This is where I used the urltools package to URL-encode the query, as suggested by quartin
  url_base <- "https://api.newsriver.io/v2/search"
  create_curl_call <- url_base %>%
    param_set("query", url_encode(query)) %>%
    param_set("sortBy", "_score") %>%
    param_set("sortOrder", "DESC") %>%
    param_set("limit", "100")
  # I had most of this before; however, I changed my output to a tibble,
  # which is more versatile to work with
  get_curl <- GET(create_curl_call, add_headers(Authorization = api_key))
  curl_to_json <- content(get_curl, as = "text", encoding = "UTF-8")
  news_df <- fromJSON(curl_to_json, flatten = TRUE)
  news_df$discoverDate <- as.Date(news_df$discoverDate)
  # as_tibble() replaces the deprecated as.tibble()
  as_tibble(news_df)
}
# Set configuration and set API key
# Note: ssl_verifypeer = 0L turns off SSL certificate verification
set_config(config(ssl_verifypeer = 0L))
api_key <- "mykey"
#Set my vector of Dates
dates1 <- seq(as.Date("2017-09-01"), as.Date("2017-10-01"), by = "days")
#Set up my progress bar
pb <- progress_estimated(length(dates1))
#Use sprintf to build a vector of queries, one per date
query <- sprintf('text:"Canada" AND text:"Rocks" AND language:EN AND discoverDate:[%s TO %s]', dates1, dates1)
#Run the query and be patient
news_df <- map_df(query, get_newsriver_bydate, .id = "query")
As for my research method and how I came to solve these two problems:
Quartin suggested I look up the urltools package https://cran.rstudio.com/web/packages/urltools/index.html. This package helps you encode and decode URLs, and it provides various other functions that are fast and vectorised.
My next issue was getting my query correct. Here I simply looked up the API documentation, which I suggest anyone trying to pull from an API do. It may sound like a no-brainer, but I didn't give it a full read before posting my question.
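To illustrate the encoding step on its own, here is a minimal sketch using urltools with no network call. The single-day query string is just an example value taken from the code above, and the variable names (`encoded`, `decoded`, `request_url`) are only for illustration.

```r
library(urltools)

# Example query for a single day (same shape as the queries built above)
query <- 'text:"Canada" AND text:"Rocks" AND language:EN AND discoverDate:[2017-09-01 TO 2017-09-01]'

# Percent-encode the query so spaces and quotes are safe inside a URL
encoded <- url_encode(query)

# Decoding the encoded string gives back the original query
decoded <- url_decode(encoded)

# Attach the encoded query and another parameter to the base URL
request_url <- param_set("https://api.newsriver.io/v2/search", "query", encoded)
request_url <- param_set(request_url, "limit", "100")

print(request_url)
```

Printing `request_url` before sending anything is a cheap way to compare the request against the API documentation.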
To create the function, I used a number of previous answers, but the post below helped the most:
API Query for loop
This post helped me with the progress bar and with using the map function to get everything into one data frame.
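As a rough sketch of how the map function combines everything into one data frame, the example below swaps the real API call for a hypothetical stand-in, `fake_fetch()`, so it runs offline. The progress bar is left out to keep it short, and the column names `title` and `hits` are made up for illustration.

```r
library(purrr)
library(tibble)

# Hypothetical stand-in for the API call: returns one small tibble per query
fake_fetch <- function(query) {
  tibble(title = paste("Result for", query), hits = nchar(query))
}

# Three dates instead of a whole month, to keep the output small
dates <- seq(as.Date("2017-09-01"), as.Date("2017-09-03"), by = "days")

# sprintf() is vectorised, so this gives one query string per date
queries <- sprintf("discoverDate:[%s TO %s]", dates, dates)

# map_df() calls fake_fetch() once per query and row-binds the results;
# .id = "query" adds a column recording which input each row came from
results <- map_df(queries, fake_fetch, .id = "query")

print(results)
```

Because the input vector is unnamed, the `query` column holds the position of each query rather than the query text itself.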
There may very well be a better answer but this works for me so far.