
I'm using the Yelp API in R to pull down some businesses. According to the documentation, each API call returns at most 20 businesses, but the offset= parameter lets you page through further results.

What I'm trying to do is write a simple loop that makes multiple API calls, incrementing the value of the offset= parameter each time.

For example, the first API call would look like this:

yelpURL <- paste0("http://api.yelp.com/v2/search/?limit=20&offset=0&sort=0&term=food&location=Chicago")

The next call would have offset=20, then 40, 60, 80, and so on. I'm not sure how to write this. I'd like to pull down the maximum number of businesses, which I believe is 1,000, and have them added to a single data frame. Here's my full code:

# yelp credentials
consumerKey <- "xxxxxxx"
consumerSecret <- "xxxxxxx"
token <- "xxxxxxx"
tokenSecret <- "xxxxxxx"

library(httr)
library(jsonlite)

# sign requests with OAuth 1.0
myApp <- oauth_app("YELP", key = consumerKey, secret = consumerSecret)
mySignature <- sign_oauth1.0(myApp, token = token, token_secret = tokenSecret)

# a single call: first page of 20 results
yelpURL <- "http://api.yelp.com/v2/search/?limit=20&offset=0&sort=0&term=food&location=Chicago"
locationData <- GET(yelpURL, mySignature)

# parse the response and convert it to a data frame
locationDataContent <- content(locationData)
locationList <- jsonlite::fromJSON(jsonlite::toJSON(locationDataContent))
results <- data.frame(locationList)
davids12

1 Answer

A general approach for your "query loop" is to fetch all of the URLs into a list, convert each JSON response into a data frame, and finally bind the listed data frames into one combined data frame:

# One GET request per offset; seq(0, 60, 20) yields offsets 0, 20, 40, 60
locationDataList.raw <- lapply(sprintf("http://api.yelp.com/v2/search/?limit=20&offset=%d&sort=0&term=food&location=Chicago",
                                       seq(0, 60, 20)),
                               GET, mySignature)
# Parse each response and convert it to a data frame
locationDataList <- lapply(locationDataList.raw, function(locationData) {
  locationDataContent <- content(locationData)
  locationList <- jsonlite::fromJSON(jsonlite::toJSON(locationDataContent))
  data.frame(locationList)
})
# Stack the per-call data frames into one
result <- do.call(rbind, locationDataList)
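
To go beyond the four calls above, the offset sequence can be extended. The following is only a sketch of how the URLs might be generated for the full range; it assumes the 1,000-business cap mentioned in the question is accurate, and the names pageSize, maxResults and yelpURLs are made up for illustration. It builds the URLs only and makes no requests.

```r
# Sketch: build one search URL per page of 20 results.
pageSize   <- 20
maxResults <- 1000   # assumption taken from the question, not verified here

# Offsets 0, 20, 40, ..., 980: one per API call
offsets <- seq(0, maxResults - pageSize, by = pageSize)

# sprintf() is vectorised, so this returns one URL per offset
yelpURLs <- sprintf(
  "http://api.yelp.com/v2/search/?limit=%d&offset=%d&sort=0&term=food&location=Chicago",
  pageSize, offsets
)

length(yelpURLs)   # number of calls that would be needed
head(yelpURLs, 2)  # first two URLs, for a quick visual check
```

The resulting vector could be passed to lapply() in place of the sprintf() call above. With this many requests it may be worth checking each response's status before parsing it.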

However, to have them "added to a single data frame", you will probably have to flatten/tidy your data before merging with rbind, e.g. by selecting only the columns of interest. But that would be another story.
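
As a rough illustration of "select columns of interest, then rbind", here is a minimal sketch using small made-up data frames in place of real API pages. The column names (name, rating, review_count, is_closed) and the objects page1, page2 and keep are assumptions chosen for the example; real responses contain more fields, some of them nested.

```r
# Hypothetical stand-ins for two parsed API pages. The second page has an
# extra column, so a plain rbind() of the two would fail.
page1 <- data.frame(name = c("A", "B"), rating = c(4.5, 4.0),
                    review_count = c(120, 85), stringsAsFactors = FALSE)
page2 <- data.frame(name = "C", rating = 3.5, review_count = 40,
                    is_closed = FALSE, stringsAsFactors = FALSE)
pages <- list(page1, page2)

# Columns of interest (illustrative choice)
keep <- c("name", "rating", "review_count")

# Keep only those columns on every page, then stack the pages
trimmed <- lapply(pages, function(df) df[, keep, drop = FALSE])
result  <- do.call(rbind, trimmed)

result
```

For nested columns, jsonlite::flatten() or fromJSON(..., flatten = TRUE) may be a useful first step, though whether that is enough depends on the shape of the actual response.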

lukeA
  • Thanks. It looks like flattening the data will be a bit of a pain. I suppose separate data frames would be fine. How would you suggest I organize the results of each API call? – davids12 Feb 10 '15 at 14:33
  • Tbh, I'd save `locationDataList.raw` and `locationDataList`, put the file on Dropbox or something, and post a new question which references the data and asks how to `rbind` the data frames. I've stumbled over similar JSON problems but never found a perfect solution. However, I'm sure the pros have got one. – lukeA Feb 10 '15 at 15:10