As I suggested in my comment, you may be better off using the GH API from one of the R packages that implements it. However, if you are determined to build it from scratch, the following code:
- uses the built-in JSON->R decoding that `httr` gives you for free
- checks for valid response codes
- accounts for potentially missing fields in the return value
- uses `data.table` for both efficiency and easier handling of data frame building

It also gives you progress bars for free with `pbapply`.

    library(httr)
    library(data.table)
    library(pbapply)

    get_data <- function(start, end) {
      base_url <- 'https://api.github.com/users/%d/repos'
      pblapply(start:end, function(i) {
        resp <- GET(sprintf(base_url, i))
        warn_for_status(resp)
        if (status_code(resp) == 200) {
          dat <- content(resp, as="parsed")
          # A repo may lack a field (e.g. no detected language); use NA for it
          data.table(
            name=sapply(dat, function(x) ifelse(is.null(x[["name"]]), NA, x[["name"]])),
            language=sapply(dat, function(x) ifelse(is.null(x[["language"]]), NA, x[["language"]]))
          )
        } else {
          # Same column order as the success branch: rbindlist() binds by position
          data.table(name=NA, language=NA)
        }
      })
    }

    gh <- rbindlist(get_data(1, 6))
    gh
    ##                       name     language
    ##  1: python-youtube-library       Python
    ##  2:                      t           NA
    ##  3:               dotfiles         VimL
    ##  4:               pair-box           NA
    ##  5:           6.github.com   JavaScript
    ##  6:             AndAnd.Net           C#
    ##  7:         backbone-tunes   JavaScript
    ##  8:            battletower CoffeeScript
    ##  9:              BeastMode         Ruby
    ## 10:   blurry_search.coffee   JavaScript
    ## 11:              bootstrap          CSS
    ## 12:     browser-deprecator   JavaScript
    ## 13:            classify.js   JavaScript
    ## 14:          cocoa-example  Objective-C
    ## 15:               Colander CoffeeScript
    ## 16:        comic_reader.js   JavaScript
    ## 17:            crawl-tools       Python
    ## 18:            CS-Projects       Python
    ## 19:                cssfast CoffeeScript
    ## 20:               danbooru         Ruby
    ## 21:                    Dex CoffeeScript
    ## 22:             dnode-ruby         Ruby
    ## 23:             domain-gen         Ruby
    ## 24:            domainatrix         Ruby
    ## 25:                Doodler         Java
    ## 26:               dotfiles         VimL
    ## 27:                 dothis         Ruby
    ## 28:             elixir-web       Elixir
    ## 29:           faster_manga CoffeeScript
    ## 30:                 favmix         Java
    ## 31:                 fluent         Ruby
    ## 32:       fluid-image-grid   JavaScript
    ## 33:               freeform         Ruby
    ## 34:           FreeYourCode         Ruby
    ##                       name     language

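
The missing-field handling in the code above can also be pulled out into a small helper, which makes it easy to try offline. This is only a sketch: `field_or_na` and the mock `repos` list are hypothetical names and data made up for illustration, standing in for what `content(resp, as="parsed")` returns.

```r
# Hypothetical helper: return one field of a parsed repo record,
# or NA when the API sent null (which arrives in R as NULL).
field_or_na <- function(x, field) {
  value <- x[[field]]
  if (is.null(value)) NA_character_ else as.character(value)
}

# Mock of a parsed response; the second repo has no detected language.
repos <- list(
  list(name = "dotfiles", language = "VimL"),
  list(name = "pair-box", language = NULL)
)

# vapply() guarantees a character vector, one entry per repo.
repo_names <- vapply(repos, field_or_na, character(1), field = "name")
repo_langs <- vapply(repos, field_or_na, character(1), field = "language")
```

Unlike `sapply`, `vapply` fails loudly if a field ever comes back as something other than a single string, so a surprise in the payload does not silently change the column type.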
Go easy on the free API access. This code will warn you if it gets a 403 but keep processing, so you'll end up with incorrect `NA`s. You can change that by using `stop_for_status` instead of `warn_for_status`, or just test and stop on your own.
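
If you want to test and stop on your own, one possible shape is a small check on the status code. This is a sketch under assumptions: `check_status` is a hypothetical helper, and treating every 403 as a rate-limit hit is a guess, since a 403 can have other causes.

```r
# Hypothetical helper: decide what to do with a response from its status code.
# Returns TRUE when the body is usable, FALSE (with a warning) for other
# failures, and aborts the whole run on a 403.
check_status <- function(code, user_id) {
  if (code == 200L) {
    return(TRUE)
  }
  if (code == 403L) {
    # Assumption: 403 here most likely means the rate limit was hit,
    # so carrying on would only produce misleading NA rows.
    stop(sprintf("HTTP 403 for user %d: probably rate limited, stopping", user_id),
         call. = FALSE)
  }
  warning(sprintf("HTTP %d for user %d: skipping", code, user_id), call. = FALSE)
  FALSE
}
```

Inside `get_data` this would replace the `warn_for_status(resp)` call and the `status_code(resp) == 200` test with `if (check_status(status_code(resp), i))`.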
IMO it would be far more advantageous to use the authenticated API access.
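
A minimal sketch of what authenticated access could look like with `httr`, assuming you have a personal access token stored in an environment variable. The variable name `GITHUB_PAT` and the helpers `gh_auth_header` and `get_repos_auth` are assumptions for illustration, not part of the code above. GitHub documents a much higher hourly rate limit for authenticated requests than for anonymous ones.

```r
# Build the Authorization header from a personal access token.
# GITHUB_PAT is an assumed environment variable name.
gh_auth_header <- function(token = Sys.getenv("GITHUB_PAT")) {
  if (!nzchar(token)) {
    stop("No token found; set GITHUB_PAT first", call. = FALSE)
  }
  c(Authorization = paste("token", token))
}

# Hypothetical authenticated variant of the request made in get_data().
# Needs the httr package installed; nothing is sent until it is called.
get_repos_auth <- function(user_id, token = Sys.getenv("GITHUB_PAT")) {
  url <- sprintf("https://api.github.com/users/%d/repos", user_id)
  httr::GET(url, httr::add_headers(.headers = gh_auth_header(token)))
}
```

With this in place, `GET(sprintf(base_url, i))` in `get_data` could become `get_repos_auth(i)`. Keep the token out of your script and out of version control.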