
I'm working on a Shiny app in R and trying to use the RDSTK package to reverse geocode a list of lat/lon pairs, extract the city from the JSON results, and save it to a list. The workflow is:

  1. SQLDF to select all records within a date range.
  2. Reverse geocode records and add column to data frame with the specific city.
  3. Use SQLDF again to get counts by city.

I'm having a lot of trouble understanding how to take the JSON output, convert it to a data frame, and then cbind it back to the original data frame. Any help would be much appreciated! See the code below for reference:

Data frame:

df <- data.frame(lat=c(34.048381, 37.757836, 40.729855, 42.356391),
             lon=c(-118.266164, -122.441033, -73.987921, -71.062307))

I was able to extract the city from the returned JSON list for a single pair, but I can't for the life of me figure out how to do it repeatedly for a larger list of lat/lon pairs. Searching Stack Overflow mainly turns up questions about using DSTK outside of R.

My ideal output would be:

lat        lon           city
34.048381  -118.266164   Los Angeles
37.757836  -122.441033   San Francisco
40.729855  -73.987921    New York
42.356391  -71.062307    Boston

I've also tried the example in "R: How to GeoCode a simple address using Data Science Toolbox", but I can't seem to re-engineer it for coordinates2politics.

Any input?

Bogdan Rau

2 Answers


FWIW, here's one simple alternative using the Google API:

library(ggmap)
res <- lapply(with(df, paste(lat, lon, sep = ",")), geocode, output = "more")
transform(df, city = sapply(res, "[[", "locality"))
# lat        lon          city
# 1 34.04838 -118.26616   los angeles
# 2 37.75784 -122.44103 san francisco
# 3 40.72986  -73.98792      new york
# 4 42.35639  -71.06231        boston
lukeA
  • Thanks lukeA! I did see that option, though I have hundreds of thousands of records so gmaps would shut me off very quickly. – Bogdan Rau Mar 23 '15 at 22:24

Sounds cool. I've had some trouble with RDSTK lately... I'm assuming the stock server is no longer working for you, as the author's blog describes. Too bad.

Here are two workarounds. First, you might be able to take the original lat/lon pairs, load the city places file from the Census TIGER files, run %over% from the sp package, and then pull the name from the returned shape. That should be faster than repeated calls to an API.
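To make the %over% idea concrete, here is a rough sketch, not a tested production setup. The two square "city" polygons and the make_square helper are made up for illustration; real boundaries would be read from a TIGER places shapefile, and the points and polygons would need to share a coordinate reference system (left unset here to keep the toy example small).

```r
library(sp)

# Hypothetical helper: a square polygon standing in for a city boundary.
# Real boundaries would come from a TIGER places shapefile instead.
make_square <- function(id, lon, lat, half = 0.5) {
  # Ring listed clockwise and closed (first point repeated at the end).
  coords <- cbind(
    c(lon - half, lon - half, lon + half, lon + half, lon - half),
    c(lat - half, lat + half, lat + half, lat - half, lat - half)
  )
  Polygons(list(Polygon(coords, hole = FALSE)), ID = id)
}

# Toy "places" layer; the NAME column plays the role of the place name field.
places <- SpatialPolygonsDataFrame(
  SpatialPolygons(list(
    make_square("la", -118.25, 34.05),
    make_square("sf", -122.44, 37.76)
  )),
  data = data.frame(NAME = c("Los Angeles", "San Francisco"),
                    row.names = c("la", "sf"),
                    stringsAsFactors = FALSE)
)

df <- data.frame(lat = c(34.048381, 37.757836, 40.729855),
                 lon = c(-118.266164, -122.441033, -73.987921))

# sp expects x = longitude, y = latitude.
pts <- SpatialPoints(df[, c("lon", "lat")])

# One row of polygon attributes per point; NA where no polygon contains it.
df$city <- (pts %over% places)$NAME
```

Points that fall outside every polygon (the third row here) come back as NA, so it is worth checking for those before counting by city.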

Second, I've got the same need for an open geocoder in R, and there are a few options. Check out ggmap, referenced in lukeA's answer: it can use DSTK (now defunct) and is a simple interface to the Google API if you only need a few calls. Also see this fantastic post describing how to use the Census Bureau's geocoder API. Write a little wrapper function to handle the JSON and you're good to go. That code worked for me as of 1/1/2016.
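For the "little wrapper function" part, a minimal sketch in base R might look like the following. Everything API-related is an assumption: fake_reverse_geocode is a hypothetical stand-in for whatever service you end up calling, and the nested-list shape it returns is invented, so the extraction step would need adjusting to the real parsed JSON. The point is only the pattern: one function that always returns a single string (or NA), applied row by row, with the result attached as a column.

```r
# Hypothetical stand-in for a real reverse-geocoding call. A real version
# would query an API and parse its JSON; this list structure is made up.
fake_reverse_geocode <- function(lat, lon) {
  known <- data.frame(
    lat  = c(34.048381, 37.757836, 40.729855, 42.356391),
    lon  = c(-118.266164, -122.441033, -73.987921, -71.062307),
    city = c("Los Angeles", "San Francisco", "New York", "Boston"),
    stringsAsFactors = FALSE
  )
  hit <- known$city[known$lat == lat & known$lon == lon]
  if (length(hit) == 0) return(list(result = list()))
  list(result = list(list(type = "city", name = hit[1])))
}

# Wrapper: always returns exactly one string, NA when the call fails or
# no city-level entry is present, so results line up with the input rows.
get_city <- function(lat, lon, geocoder = fake_reverse_geocode) {
  parsed <- tryCatch(geocoder(lat, lon), error = function(e) NULL)
  cities <- Filter(function(x) identical(x$type, "city"), parsed$result)
  if (length(cities) == 0) NA_character_ else cities[[1]]$name
}

df <- data.frame(lat = c(34.048381, 37.757836, 40.729855, 42.356391),
                 lon = c(-118.266164, -122.441033, -73.987921, -71.062307))

# Apply the wrapper to every lat/lon pair and attach the result as a column.
df$city <- mapply(get_city, df$lat, df$lon, USE.NAMES = FALSE)

# Counts by city (the base R equivalent of a GROUP BY in sqldf).
city_counts <- as.data.frame(table(city = df$city), stringsAsFactors = FALSE)
```

Because get_city never returns more or fewer than one value, the new column always has the same length as the data frame, which avoids the cbind step entirely.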

Mike Dolan Fliss
  • I appreciate that, Drew, thanks. I'll edit. Plus, I have a workable version using something other than RDSTK. – Mike Dolan Fliss Jan 01 '16 at 23:36
  • I voted this as an answer, although I've given up on RDSTK or other API-based reverse geocoders. I decided to just build one within R that uses TIGER files and runs as a cron job in R. Does a much better job when running in bulk. – Bogdan Rau Jan 03 '16 at 06:29
  • Great, glad you found something that worked. Can I ask: can you elaborate on your cron job solution? You're just rerunning the R code in batches over time? – Mike Dolan Fliss Jan 04 '16 at 15:09
  • @MikeDolanFliss Apologies...just saw this comment right now. Yes, I have a file that runs every x minutes that takes in new data, does the reverse geocoding, and appends it to a different data set. I used the %over% function in SP just as you suggested. This seemed to be the most elegant solution. – Bogdan Rau May 18 '16 at 15:24