
I have collected users' location data from Twitter and am trying to plot it on a map in R. The problem is that some users have entered invalid or nonsensical locations, which causes the geocode function to fail. How can I avoid this failure? Is there a way to check for this error case and skip the lookup? For example, the location data in a file such as geocode9.csv looks like this:

available locations, Buffalo, New York, thsjf, Washington, USA Michigan, nkjnt, basketball, ejhrbvw

library(ggmap)
fileToLoad <- file.choose(new = TRUE)
origAddress <- read.csv(fileToLoad, stringsAsFactors = FALSE)
geocoded <- data.frame(stringsAsFactors = FALSE)
for(i in 1:nrow(origAddress))
{

  result <- geocode(origAddress$available_locations[i], output = "latlona", source = "google")
  origAddress$lon[i] <- as.numeric(result[1])
  origAddress$lat[i] <- as.numeric(result[2])
  origAddress$geoAddress[i] <- as.character(result[3])

}
write.csv(origAddress, "geocoded.csv", row.names=FALSE)

When the loop reaches "thsjf" in the list of locations, it throws an error. How can I get past this error? I want something like if(false){ # do not run geocode function }.

  • Can you somehow catch this error? The normal workflow for geocode is that users would only be allowed to submit addresses chosen from the geocode service itself. Hence, the problem you have could never really happen. – Tim Biegeleisen Jan 15 '18 at 01:56
  • @TimBiegeleisen That is what I am trying to do, but I cannot catch this error. I have tried if(geocode(origAddress$available_locations[i], output = "latlona", source = "google")){ //find the coordinates}. That also fails, with an error saying that the condition inside if is not a logical value. – Redwan Khan Jan 15 '18 at 01:59
  • So you have tried using try catch? Google's geocode service would just return an empty JSON response (I think) for a bad address. – Tim Biegeleisen Jan 15 '18 at 02:02
  • Yes, I used tryCatch, but that did not solve the problem. I think I found a way around it, though. The trick is to leave out this line: origAddress$geoAddress[i] <- as.character(result[3]). I don't know why, but without that line the invalid addresses are written as NA in the processed file. Thanks for the suggestion though! :D – Redwan Khan Jan 15 '18 at 02:20

1 Answer


I'm not sure how you could geocode those addresses if they are actually wrong. How would the machine even figure out what was meant? I think you need to get the addresses corrected, and THEN geocode everything. Here is some sample code.

#load ggmap
library(ggmap)

startTime <- Sys.time()

# Select the file from the file chooser
fileToLoad <- file.choose(new = TRUE)


# Read in the CSV data and store it in a variable 
origAddress <- read.csv(fileToLoad, stringsAsFactors = FALSE)


# Initialize the data frame
geocoded <- data.frame(stringsAsFactors = FALSE)


# Loop through the addresses to get the latitude and longitude of each address and add them to the
# origAddress data frame in new columns lat and lon.
# Note: the column name `addresses` must match the header in your CSV file.
for (i in seq_len(nrow(origAddress))) {
  # print("Working...")
  result <- geocode(origAddress$addresses[i], output = "latlona", source = "google")
  origAddress$lon[i] <- as.numeric(result[1])
  origAddress$lat[i] <- as.numeric(result[2])
  origAddress$geoAddress[i] <- as.character(result[3])
}


# Write a CSV file containing origAddress to the working directory
write.csv(origAddress, "geocoded.csv", row.names=FALSE)

endTime <- Sys.time()
processingTime <- endTime - startTime
processingTime

Check this for more info.

http://www.storybench.org/geocode-csv-addresses-r/
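
If the goal is simply to skip locations that cannot be geocoded, one option is to wrap the lookup so that a failure produces NA values instead of stopping the loop. The sketch below is only an illustration under a few assumptions: `safe_geocode` and its `geocode_fn` parameter are hypothetical names, `fake_geocode` is a made-up stand-in so the example runs without network access or an API key, and its coordinates are illustrative. It also assumes that a failed lookup may come back without an address column (only lon and lat, both NA), which would be consistent with the workaround in the comments where dropping the `result[3]` line made the error go away.

```r
# Wrap any geocoder so that a failed lookup yields a row of NAs instead of an error.
# `geocode_fn` is a hypothetical parameter: any function that takes one address
# and returns a data frame with columns lon, lat and (optionally) address.
safe_geocode <- function(address, geocode_fn) {
  empty <- data.frame(lon = NA_real_, lat = NA_real_,
                      geoAddress = NA_character_, stringsAsFactors = FALSE)

  # Turn an error raised by the geocoder into NULL
  result <- tryCatch(geocode_fn(address), error = function(e) NULL)

  # Treat an error, an empty result, or missing coordinates as "not found"
  if (!is.data.frame(result) || nrow(result) == 0 ||
      !all(c("lon", "lat") %in% names(result)) || is.na(result$lon[1])) {
    return(empty)
  }

  # Read columns by name, and only use the address column if it is present
  geo_address <- if ("address" %in% names(result)) {
    as.character(result$address[1])
  } else {
    NA_character_
  }

  data.frame(lon = as.numeric(result$lon[1]),
             lat = as.numeric(result$lat[1]),
             geoAddress = geo_address,
             stringsAsFactors = FALSE)
}

# Stand-in geocoder for illustration only (not a real service).
fake_geocode <- function(address) {
  if (address == "Buffalo") {
    data.frame(lon = -78.88, lat = 42.89, address = "buffalo, ny, usa",
               stringsAsFactors = FALSE)
  } else {
    # Mimic a failed lookup: NA coordinates and no address column
    data.frame(lon = NA, lat = NA)
  }
}

locations <- c("Buffalo", "thsjf")
geocoded <- do.call(rbind, lapply(locations, safe_geocode, geocode_fn = fake_geocode))
geocoded$location <- locations
```

With ggmap loaded, the stand-in could be swapped for something like `function(a) geocode(a, output = "latlona", source = "google")`. Rows that come back as NA can then be dropped with `geocoded[!is.na(geocoded$lon), ]` before plotting.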
