6

I am trying to geocode a CSV file that contains the name of each location and a parsed-out address (address number, street name, city, zip, country). I want to use the ArcGIS geocoding service through geopy. The code should loop through my CSV of 5000+ entries and write the latitude and longitude to separate columns in the CSV. Can anyone provide some code to get me started? Thanks!

Here is my script:

import csv
from geopy.geocoders import ArcGIS


geolocator = ArcGIS()     # here some parameters are needed

with open('C:/Users/v-albaut/Desktop/Test_Geo.csv', 'rb') as csvinput:
    with open('output.csv', 'w') as csvoutput:
        output_fieldnames = ['Name','Address', 'Latitude', 'Longitude']
        writer = csv.DictWriter(csvoutput, delimiter=',', fieldnames=output_fieldnames)
        reader = csv.DictReader(csvinput)

        for row in reader:
            # here you have to replace the dict item by your csv column names
            query = ','.join(str(x) for x in (row['Name'], row['Address']))
            Address, (latitude, longitude) = geolocator.geocode(query)

            # here is the writing section
            output_row = {}
            output_row['Name'] = Name
            output_row['Address'] = Address
            output_row['Latitude'] = Latitude
            output_row['Longitude'] =Longitude
            writer.writerow(output_row)
Kane Blueriver
Gonzalo68

2 Answers

3

This is just a beginning; tell me if it helps. It does not write to the CSV, but I'll edit my answer later if you need that part too.

import csv
from geopy.geocoders import ArcGIS

geolocator = ArcGIS()  # here some parameters are needed

# Python 3: open CSV files in text mode with newline='' (Python 2 used 'rb')
with open('C:/Users/v-albaut/Desktop/Test_Geo.csv', newline='') as csvinput:
    with open('output.csv', 'w', newline='') as csvoutput:
        output_fieldnames = ['Name', 'Address', 'Latitude', 'Longitude']
        writer = csv.DictWriter(csvoutput, delimiter=',', fieldnames=output_fieldnames)
        writer.writeheader()  # write the column names as the first row
        reader = csv.DictReader(csvinput)
        for row in reader:
            # here you have to replace the dict keys with your csv column names
            query = ','.join(str(x) for x in (row['Name'], row['Address']))

            try:
                # geocode() returns None when nothing matches, so unpacking raises TypeError
                address, (latitude, longitude) = geolocator.geocode(query)
            except Exception:
                # no match or a geocoder error: mark the row instead of stopping
                latitude = 'N/A'
                longitude = 'N/A'

            # here is the writing section
            output_row = {}
            output_row['Name'] = row['Name']
            output_row['Address'] = row['Address']
            output_row['Latitude'] = latitude
            output_row['Longitude'] = longitude
            writer.writerow(output_row)

doc:

Below the Radar
  • Hi, if you can provide the writing part too, that would be great! – Gonzalo68 Jul 06 '15 at 20:15
  • I have added the writing part; can you test it and tell me if it works? Note that you will have to provide the right parameters to the ArcGIS geocoder object to make it work – Below the Radar Jul 07 '15 at 12:17
  • I am getting a KeyError even though I have put the right column name in the script. What should I do? – Gonzalo68 Jul 07 '15 at 16:21
  • Where does the error occur? Make sure that the first line of the CSV contains the column names and that the letter case is correct – Below the Radar Jul 07 '15 at 16:32
  • I corrected that error, but now I am getting errors when saving the CSV: the output part says that address, latitude and longitude are not defined. – Gonzalo68 Jul 07 '15 at 16:49
  • Also, I checked whether the geocoding is working. It looks like it is just printing the address, without the geocode that provides the lat and long coordinates. – Gonzalo68 Jul 07 '15 at 16:55
  • Can you edit your question and paste the entire script with the modifications you made, please? – Below the Radar Jul 07 '15 at 18:44
  • I added my script above. Thanks – Gonzalo68 Jul 07 '15 at 20:06
  • I have edited my answer to correct minor errors in your script. However, if the geocoder does not return lat and lon, I suspect you provided the wrong parameters to the geolocator. Default parameters are `ArcGIS(username=None, password=None, referer=None, token_lifetime=60, scheme='https', timeout=1, proxies=None)` – Below the Radar Jul 08 '15 at 12:37
  • I was able to geocode, but when doing a second run for another CSV I get this error: TypeError: 'NoneType' object is not iterable – Gonzalo68 Jul 08 '15 at 17:36
  • On this line: `address, (latitude, longitude) = geolocator.geocode(query)` – Gonzalo68 Jul 08 '15 at 17:55
  • Do a `print query` before this line to see if it's well formed – Below the Radar Jul 08 '15 at 17:56
  • It is, but it stops at the second address and raises that error. How do I fix that? – Gonzalo68 Jul 08 '15 at 17:58
  • Probably the geocoder can't geocode this address, so use the try/except to set this address's lat/lon to 'N/A' or another value – Below the Radar Jul 08 '15 at 19:43
  • Is there a way to solve the same thing without using `arcgis` ? http://stackoverflow.com/questions/44061984/get-latitude-longitude-from-address-geopandas – Sitz Blogz May 19 '17 at 09:00
  • Yes, geopy includes geocoder classes for the OpenStreetMap Nominatim, ESRI ArcGIS, Google Geocoding API (V3), Baidu Maps, Bing Maps API, Mapzen Search, Yandex, IGN France, GeoNames, NaviData, OpenMapQuest, What3Words, OpenCage, SmartyStreets, geocoder.us, and GeocodeFarm geocoder services. The various geocoder classes are located in geopy.geocoders. – Below the Radar May 19 '17 at 12:02
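
The `TypeError: 'NoneType' object is not iterable` discussed in the comments above comes from unpacking the result of `geocode()`, which geopy returns as `None` when nothing matches. As an alternative to try/except, here is a minimal sketch of an explicit check. The helper names `coords_or_na` and `arcgis_geocode` are hypothetical (not part of geopy), and the `timeout=10` value is only an assumption to give the service more time than the default.

```python
def coords_or_na(geocode, query, missing="N/A"):
    """Return (latitude, longitude) for query, or (missing, missing) if unmatched.

    `geocode` is any callable that returns an object with `.latitude` and
    `.longitude` attributes, or None when the address cannot be matched.
    """
    location = geocode(query)
    if location is None:  # no match: avoid unpacking None
        return missing, missing
    return location.latitude, location.longitude


def arcgis_geocode():
    """Hypothetical convenience wrapper: build an ArcGIS geocode callable."""
    from geopy.geocoders import ArcGIS  # imported lazily so the helper above stays testable offline
    return ArcGIS(timeout=10).geocode  # timeout value is an assumption
```

Inside the loop this would be used as `latitude, longitude = coords_or_na(geocode, query)`, with `geocode = arcgis_geocode()` created once before the loop.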
3

I've been using this script to do some batch geocoding from a .csv. It requires that one column (named 'Address' in the code) contain the complete text address that you wish to geocode, and that one column be titled 'UniqueID' and hold a unique identifier for each item in the .csv. It will also print out a list of any addresses that it failed to geocode, and it does a quick check to see whether the zip code might be incorrect and throwing off the geocoding:

import os

import numpy as np
import pandas as pd
from geopy.geocoders import ArcGIS


def main(path, filename):
    # path: folder where your .csv lives; filename: the name of the csv
    Target_Addresses = pd.read_csv(os.path.join(path, filename))
    Target_Addresses['Lat'] = np.nan
    Target_Addresses['Long'] = np.nan
    Indexed_Targets = Target_Addresses.set_index('UniqueID')

    geolocator = ArcGIS()  # some parameters here
    Fails = []
    for index, row in Indexed_Targets.iterrows():
        Address = row['Address']
        Result = geolocator.geocode(Address)
        if Result is None:
            # retry without the last 7 characters, in case the zip code is throwing off the geocoding
            Result = geolocator.geocode(Address[:-7])
        if Result is None:
            Fails.append(Address)  # append is a method call, not a subscript
        else:
            # DataFrame.at replaces set_value, which was removed in pandas 1.0
            Indexed_Targets.at[index, 'Lat'] = Result.latitude
            Indexed_Targets.at[index, 'Long'] = Result.longitude
    for address in Fails:
        print(address)
    Indexed_Targets.to_csv(filename[:-4] + "_RESULTS.csv")


if __name__ == '__main__':
    main(path, filename)  # whatever these are for you...

This will output a new CSV with "_RESULTS" appended to the name (e.g., an input of 'addresses.csv' will output 'addresses_RESULTS.csv'), containing two new columns, 'Lat' and 'Long'.
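
The zip-code retry in the script drops a fixed number of trailing characters, which only works when every address ends the same way. A hedged alternative sketch strips a trailing US-style ZIP (five digits, optionally followed by "-" and four more) with a regular expression. The function names and the ZIP pattern are assumptions for illustration, not part of the script above or of geopy.

```python
import re

# Assumption: addresses end in a US-style ZIP such as "12345" or "12345-6789".
TRAILING_ZIP = re.compile(r"[\s,]*(?<!\d)\d{5}(?:-\d{4})?\s*$")


def strip_trailing_zip(address):
    """Return the address without a trailing ZIP code (unchanged if none is found)."""
    return TRAILING_ZIP.sub("", address)


def geocode_with_zip_fallback(geocode, address):
    """Try the full address first, then retry once without the ZIP.

    `geocode` is any callable that returns a location or None, for example
    the `geocode` method of a geopy geocoder instance.
    """
    result = geocode(address)
    if result is None:
        shorter = strip_trailing_zip(address)
        if shorter != address:  # only retry when something was actually removed
            result = geocode(shorter)
    return result
```

In the loop, `Result = geocode_with_zip_fallback(geolocator.geocode, Address)` would replace the two separate `geocode` calls.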

GabeFS
  • I tried to use this code. It just prints the address from the very first row, and no "_RESULTS.csv" file is created. Please help, thanks – Sourabh Choudhary Jun 28 '16 at 06:40
  • @SourabhChoudhary can you give some more info on how your input addresses are organized? And what kind of error is it producing? – GabeFS Jun 29 '16 at 13:22