I have the following data frame and used code from here:

from geopy.geocoders import Nominatim

data = {'lat1': [116.51172,116.51135,116.51135,116.51627,116.47186],
        'lon1': [39.92123,39.93883,39.93883,39.91034,39.91248]}  
  
# Create DataFrame  
df_test = pd.DataFrame(data)  
geolocator = Nominatim(user_agent="geoapiExercises")
location = geolocator.reverse(df_test['lat']+","+ df_test['lon'], language='en')

address = location.raw['address']

df_test['suburb']= address.get('suburb', '')
df_test['postcode']= address.get('postcode', '')
df_test['road']= address.get('road', '')

I want to get three features from the location; however, I got an error:

ufunc 'add' did not contain a loop with signature matching types (dtype('<U32'), dtype('<U32')) -> dtype('<U32')

Could you help me get the necessary information?

John Mayer

1 Answer

Use the pandas Series.apply method to call geolocator.reverse for each row of your dataframe.

geopy raised an error saying that latitude must be in the range [-90, 90], so I added latitude and longitude normalization.

import pandas as pd
from geopy.geocoders import Nominatim
from math import sin, asin, fmod, pi
geolocator = Nominatim(user_agent="geoapiExercises")

data = {'lat1': [116.51172,116.51135,116.51135,116.51627,116.47186],
        'lon1': [39.92123,39.93883,39.93883,39.91034,39.91248]}  
        
# Create DataFrame  
df = pd.DataFrame(data)  

# latitude and longitude normalization according to formulas found at
# https://stackoverflow.com/a/31119445/50065
df['lat'] = df['lat1'].apply(lambda lat: asin(sin((lat/180.0)*pi)) * (180.0/pi))
df['lon'] = df['lon1'].apply(lambda lon: fmod(lon - 180.0, 360.0) + 180.0)

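# Build the "lat,lon" query string that geolocator.reverse accepts
df['lat_lon'] = df['lat'].astype(str) + ',' + df['lon'].astype(str)

# One reverse-geocoding request per row
df['location'] = df['lat_lon'].apply(lambda lat_lon: geolocator.reverse(lat_lon, language='en'))

# reverse() may return None when nothing is found, so fall back to an empty dict
df['address'] = df['location'].apply(lambda loc: loc.raw['address'] if loc else {})

df['postcode'] = df['address'].apply(lambda addr: addr.get('postcode', 'no postcode'))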

Output:

lat1 lon1 lat lon lat_lon location address postcode
0 116.512 39.9212 63.4883 39.9212 63.488280000000024,39.92123000000001 Обозерское городское поселение, Plesetsky District, Arkhangelsk Oblast, Northwestern Federal District, 164254, Russia {'municipality': 'Обозерское городское поселение', 'county': 'Plesetsky District', 'state': 'Arkhangelsk Oblast', 'region': 'Northwestern Federal District', 'postcode': '164254', 'country': 'Russia', 'country_code': 'ru'} 164254
1 116.511 39.9388 63.4887 39.9388 63.488650000000014,39.938829999999996 Обозерское городское поселение, Plesetsky District, Arkhangelsk Oblast, Northwestern Federal District, 164254, Russia {'municipality': 'Обозерское городское поселение', 'county': 'Plesetsky District', 'state': 'Arkhangelsk Oblast', 'region': 'Northwestern Federal District', 'postcode': '164254', 'country': 'Russia', 'country_code': 'ru'} 164254
2 116.511 39.9388 63.4887 39.9388 63.488650000000014,39.938829999999996 Обозерское городское поселение, Plesetsky District, Arkhangelsk Oblast, Northwestern Federal District, 164254, Russia {'municipality': 'Обозерское городское поселение', 'county': 'Plesetsky District', 'state': 'Arkhangelsk Oblast', 'region': 'Northwestern Federal District', 'postcode': '164254', 'country': 'Russia', 'country_code': 'ru'} 164254
3 116.516 39.9103 63.4837 39.9103 63.48373,39.91033999999999 Обозерское городское поселение, Plesetsky District, Arkhangelsk Oblast, Northwestern Federal District, 164254, Russia {'municipality': 'Обозерское городское поселение', 'county': 'Plesetsky District', 'state': 'Arkhangelsk Oblast', 'region': 'Northwestern Federal District', 'postcode': '164254', 'country': 'Russia', 'country_code': 'ru'} 164254
4 116.472 39.9125 63.5281 39.9125 63.52813999999999,39.912480000000016 Обозерское городское поселение, Plesetsky District, Arkhangelsk Oblast, Northwestern Federal District, 164254, Russia {'municipality': 'Обозерское городское поселение', 'county': 'Plesetsky District', 'state': 'Arkhangelsk Oblast', 'region': 'Northwestern Federal District', 'postcode': '164254', 'country': 'Russia', 'country_code': 'ru'} 164254
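
A note on this result: the normalization produces valid coordinates, but it folds a value of about 116.5 into about 63.49, which is a different point on the globe. The values in lat1 (around 116.5) and lon1 (around 39.9) look as if they may be a longitude and a latitude stored under swapped names, since a latitude near 39.9 with a longitude near 116.5 lies in China, where the asker expects the points to be. A minimal sketch under that assumption follows. The helper name to_lat_lon is hypothetical and not part of geopy or pandas. It swaps a pair only when the first value cannot be a latitude and the second can.

```python
def to_lat_lon(first, second):
    """Return (lat, lon), swapping the inputs if they look reversed.

    Assumption: a value outside [-90, 90] cannot be a latitude, so if only
    the second value fits that range, the pair was probably stored swapped.
    """
    if -90.0 <= first <= 90.0:
        return first, second          # already a plausible (lat, lon)
    if -90.0 <= second <= 90.0 and -180.0 <= first <= 180.0:
        return second, first          # looks like (lon, lat): swap
    raise ValueError(f"cannot interpret ({first}, {second}) as coordinates")


# First row of the question's data: 'lat1' = 116.51172, 'lon1' = 39.92123
lat, lon = to_lat_lon(116.51172, 39.92123)
query = f"{lat},{lon}"   # "lat,lon" string, the same form passed to geolocator.reverse above
print(query)
```

If the columns really are swapped, the same effect on the whole DataFrame would come from building the query string from lon1 first and lat1 second, with no normalization step.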
BioGeek
  • Thank you for this approach; however, my points show China, while yours show Saudi Arabia. Why is that the case? – John Mayer Apr 29 '21 at 13:03
  • Because you need to do proper [latitude normalization](https://stackoverflow.com/questions/50953375/longitude-formatting-scale-for-calculating-distance-with-geopy) instead of simply subtracting 90 like I did. – BioGeek Apr 29 '21 at 13:11
  • But do you know whether it is possible to do this without the apply function? I have many rows, and the code seems to be slow even for 10000 rows. – John Mayer Apr 29 '21 at 13:36
  • Note that according to the [Nominatim usage policy](https://operations.osmfoundation.org/policies/nominatim/), you should only make one request per second and do no bulk geocoding of large amounts of data. – BioGeek Apr 29 '21 at 13:40
  • I have added latitude and longitude normalization, but now the addresses are in Russia.... – BioGeek Apr 29 '21 at 13:44
  • Okay, thank you. But do you know whether it is possible to do this without the apply function? I have many rows, and the code seems to be slow even for 10000 rows. – John Mayer Apr 29 '21 at 14:02
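
On the speed question in the comments: the time is most likely spent waiting on one network request per row rather than in apply itself, and the usage policy linked above limits requests to one per second, so vectorizing the call would probably not help. What can reduce the work is querying each distinct coordinate pair only once (the sample data already contains a repeated row). A rough sketch follows, assuming a hypothetical helper reverse_unique that accepts any reverse-geocoding callable such as geolocator.reverse. A stand-in function is used here so the example runs without network access.

```python
import time


def reverse_unique(coords, reverse, min_delay_seconds=1.0, sleep=time.sleep):
    """Reverse-geocode each distinct (lat, lon) pair once, pausing between calls.

    coords  -- iterable of (lat, lon) tuples, duplicates allowed
    reverse -- callable taking a "lat,lon" string (e.g. geolocator.reverse)
    Returns a dict mapping each distinct pair to whatever reverse returned.
    """
    results = {}
    for pair in coords:
        if pair in results:
            continue                      # already looked up: no new request
        if results:
            sleep(min_delay_seconds)      # throttle every call after the first
        lat, lon = pair
        results[pair] = reverse(f"{lat},{lon}")
    return results


# Demo with a stand-in for geolocator.reverse (hypothetical, for illustration).
calls = []


def fake_reverse(query):
    calls.append(query)
    return {"query": query}


pairs = [(39.92123, 116.51172), (39.93883, 116.51135), (39.93883, 116.51135)]
lookup = reverse_unique(pairs, fake_reverse, min_delay_seconds=0.0)
print(len(pairs), "rows ->", len(calls), "requests")
```

The resulting dict can be mapped back onto the rows afterwards. geopy also ships geopy.extra.rate_limiter.RateLimiter, which wraps a geocoder call with a minimum delay, for example RateLimiter(geolocator.reverse, min_delay_seconds=1). Neither approach lifts the policy's restriction on bulk geocoding.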