
What do I have?

I have the following DataFrame:

    id       latitude        longitude
0    0       31.23333         34.78333
1    1         nan               nan
2    2       42.70704        -71.16311
.
.
.

What do I want to do?

I want to add a column with the country name for rows where the latitude/longitude aren't NaN:

    id       latitude        longitude     country 
0    0       31.23333         34.78333      Israel
1    1         nan               nan
2    2       42.70704        -71.16311        USA
.
.
.


What have I tried?

    import math

    from geopy.geocoders import Nominatim

    df['country'] = ''

    for index, row in df.iterrows():
        print((row['latitude'], row['longitude']))
        if math.isnan(row['latitude']) or math.isnan(row['longitude']):
            continue
        geolocator = Nominatim(user_agent="row_{}".format(index))
        location = geolocator.reverse((row['latitude'], row['longitude']))
        # iterrows() yields copies, so assigning to `row` would not update df
        df.at[index, 'country'] = location.raw['address']['country']

What is the problem?

I am getting the following error:

    requests.exceptions.SSLError: HTTPSConnectionPool(host='nominatim.openstreetmap.org', port=443): Max retries exceeded with url: /reverse?lat=31.23333&lon=34.78333&format=json&addressdetails=1 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1123)')))

How can I solve this problem?

Dean Taler

1 Answer


The geopy docs contain an example of usage with pandas: https://geopy.readthedocs.io/en/stable/#module-geopy.extra.rate_limiter

    import math

    import pandas as pd
    from geopy.geocoders import Nominatim
    from geopy.extra.rate_limiter import RateLimiter

    df = pd.DataFrame([
        {"id": 0, "latitude": 31.23333, "longitude": 34.78333},
        {"id": 1, "latitude": float("nan"), "longitude": float("nan")},
        {"id": 2, "latitude": 42.70704, "longitude": -71.16311},
    ])

    # Create the geocoder once and wait at least 1 second between requests
    geolocator = Nominatim(user_agent="specify_your_app_name_here")
    reverse = RateLimiter(geolocator.reverse, min_delay_seconds=1)


    def reverse_country(row):
        # Skip rows without coordinates
        if math.isnan(row['latitude']) or math.isnan(row['longitude']):
            return ''
        location = reverse((row['latitude'], row['longitude']), language='en')
        if location is None:
            return ''
        # Some results (e.g. open sea) may have no country in the address
        return location.raw.get('address', {}).get('country', '')


    df['country'] = df.apply(reverse_country, axis=1)

This would produce the following dataframe:

       id  latitude  longitude                   country
    0   0  31.23333   34.78333                    Israel
    1   1       NaN        NaN
    2   2  42.70704  -71.16311  United States of America
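
On the SSL error itself: `self signed certificate in certificate chain` typically shows up when HTTPS traffic passes through a proxy or antivirus that re-signs it with its own root certificate, which Python does not trust by default. A minimal sketch of one way to handle that, assuming you can export that root certificate as a PEM file: build an `ssl.SSLContext` that trusts it and hand it to geopy through the `ssl_context` argument of `Nominatim`. The helper name `build_ssl_context` and the file path in the comments are hypothetical.

```python
import ssl


def build_ssl_context(ca_bundle_path=None):
    """Return an SSLContext that still verifies certificates.

    ca_bundle_path is a hypothetical path to a PEM file holding the
    extra root CA (for example, a corporate proxy's certificate).
    """
    # Starts from the system trust store with verification enabled
    ctx = ssl.create_default_context()
    if ca_bundle_path is not None:
        # Trust the additional CA on top of the system defaults
        ctx.load_verify_locations(cafile=ca_bundle_path)
    return ctx


ctx = build_ssl_context()

# With geopy installed, the context can then be passed to the geocoder
# (the path below is an assumption, replace it with your own PEM file):
#
#   from geopy.geocoders import Nominatim
#   ctx = build_ssl_context("/path/to/proxy_root_ca.pem")
#   geolocator = Nominatim(user_agent="specify_your_app_name_here",
#                          ssl_context=ctx)
```

Disabling verification altogether would also silence the error, but it removes the protection TLS provides, so trusting the specific CA is usually the safer choice.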
KostyaEsmukov