I would like to use structured queries to do geocoding in GeoPy, and I would like to run this on a large number of observations. I don't know how to do these queries using a pandas dataframe (or something that can be easily transformed to and from a pandas dataframe).

First, some setup:

import pandas
from geopy.extra.rate_limiter import RateLimiter
from geopy.geocoders import Nominatim
Ngeolocator = Nominatim(user_agent="myGeocoder")
Ngeocode = RateLimiter(Ngeolocator.geocode, min_delay_seconds=1)

df = pandas.DataFrame(["Bob", "Joe", "Ed"])
df["CLEANtown"] = ['Harmony', 'Fargo', '']
df["CLEANcounty"] = ['', '', 'Traill']
df["CLEANstate"] = ['Minnesota', 'North Dakota', 'North Dakota']
df["full"]=['Harmony, Minnesota','Fargo, North Dakota','Traill County, North Dakota']
df.columns = ["name"] + list(df.columns[1:])

I know how to run a structured query on a single location by providing a dictionary. I.e.:

q={'city':'Harmony', 'county':'', 'state':'Minnesota'}
testN=Ngeocode(q,addressdetails=True)

And I know how to geocode from the dataframe simply using a single column populated with strings. I.e.:

df['easycode'] = df['full'].apply(lambda x: Ngeocode(x, language='en',addressdetails=True).raw)

But how do I turn the columns CLEANtown, CLEANcounty, and CLEANstate into dictionaries row by row, use those dictionaries as structured queries, and put the results back into the pandas dataframe?

Thank you!

user2905280

1 Answer

One way is to use the apply method of the DataFrame instead of a Series. With axis=1, this passes each whole row to the lambda, so you can build the query dictionary from several columns at once. Example:

df["easycode"] = df.apply(
    lambda row: Ngeocode(
        {
            "city": row["CLEANtown"],
            "county": row["CLEANcounty"],
            "state": row["CLEANstate"],
        },
        language="en",
        addressdetails=True,
    ).raw,
    axis=1,
)
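
One caveat with the snippet above: the geocode call returns None when nothing matches, so calling .raw on it directly raises AttributeError for that row. Here is a minimal sketch of two helpers that guard against this and also drop empty fields from the query. The names build_query and geocode_raw are hypothetical, not part of GeoPy, and dropping empty fields is a precaution I am assuming is harmless rather than something Nominatim requires.

```python
# Hypothetical helpers (not part of GeoPy): build a structured query from a
# row and return the raw result, tolerating empty fields and missing matches.

DEFAULT_MAPPING = {
    "city": "CLEANtown",
    "county": "CLEANcounty",
    "state": "CLEANstate",
}


def build_query(row, mapping=None):
    """Return a structured-query dict, skipping empty or non-string values."""
    if mapping is None:
        mapping = DEFAULT_MAPPING
    query = {}
    for key, column in mapping.items():
        value = row[column]
        # NaN and None are not strings, so they are skipped here as well.
        if isinstance(value, str) and value.strip():
            query[key] = value.strip()
    return query


def geocode_raw(geocode, query, **kwargs):
    """Call geocode(query, **kwargs); return .raw, or None if nothing matched."""
    if not query:
        return None  # nothing to look up
    location = geocode(query, **kwargs)
    return location.raw if location is not None else None
```

With the setup from the question, this would be used as df["easycode"] = df.apply(lambda row: geocode_raw(Ngeocode, build_query(row), language="en", addressdetails=True), axis=1). Rows with no match then hold None instead of aborting the whole run.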

Similarly, if you wanted to build a single column of dictionaries first and then geocode that column, you could do the following (note that this overwrites the existing full column):

df["full"] = df.apply(
    lambda row: {
        "city": row["CLEANtown"],
        "county": row["CLEANcounty"],
        "state": row["CLEANstate"],
    },
    axis=1,
)
df["easycode"] = df["full"].apply(
    lambda x: Ngeocode(
        x,
        language="en",
        addressdetails=True,
    ).raw
)
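
Either way, easycode ends up holding one raw dictionary per row. To get individual values back into ordinary DataFrame columns, something like the sketch below could work. The function name extract_fields is hypothetical, and it assumes the raw result carries lat, lon and display_name keys with coordinates stored as strings, which is what I would expect from Nominatim's JSON output.

```python
# Hypothetical helper: pull a few fields out of a Nominatim raw result dict.
# Assumes "lat" and "lon" are strings and "display_name" is present.

def extract_fields(raw):
    """Return lat/lon as floats plus display_name; all None if raw is empty."""
    if not raw:
        return {"lat": None, "lon": None, "display_name": None}
    lat = raw.get("lat")
    lon = raw.get("lon")
    return {
        "lat": float(lat) if lat is not None else None,
        "lon": float(lon) if lon is not None else None,
        "display_name": raw.get("display_name"),
    }
```

For example, df["lat"] = df["easycode"].apply(lambda r: extract_fields(r)["lat"]) would add a numeric latitude column, and the same pattern applies to lon and display_name.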
KostyaEsmukov