I am trying to compute the geodesic distance between each row's coordinates and the coordinates in the previous row. Is there a way to do this without adding extra helper columns to the DataFrame?
Sample code:
import pandas
import geopy.distance
d = {'id_col': ['A', 'B', 'C', 'D'],
     'lat': [40.8397, 40.7664, 40.6845, 40.6078],
     'lon': [-104.9661, -104.999, -105.01, -105.003]}
df = pandas.DataFrame(data=d)
First approach, with lambda and apply:
df['geo_dist'] = df.apply(lambda x: geopy.distance.geodesic((x['lat'], x['lon']), (x['lat'].shift(), x['lon'].shift())), axis=1)
I would get the error: AttributeError: ("'float' object has no attribute 'shift'", u'occurred at index 0')
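As far as I can tell, the cause is that with axis=1 the lambda receives one row at a time, so x['lat'] is a single number rather than a column, and shift() only exists on a Series or DataFrame. A minimal sketch of that difference, using the same sample data (the names has_shift and prev_lat are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    'id_col': ['A', 'B', 'C', 'D'],
    'lat': [40.8397, 40.7664, 40.6845, 40.6078],
    'lon': [-104.9661, -104.999, -105.01, -105.003],
})

# Row-wise apply: each cell is a scalar, which has no shift() method.
has_shift = df.apply(lambda row: hasattr(row['lat'], 'shift'), axis=1)
print(has_shift.tolist())

# Column-wise: shift() works on the whole Series and moves values down one row,
# leaving NaN in the first position.
prev_lat = df['lat'].shift()
print(prev_lat.tolist())
```

So the shift has to happen on the column, outside the row-wise lambda.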
My second approach calls a function on the whole DataFrame:
def geodist(x):
    return geopy.distance.geodesic((x['lat'], x['lon']), (x['lat'].shift(), x['lon'].shift()))
df['geo_dist'] = geodist(df)
In this case I get the error: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Any help is greatly appreciated.