I am using geopy's distance.distance function to calculate the distance between consecutive latitude/longitude points in a GPX file that looks like this:

    lat        lon          alt     time
0   44.565335  -123.312517  85.314  2020-09-07 14:00:01
1   44.565336  -123.312528  85.311  2020-09-07 14:00:02
2   44.565335  -123.312551  85.302  2020-09-07 14:00:03
3   44.565332  -123.312591  85.287  2020-09-07 14:00:04
4   44.565331  -123.312637  85.270  2020-09-07 14:00:05

The code below creates new lat and lon columns shifted down by one row, so that apply can compute the distance between each point and the previous one. This works, but I am wondering if there is a way to do it without creating additional columns for the shifted data.

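from geopy import distance

def calcDistance(row):
    # Distance in miles between the previous point and the current point
    return distance.distance((row.lat_shift, row.lon_shift), (row.lat, row.lon)).miles

# Shift lat/lon down one row; the first row has no predecessor, so fill it
# with its own coordinates (distance 0). This avoids chained assignment.
GPS_df['lat_shift'] = GPS_df['lat'].shift().fillna(GPS_df['lat'])
GPS_df['lon_shift'] = GPS_df['lon'].shift().fillna(GPS_df['lon'])
GPS_df['dist'] = GPS_df.apply(calcDistance, axis=1)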
tbone

1 Answer

The code is already reasonably efficient. If the extra columns are the concern, one option is to drop them once you have the distance:

GPS_df = GPS_df.drop(columns=['lat_shift', 'lon_shift'])
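
If the goal is to avoid the helper columns altogether, one possibility is to pair each point with its predecessor directly using zip. The following is only a minimal sketch: consecutive_distances and haversine_miles are hypothetical helper names, and the haversine formula is used as a dependency-free stand-in for geopy's geodesic distance, so its results will differ slightly from geopy's.

```python
import math

def haversine_miles(p, q):
    """Great-circle distance in miles between two (lat, lon) pairs in degrees.

    Stand-in for geopy's distance so this sketch has no third-party dependencies.
    """
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 3958.8 * math.asin(math.sqrt(a))  # 3958.8 ~ mean Earth radius in miles

def consecutive_distances(lats, lons, dist_fn=haversine_miles):
    """Distance from each point to the previous one; the first entry is 0.0."""
    points = list(zip(lats, lons))
    if not points:
        return []
    # Pair every point with its predecessor instead of storing shifted columns.
    return [0.0] + [dist_fn(prev, cur) for prev, cur in zip(points, points[1:])]

# First three points from the question's sample data
lats = [44.565335, 44.565336, 44.565335]
lons = [-123.312517, -123.312528, -123.312551]
dists = consecutive_distances(lats, lons)
```

With the DataFrame from the question, the same helper could presumably be called as GPS_df['dist'] = consecutive_distances(GPS_df['lat'], GPS_df['lon'], dist_fn=lambda a, b: distance.distance(a, b).miles), assuming geopy's distance module has been imported. The result is a plain list with one entry per row, so it is assigned by position and no shifted columns are ever added to the frame.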
Akanksha Atrey
  • Thanks. I started by iterating through the dataframe, which was slow. Then I tried to use shift() inside the function definition and soon learned that it does not work that way, which is how I landed on this solution. – tbone Sep 28 '20 at 20:16