I have a dataframe that looks like this:

lat     lon      time
45.23   81.3459  2021-04-25 12:31:15
45.265  81.346   2021-04-25 12:33:46
45.278  81.347   2021-04-25 12:40:23
...     ...      ...

I have thousands of rows in the DataFrame. I want to find the distance of each point from the first point, (45.23, 81.3459), but I'm stumped on how to do it. I figure geodesic distance should do the trick, no?

Any suggestions would be great!

petezurich
  • Are you looking for the mathematical way to calculate the distance between two points on a sphere (the Earth)? https://en.wikipedia.org/wiki/Haversine_formula – SiP Nov 03 '22 at 22:47
  • More so the way to make it so that the formula uses the first coordinates as the reference point (i.e., calculates the distance of each subsequent coordinate point from that first one) – matrix_season Nov 04 '22 at 14:17

1 Answer

With the dataframe you provided:

import pandas as pd

df = pd.DataFrame(
    {
        "lat": [45.23, 45.265, 45.278],
        "lon": [81.3459, 81.346, 81.347],
        "time": ["2021-04-25 12:31:15", "2021-04-25 12:33:46", "2021-04-25 12:40:23"],
    }
)

Here is one way to do it using GeoPy distance:

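from geopy import distance

# Reference point: the first row of the dataframe
base = (df.loc[0, "lat"], df.loc[0, "lon"])

# Geodesic distance (in miles) from each row to the reference point
df["dist_from_base_in_miles"] = df.apply(
    lambda row: distance.distance((row["lat"], row["lon"]), base).miles,
    axis=1,
)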

Then:

print(df)
# Output
      lat      lon                 time  dist_from_base_in_miles
0  45.230  81.3459  2021-04-25 12:31:15                 0.000000
1  45.265  81.3460  2021-04-25 12:33:46                 2.417003
2  45.278  81.3470  2021-04-25 12:40:23                 3.315178
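
With thousands of rows, a row-by-row apply can get slow. If a spherical approximation is acceptable, a vectorized haversine in NumPy may be worth trying. The sketch below is not part of GeoPy: the helper name `haversine_miles`, the column name `haversine_from_base_in_miles`, and the mean Earth radius of 3958.8 miles are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Assumed mean Earth radius in miles (spherical approximation, not the WGS-84 ellipsoid)
EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat, lon, lat0, lon0):
    """Great-circle distance in miles from (lat0, lon0) to each (lat, lon)."""
    # Convert all inputs from degrees to radians
    lat, lon, lat0, lon0 = map(np.radians, (lat, lon, lat0, lon0))
    a = (
        np.sin((lat - lat0) / 2) ** 2
        + np.cos(lat0) * np.cos(lat) * np.sin((lon - lon0) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_MILES * np.arcsin(np.sqrt(a))


df = pd.DataFrame(
    {
        "lat": [45.23, 45.265, 45.278],
        "lon": [81.3459, 81.346, 81.347],
    }
)

# Use the first row as the reference point and compute every distance in one call
base = df.iloc[0]
df["haversine_from_base_in_miles"] = haversine_miles(
    df["lat"], df["lon"], base["lat"], base["lon"]
)
print(df)
```

Because haversine treats the Earth as a sphere, the values will differ slightly from GeoPy's geodesic result (typically by well under one percent), so this is only a good fit when that level of accuracy is enough.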
Laurent