
I want to calculate the distance between two coordinate points, (lat1, long1) and (lat2, long2), for each row of the data frame below.

`name_x  rnc_x    lat1      long1     scrambling_code  name_y   rnc_y    lat2       long2
UI11481  MURNC09  72.82584  19.01234  121              UI11481  MURNC09  72.82584   19.01234
UI11481  MURNC09  72.82584  19.01234  121              UI37616  MURNC09  72.8282    19.01753
UI11481  MURNC09  72.82584  19.01234  121              UM13167  MURNC04  72.85002   19.09671
UI11481  MURNC09  72.82584  19.01234  121              UM12606  MURNC12  72.8563    19.18566
UI11481  MURNC09  72.82584  19.01234  121              UI17997  MURNC01  72.82161   18.92689
UI11481  MURNC09  72.82584  19.01234  121              UM36021  MURNC07  72.8816    19.1771
UI11481  MURNC09  72.82584  19.01234  121              UM30099  MURNC12  72.871     19.2173
UI11481  MURNC09  72.82584  19.01234  121              UM2411   MURNC17  72.8599    19.2498
UI11481  MURNC09  72.82584  19.01234  121              UM41377  MURNC22  72.8531    19.0142
UI11481  MURNC09  72.82584  19.01234  121              UM35501  MURNC08  72.8538    19.3042
UI11481  MURNC09  72.82584  19.01234  121              UM6086   MURNC15  72.8046    18.9728
UI11481  MURNC09  72.82584  19.01234  121              UI28816  MURNC14  72.821753  19.060517
`

Joel
DHANANJAY CHAUBEY
  • [How to create a Minimal, Complete, and Verifiable example](https://stackoverflow.com/help/mcve). – Lev Zakharov Aug 10 '18 at 21:12
  • Welcome to StackOverflow. Please read and follow the posting guidelines in the help documentation, as suggested when you created this account. [Minimal, complete, verifiable example](http://stackoverflow.com/help/mcve) applies here. We cannot effectively help you until you post your MCVE code and accurately describe the problem. We should be able to paste your posted code into a text file and reproduce the problem you described. – Prune Aug 10 '18 at 21:13
  • I have modified the exact question. – DHANANJAY CHAUBEY Aug 10 '18 at 23:01

1 Answer


If I understand you correctly, here are two possible solutions, both of which compute the plain Euclidean distance between the coordinate pairs:

  1. df['distance'] = np.sqrt((df.lat2 - df.lat1) ** 2 + (df.lon2 - df.lon1) ** 2)
  2. df['distance'] = np.linalg.norm(df[["lat1", "lon1"]].values - df[["lat2", "lon2"]].values, axis=1)

Edit: Thanks for your comment. I misunderstood your question.

To solve this, I first need a function that computes the great-circle (spherical) distance between two points. For this purpose I used the haversine formula:

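from math import sin, cos, sqrt, atan2, radians

def calculate_distance(lat1, lon1, lat2, lon2):
    """Return the haversine distance in km between two points given in degrees."""
    R = 6373.0  # approximate radius of the Earth in km

    # Convert all coordinates from degrees to radians
    lat1 = radians(lat1)
    lon1 = radians(lon1)
    lat2 = radians(lat2)
    lon2 = radians(lon2)

    # Differences in longitude and latitude
    dlon = lon2 - lon1
    dlat = lat2 - lat1

    # Haversine formula: a is the squared half-chord length, c the central angle
    a = sin(dlat / 2)**2 + cos(lat1) * cos(lat2) * sin(dlon / 2)**2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))

    return R * c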

For more information about this calculation please see link
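
Written out as equations, the function above corresponds to the haversine formula. The symbols φ (latitude) and λ (longitude), both in radians, are notation introduced here for illustration and do not appear in the code; R is the same approximate Earth radius of 6373 km:

```latex
\begin{aligned}
\Delta\varphi &= \varphi_2 - \varphi_1, \qquad \Delta\lambda = \lambda_2 - \lambda_1 \\
a &= \sin^2\!\left(\frac{\Delta\varphi}{2}\right)
     + \cos\varphi_1 \,\cos\varphi_2 \,\sin^2\!\left(\frac{\Delta\lambda}{2}\right) \\
c &= 2\,\operatorname{atan2}\!\left(\sqrt{a},\ \sqrt{1-a}\right) \\
d &= R\,c
\end{aligned}
```

Here `a`, `c`, and the returned value `R * c` map one-to-one onto the variables of `calculate_distance`.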

This gives the distance in kilometres between two points. I then applied it to every row:

df['distance'] = [calculate_distance(**df[['lat1', 'lon1', 'lat2', 'lon2']].iloc[i].to_dict()) for i in range(df.shape[0])]

This gave me this result:

       lat1  lat2  lon1  lon2    distance
    1    54    52    54    52  259.614032
    2    23    24    56    65  924.586291

Please try it :)
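
For large frames (millions of rows), calling a Python function once per row tends to be slow. One possible alternative, sketched below, is to apply the same haversine formula to whole columns at once with NumPy. The function name `haversine_vectorized` and the `radius_km` parameter are hypothetical names chosen for this sketch. The column names `lat1`, `lon1`, `lat2`, `lon2` follow the small test frame used above, so they would need to be adapted to `long1`/`long2` for the data in the question.

```python
import numpy as np
import pandas as pd


def haversine_vectorized(lat1, lon1, lat2, lon2, radius_km=6373.0):
    """Haversine distance in km for arrays (or scalars) of coordinates in degrees.

    Hypothetical helper for illustration; radius_km defaults to the same
    approximate Earth radius as calculate_distance above.
    """
    # Convert every input to a float array in radians
    lat1, lon1, lat2, lon2 = (
        np.radians(np.asarray(x, dtype=float)) for x in (lat1, lon1, lat2, lon2)
    )

    dlat = lat2 - lat1
    dlon = lon2 - lon1

    # Same formula as the scalar version, evaluated element-wise
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    # Guard against tiny floating-point overshoot outside [0, 1]
    a = np.clip(a, 0.0, 1.0)
    c = 2 * np.arctan2(np.sqrt(a), np.sqrt(1 - a))

    return radius_km * c


# Small test frame with the same values as the result shown above
df = pd.DataFrame(
    {"lat1": [54, 23], "lat2": [52, 24], "lon1": [54, 56], "lon2": [52, 65]}
)
df["distance"] = haversine_vectorized(df["lat1"], df["lon1"], df["lat2"], df["lon2"])
print(df)
```

Because the arithmetic runs on whole columns inside NumPy rather than in a Python-level loop, this usually scales much better. The actual speed-up depends on the data and the machine and has not been measured here.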

Dawid_Sielski
  • The distance value comes out wrong. These are positions on the Earth, so the distance can't be calculated as sqrt((x2-x1)^2 + (y2-y1)^2); it has to be the spherical distance. – DHANANJAY CHAUBEY Aug 15 '18 at 15:13
  • Thanks @Dawid_Sielski, the solution works, but it runs very slowly. I have 5 million entries like this. Is there any solution that would be faster? – DHANANJAY CHAUBEY Aug 15 '18 at 21:01