I have about 200,000,000 sequences of coordinates. One of them looks like this:

[[30.654508, 104.086337], [30.654878999999998, 104.085479], [30.655219, 104.084691], [30.655027, 104.084949], [30.654555, 104.086073], [30.654078000000002, 104.08710500000001], [30.653726000000002, 104.087848], [30.653376, 104.088552], [30.653190000000002, 104.088975]]

Now I want to calculate the sum of the haversine distances between consecutive points in a sequence, i.e. the total length of the trajectory.

I use this code to calculate the haversine distance between two points:

import math

def cal_haversine(point1, point2):
    # Each point is a [latitude, longitude] pair in degrees
    lat1, lon1 = point1
    lat2, lon2 = point2
    R = 6371000  # mean Earth radius in metres
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    Dphi = phi2 - phi1
    Dlambda = math.radians(lon2 - lon1)

    a = math.sin(Dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(Dlambda / 2) ** 2
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    d = R * c / 1000.0  # metres to kilometres
    return d

Since there are so many sequences, looping over all of them to calculate the distances may be very slow.

I guess there may be a faster way to do this kind of calculation, such as a vectorized method. Is there a better way?
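To make the question concrete, here is a minimal sketch of what a vectorized version might look like with NumPy. It assumes each point is a [latitude, longitude] pair in degrees, as in the sample sequence above. The names `segment_lengths_km`, `trajectory_length_km` and `trajectory_lengths_km` are hypothetical helpers made up for this illustration, and whether this is fast enough for 200,000,000 sequences has not been measured.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, matching R = 6371000 m above


def segment_lengths_km(points_deg):
    """Haversine length (km) of each consecutive segment in an (N, 2) array
    of [lat, lon] rows given in degrees. Returns an array of length N - 1."""
    pts = np.radians(points_deg)
    lat, lon = pts[:, 0], pts[:, 1]
    dphi = np.diff(lat)
    dlambda = np.diff(lon)
    a = (np.sin(dphi / 2) ** 2
         + np.cos(lat[:-1]) * np.cos(lat[1:]) * np.sin(dlambda / 2) ** 2)
    return EARTH_RADIUS_KM * 2 * np.arctan2(np.sqrt(a), np.sqrt(1 - a))


def trajectory_length_km(points):
    """Total length (km) of one trajectory given as [[lat, lon], ...]."""
    arr = np.asarray(points, dtype=np.float64).reshape(-1, 2)
    return float(segment_lengths_km(arr).sum())


def trajectory_lengths_km(sequences):
    """Total length (km) of every trajectory in a list, in one pass.

    All points are stacked into one array, every consecutive distance is
    computed at once, and segments that would join two different
    trajectories are given zero weight before summing per trajectory."""
    counts = np.array([len(s) for s in sequences], dtype=np.int64)
    if counts.sum() < 2:
        return np.zeros(len(sequences))
    flat = np.concatenate(
        [np.asarray(s, dtype=np.float64).reshape(-1, 2) for s in sequences]
    )
    seq_id = np.repeat(np.arange(len(sequences)), counts)
    same_seq = seq_id[:-1] == seq_id[1:]  # False where a segment crosses trajectories
    seg = segment_lengths_km(flat)
    return np.bincount(seq_id[:-1], weights=seg * same_seq,
                       minlength=len(sequences))
```

With this many sequences, the batch function would presumably need to be called on chunks (say, a few hundred thousand trajectories at a time) so that the stacked array fits in memory.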

Mitch Wheat
jjdblast
