
I am calculating the distance between two positions on the globe from their coordinates. With only two locations, I get the expected result:

import math

def global_distance(location1, location2):
    # Each location is a (latitude, longitude) pair in degrees.
    lat1, lon1 = location1
    lat2, lon2 = location2
    radius = 6371  # radius of the Earth in km

    dlat = math.radians(lat2-lat1)
    dlon = math.radians(lon2-lon1)
    a = math.sin(dlat/2) * math.sin(dlat/2) + math.cos(math.radians(lat1)) \
        * math.cos(math.radians(lat2)) * math.sin(dlon/2) * math.sin(dlon/2)
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1-a))
    d = radius * c

    return d

lat1 = 55.32
lat2 = 54.276
long1 = -118.8634
long2 = -117.276

print(global_distance((lat1, long1), (lat2, long2)))

What if I want to calculate the distance between several locations? Assuming I have a CSV file containing three locations:

Location   Lat1      Long1      
 A         55.322    -117.17
 B         57.316    -117.456
 C         54.275    -116.567

How can I iterate over these two columns and produce the distances between (A, B), (A, C) and (B, C)?

nyi
user_01
  • First, how are you reading that CSV in? A list of lists/tuples, dicts, or objects, a 2D numpy array, a pandas DataFrame, …? – abarnert May 16 '18 at 22:41
  • I want to read it as a pandas DataFrame and iterate over all three rows, but any way of reading it is OK. I just do not know how to iterate over them correctly. – user_01 May 16 '18 at 22:44

2 Answers


Assuming you've read that CSV into some kind of sequence of sequences (e.g., list(csv.reader(f))), all you need to do is iterate over all combinations of locations. And that's exactly what itertools.combinations does:

>>> import itertools
>>> locs = [('A', 55.322, -117.17), ('B', 57.316, -117.456), ('C', 54.275, -116.567)]
>>> for (loc1, lat1, lon1), (loc2, lat2, lon2) in itertools.combinations(locs, 2):
...     print(loc1, loc2, global_distance((lat1, lon1), (lat2, lon2)))
A B 222.42244003744995
A C 122.66829007875741
B C 342.67144769115316

While you're in the itertools docs, also note combinations_with_replacement, permutations, and product, which are often the answers to similar but slightly different problems.
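To make the difference between those four functions concrete, here is a small sketch using only the location names. The variable names are just for illustration.

```python
import itertools

names = ["A", "B", "C"]

# Unordered pairs, no self-pairs: what the question asks for.
pairs = list(itertools.combinations(names, 2))
# Ordered pairs, no self-pairs: useful if A->B could differ from B->A.
ordered = list(itertools.permutations(names, 2))
# Unordered pairs that also include self-pairs such as ('A', 'A').
with_self = list(itertools.combinations_with_replacement(names, 2))
# Every ordered pair, self-pairs included: the full 3x3 grid.
grid = list(itertools.product(names, repeat=2))

for label, result in [("combinations", pairs), ("permutations", ordered),
                      ("combinations_with_replacement", with_self),
                      ("product", grid)]:
    print(label, len(result), result)
```

Since the haversine distance is symmetric and the distance from a point to itself is zero, combinations is the one that avoids redundant work here.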

This should be easy to adapt to a sequence of dicts, or a dict of Location instances, etc. If, on the other hand, you have something like a 2D numpy array or a pandas DataFrame, you may want to do something different. (Although from a quick search, it looks like just making an array out of the itertools combinations with fromiter isn't significantly slower than anything else, even if you want to trade off space for time to broadcast your global_distance function.)
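As a rough sketch of the sequence-of-dicts adaptation, the following reads the rows with csv.DictReader and stores each distance under its pair of names. It assumes a comma-separated file with the header Location,Lat1,Long1; the in-memory StringIO stands in for the real file, and global_distance is repeated from the question so the snippet runs on its own.

```python
import csv
import io
import itertools
import math

def global_distance(location1, location2):
    # Same haversine formula as in the question; result in km.
    lat1, lon1 = location1
    lat2, lon2 = location2
    radius = 6371
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dlat / 2) ** 2
         + math.cos(math.radians(lat1)) * math.cos(math.radians(lat2))
         * math.sin(dlon / 2) ** 2)
    return radius * 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))

# Assumption: stand-in for the real file, with a comma delimiter.
f = io.StringIO(
    "Location,Lat1,Long1\n"
    "A,55.322,-117.17\n"
    "B,57.316,-117.456\n"
    "C,54.275,-116.567\n"
)
rows = list(csv.DictReader(f))  # each row is a dict of strings

distances = {}
for r1, r2 in itertools.combinations(rows, 2):
    p1 = (float(r1["Lat1"]), float(r1["Long1"]))
    p2 = (float(r2["Lat1"]), float(r2["Long1"]))
    distances[r1["Location"], r2["Location"]] = global_distance(p1, p2)

for (name1, name2), d in distances.items():
    print(name1, name2, round(d, 2))
```

Keying the dict by the name pair makes later lookups such as distances["A", "C"] straightforward; note that csv yields strings, so the float conversion is needed.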

abarnert

I'd import the data from your file via pandas:

import pandas as pd
df = pd.read_table(filename, sep=r'\s+', index_col=0)

Additionally, you can import itertools:

import itertools as it

With this you can get all combinations of an iterable, shown here with the indices of the dataframe as an example:

for i in it.combinations(df.index, 2): print(i)
('A', 'B')
('A', 'C')
('B', 'C')

This shows that you get the combinations you want. Now do the same with the data of your dataframe:

for i in it.combinations(df.values, 2): print(global_distance(i[0], i[1]))
222.4224400374507
122.66829007875636
342.671447691153

And if you want the location names included in the output, you can leave out index_col=0 when importing, so that A, B and C are also part of df.values, and write:

for i in it.combinations(df.values, 2): print(i[0][0], '-', i[1][0], global_distance(i[0][1:], i[1][1:]))
A - B 222.4224400374507
A - C 122.66829007875636
B - C 342.671447691153
SpghttCd