I am using Python 2.6.6 on CentOS 6.
I load a DataFrame from a pickle file and then want to calculate the distance between 2 points in each row. I tried combining the lat and long for each point into a tuple and passing the tuples to geopy's great_circle. However, the traceback includes this:

/opt/rh/python27/root/usr/lib/python2.7/site-packages/geopy/point.pyc in __new__(cls, latitude, longitude, altitude)
127                     )
128                 else:
--> 129                     return cls.from_sequence(seq)
130 
131         latitude = float(latitude or 0.0)

/opt/rh/python27/root/usr/lib/python2.7/site-packages/geopy/point.pyc in from_sequence(cls, seq)
351         """
352         args = tuple(islice(seq, 4))
--> 353         return cls(*args)
354 
355     @classmethod

TypeError: __new__() takes at most 4 arguments (5 given)

My input comes from a pandas DataFrame whose columns should all be the same length (if that matters).

import pickle

import numpy as np
from geopy.distance import vincenty
import geopy
import pandas as pd

distances_frame = pickle.load(open("distances.p", "rb"))
samp = distances_frame.sample(n=50)
samp = samp.dropna()
point1 = tuple(zip(samp['biz_lat'], samp['biz_lon']))
point2 = tuple(zip(samp['id_lat'], samp['id_lon']))
dist = vincenty(point1, point2).miles
Lafexlos
PR102012
  • You should provide more info, print the point1 and point2 tuples. – xtrinch Jan 05 '16 at 21:13
  • The types for `point1` & `point2` produces ... Printing their first 3 values gives `((41.681092491999998, -87.680967924000001), (41.909741670999999, -87.745637481000003), (41.910972818000005, -87.640785577000003))`. All 4 columns were `float64` in the pandas DataFrame before converting. – PR102012 Jan 05 '16 at 21:20
  • But don't you want (for example) (41.681092491999998, -87.680967924000001) to represent a point? What you have is three points inside a tuple. – xtrinch Jan 05 '16 at 21:26
  • I printed 3 lines for that. For instance, `print point1[:1], point2[:1]` gives me `((41.681092491999998, -87.680967924000001),) ((41.934428855871488, -87.716208209746739),)` – PR102012 Jan 05 '16 at 21:33
  • I can't find any info anywhere that says you could put tuples like that into the vincenty function. What are you expecting as output? A list of distances? – xtrinch Jan 05 '16 at 21:38
  • Yes. I had tried the `great_circle` function as well. I assumed that, since I have 2 separate lat and lon columns per point (4 columns total), I should create a lat/lon tuple for each point in order to find the distance between those 2 geographic coordinates. – PR102012 Jan 05 '16 at 21:41
  • IIUC I think you want: `samp.apply(lambda x: vincenty((x['biz_lat'],x['biz_lon']), (x['id_lat'], x['id_lon'])).miles, axis=1)` – EdChum Jan 05 '16 at 21:54
  • You are correct, I posted the answer. I can delete mine if you add yours. – PR102012 Jan 06 '16 at 15:38

1 Answer

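EDIT: EdChum has the correct answer in the comments above. `vincenty` expects a single (lat, lon) point for each argument, so the distance has to be computed row by row rather than on tuples that hold every row at once: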

samp.apply(lambda x: vincenty((x['biz_lat'], x['biz_lon']), (x['id_lat'], x['id_lon'])).miles, axis=1)
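For anyone who wants to see the row-by-row pairing without pandas or geopy installed, here is a minimal sketch of the same idea. It swaps in a plain haversine (great-circle) formula instead of geopy's `vincenty`, so the numbers will differ slightly from what geopy returns. The function name `haversine_miles`, the Earth radius constant, and the sample lists are all assumptions made up for illustration; the lists only stand in for the four DataFrame columns.

```python
import math

# Mean Earth radius in miles; an assumed constant for this sketch.
EARTH_RADIUS_MILES = 3958.8


def haversine_miles(point_a, point_b):
    """Great-circle distance in miles between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, point_a)
    lat2, lon2 = map(math.radians, point_b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Illustrative values standing in for samp['biz_lat'], samp['biz_lon'],
# samp['id_lat'] and samp['id_lon'].
biz_lat = [41.681092492, 41.909741671]
biz_lon = [-87.680967924, -87.745637481]
id_lat = [41.934428856, 41.909741671]
id_lon = [-87.716208210, -87.745637481]

# One distance per row: each call receives exactly one point per argument.
distances = [
    haversine_miles((a_lat, a_lon), (b_lat, b_lon))
    for a_lat, a_lon, b_lat, b_lon in zip(biz_lat, biz_lon, id_lat, id_lon)
]
```

The key point is the same as in the `apply` call: each distance call gets one (lat, lon) pair per argument, and the loop (or `axis=1`) walks over the rows.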
PR102012