3

Basically, my experimental program is trying to find the number of points that fall within a (e.g., 50km) radius of a valid point at a given time. My data is structured (but I can restructure if need-be) in three separate arrays such:

1_LAT,1_LON,1_TIM

Where 1_LAT,1_LON,1_TIM all contain roughly ~250 values corresponding to Latitude, Longitude (decimal degrees), and time respectively.

I have 20 sets of these arrays (i.e., 1_LAT,1_LON,1_TIM...20_LAT,20_LON,20_TIM).

Here is what I would like to accomplish:

1) Figure out the number of lat/lon sets that fall within a particular radius of each set. For example, how many points fall within a 50km radius of 1_LAT,1_LON at the valid time of 1_TIM from the other 19 sets of points. I would then like to iterate through each valid time to figure out the number of points in the valid radius at each specific point and valid time.

I have attached a picture below to help visually describe. sampleimage

The black squares represent all the points in the LAT_1/LON_1 arrays. The blue squares represent all the points in the LAT_n/LAT_n arrays.

I would like to count the number of points in each radius at each valid time for each set of lat/lon arrays. The final display would be a summed raster or meshgrid of the denisty (i.e., number of counts / 20) for each grid spot on a geographic basemap image.

I have a feeling that a KDEtree may be the best way to accomplish this, but I have little/no experience with such. Any ideas or suggestions would be greatly appreciated.

wuffwuff
  • 730
  • 2
  • 9
  • 19

1 Answers1

2

You would do something like the following... First, group your (x, y) coordinates for each group in a single points_x array:

points_1 = np.column_stack((LAT_1, LON_1))
...
points_n = np.column_stack((LAT_n, LON_n))

It may be a good idea to store them in a list of arrays:

points = [point_1, points_2, ..., points_n]

Now, make a kdTree out of each set of points:

import scipy.spatial as spsp
kdtrees = [spsp.cKdTree(p) for p in point]

And you are ready to go. If you now run the following code:

r = whatever_your_threshold_value_is
points_within_r = np.zeros((len(kdtrees), len(kdtrees)), dtype=np.int)
for j in xrange(len(kdtrees)):
    for k in xrange(j+1, len(kdtrees)):
        points_within_r[j, k] = kdtrees[j].count_neighbors(kdtrees[k], r, 2)
points_within_r = points_within_r + points_within_r.T

You should now find that points_within_r[j, k] holds how many points in points_j are within radius r of a point in points_k.

Keep in mind that distances here are the euclidean distance of the coordinates, disregarding the fact that what they measure are spherical angles.

Jaime
  • 65,696
  • 17
  • 124
  • 159
  • Since these values are lat/lon values, there must be a more accurate way to figure out if the points are within a radius. I've downloaded geopy (and can get more accurate measurements). I will try to modify the code to get such values, but I still need to get all of this to map in basemap. I may forego this solution and try something with the shapely library. – wuffwuff Jul 30 '13 at 22:23
  • If you are worried about lat/lon vs xyz KDTree will take any dimensional argument along the last axis. Just convert your points_n array from lat/long to xyz. – Daniel Jul 30 '13 at 22:40
  • But it won't measure distance along the surface of the sphere, but directly through the Earth, right? – Jaime Jul 30 '13 at 22:43
  • Somewhat- 50km radius is pretty small compared to the curvature of the earth. At 50km both points only need to be a few meters above the surface to avoid passing through the earth. Assuming its a perfect sphere and no trees... – Daniel Jul 30 '13 at 22:53
  • I'm wondering if there is an easier solution using the Shapely library buffer tool and then "summing"/overlaying the polygons into some sort of heat density map. – wuffwuff Jul 30 '13 at 22:56
  • This previous solution has an error as count_neighbors does not appear to be an attribute for kdtrees. Eithway, the solution ignores that I am only looking for the number of points that fall within a certain radius at a particular time (i.e., index value). I envision a plot that looks similar to this image of hurricane track density. http://www.atmos.albany.edu/facstaff/tang/tcguidance/ep902013/0_es1.png – wuffwuff Jul 30 '13 at 23:46
  • You may not have the latest release of Scipy, [it is there in 0.12](http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.cKDTree.count_neighbors.html), and I was simply trying to illustrate how `cKDTree` works: you get to have all the fun of segmenting your data and deciding what to compare to what. – Jaime Jul 31 '13 at 00:22
  • I wonder if the same effect could be created by using a KDE density filter? – wuffwuff Jul 31 '13 at 01:00
  • RuntimeWarning: invalid value encountered in maximum self.maxes = np.maximum(maxes,mins).astype(float) –  Feb 10 '18 at 12:52