2

I have a set of latitude/longitude points for a city. I want to cluster these points using a 500 m or 1 km radius in R. Specifically, I want to find the centroids as well as all the points within a 500 m radius of each centroid.

PS:

1. I have used k-means, but I can't fix the radius in k-means.
2. I tried the leaderCluster package in R. After mapping clusters to points and computing each point's distance from its centroid, I found that many points are assigned to a cluster even though they lie farther from the centroid than the specified radius.

My question is exactly like the one at this link: https://gis.stackexchange.com/questions/146701/convert-eps-to-geographic-distance-using-dbscan. I am looking for an R solution.

Please suggest a nice way to cluster these points based on radius.

Thanks in Advance

Swetha K V

2 Answers

0

Use hierarchical clustering.

With maximum (complete) linkage, cutting the dendrogram at the desired height ensures that no two points in the same cluster are farther apart than that height.

With centroid linkage, the distance from the cluster center should be bounded, but this may only hold for Euclidean distances.
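As a rough illustration of the complete-linkage idea above, here is a minimal base-R sketch. It assumes a small data set (the full distance matrix needs O(n²) memory), a mean Earth radius of 6371 km, and made-up sample coordinates; `haversine_m` is a helper written for this example, not a function from any package. Cutting at 500 bounds the distance between any two members of a cluster by 500 m, which is a bound on cluster diameter rather than an exact radius around the centroid.

```r
# Haversine (great-circle) distance in metres.
# Inputs are decimal degrees; r is an assumed mean Earth radius.
haversine_m <- function(lat1, lon1, lat2, lon2, r = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Hypothetical sample data: two tight groups roughly 2 km apart.
pts <- data.frame(
  lat = c(12.9716, 12.9718, 12.9720, 12.9900, 12.9902),
  lon = c(77.5946, 77.5948, 77.5950, 77.5946, 77.5949)
)

# Full pairwise distance matrix (small inputs only).
n <- nrow(pts)
d <- outer(seq_len(n), seq_len(n), function(i, j)
  haversine_m(pts$lat[i], pts$lon[i], pts$lat[j], pts$lon[j]))

# Complete linkage: the merge height is the largest pairwise distance in the
# merged cluster, so cutting at 500 keeps every pair within 500 m.
hc <- hclust(as.dist(d), method = "complete")
pts$cluster <- cutree(hc, h = 500)

# Simple centroid per cluster: mean of lat and lon.
centroids <- aggregate(cbind(lat, lon) ~ cluster, data = pts, FUN = mean)
print(pts)
print(centroids)
```

For inputs in the range of 100,000 points or more, the distance matrix no longer fits in memory, so this sketch only suits samples or pre-binned data.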

Has QUIT--Anony-Mousse
  • I am new to R. Can you please tell me how to do it? I have latitude and longitude points, so I am guessing that Euclidean distance would be the wrong measure for limiting distance. – Swetha K V Feb 23 '16 at 06:26
  • I don't use R much; it can be very slow. You need the Haversine distance. – Has QUIT--Anony-Mousse Feb 23 '16 at 07:27
  • I am also trying to use DBSCAN, which needs a distance matrix as input. Is there a way to give spatial points as input instead? – Swetha K V Feb 23 '16 at 09:42
  • Yes. When I increase the number of data points to, say, 1 lakh (100,000), it has to form a 1 lakh × 1 lakh matrix, which was approx. 28 GB, and this fails because of memory constraints. – Swetha K V Feb 23 '16 at 12:46
  • Yes, a distance matrix may be too expensive. The only DBSCAN version I know that does support Haversine and indexes for acceleration is the one in ELKI, not R. It's very fast. – Has QUIT--Anony-Mousse Feb 23 '16 at 13:18
  • I do not know Java. Can you redirect me to any page where I can get complete code for clustering geo points using ELKI? – Swetha K V Feb 24 '16 at 08:09
  • You don't need to write Java. It has a minimalistic UI that is enough to do the clustering. – Has QUIT--Anony-Mousse Feb 24 '16 at 09:28
0

You can draw circles around the points (see https://gis.stackexchange.com/questions/121489/1km-circles-around-lat-long-points-in-many-places-in-world) and then merge them with gUnion. The resulting polygons are the clusters of points that lie close to each other. A simple way to get each centroid is to take the mean of the lat and lon of the points belonging to each new polygon.
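Two circles of the same radius overlap exactly when their centres are no more than twice that radius apart, so merging overlapping circles groups points the same way as single-linkage clustering cut at twice the radius. The base-R sketch below uses that equivalent instead of the sp/gUnion route itself; the coordinates are made up, `hav_m` is a helper defined only for this example, and a mean Earth radius of 6371 km is assumed.

```r
# Haversine distance in metres (decimal degrees in; Earth radius assumed).
hav_m <- function(lat1, lon1, lat2, lon2, r = 6371000) {
  p <- pi / 180
  a <- sin((lat2 - lat1) * p / 2)^2 +
    cos(lat1 * p) * cos(lat2 * p) * sin((lon2 - lon1) * p / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Hypothetical points: a chain of three about 800 m apart, plus one far away.
pts <- data.frame(
  lat = c(12.9700, 12.9772, 12.9844, 13.0500),
  lon = c(77.5900, 77.5900, 77.5900, 77.5900)
)
radius <- 500  # circle radius in metres

n <- nrow(pts)
d <- outer(seq_len(n), seq_len(n), function(i, j)
  hav_m(pts$lat[i], pts$lon[i], pts$lat[j], pts$lon[j]))

# Circles of radius `radius` overlap when their centres are at most
# 2 * radius apart; single linkage cut at that height merges such chains.
hc <- hclust(as.dist(d), method = "single")
pts$cluster <- cutree(hc, h = 2 * radius)

# Centroid of each merged group: mean of lat and lon, as suggested above.
centroids <- aggregate(cbind(lat, lon) ~ cluster, data = pts, FUN = mean)
print(pts)
print(centroids)
```

In this sample the first and third points are about 1.6 km apart yet end up in the same group, because overlapping circles chain together. Merged circles therefore identify dense connected areas but do not by themselves cap the distance from the centroid.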

Chris
  • I would be doing this on a huge data set. I have approximately 2 lakh (200,000) location points for a city. By clustering, I want to see which cluster has the maximum density within a 500 m radius. I want the algorithm to pick centroids in such a way that each cluster contains the maximum number of points within that range. – Swetha K V Feb 23 '16 at 06:29
  • I do something similar, with hundreds of thousands of points clustered all over the world. I found that clustering algorithms based on distances took too long (because 500 m in lat/lon is not the same everywhere and needs to be calculated). Converting the data into points and polygons of the library(sp) classes and then using their functions proved to be the fastest in my case. Hierarchical clustering can also be tricky if you have two small clusters close together and a big one further away. But it can be done, of course. – Chris Feb 23 '16 at 12:25
  • Maybe this can help: . I answered it with R-code. – Chris Feb 23 '16 at 16:39