I have many points (latitudes and longitudes) on a plane (a city) and I want to split them into two clusters. Cluster 1 is the points clustered close together and Cluster 2 is everything else.
I know the definition of the problem is not exact. The only thing defined is that I need exactly 2 clusters. Out of N points, how many end up in cluster 1 or cluster 2 is not defined.
The main aim is to identify points which are very close to each other and separate them from the rest (which are more evenly spread out).
The best I can think of is the following algorithm:
1. For each point, calculate the sum of the squared distances to all other points.
2. Run k-means with k=2 on these per-point sums of squared distances.
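To make the two steps concrete, here is a minimal pure-Python sketch. It assumes the coordinates can be treated as planar (x, y) pairs. The function names `sum_sq_distances` and `two_means_1d` and the toy data are made up for illustration, and the 1-D 2-means is a plain Lloyd iteration rather than a library call.

```python
def sum_sq_distances(points):
    """Step 1: for each point, the sum of squared distances to all other points."""
    return [
        sum((x - u) ** 2 + (y - v) ** 2 for (u, v) in points)
        for (x, y) in points
    ]


def two_means_1d(values, max_iter=100):
    """Step 2: k-means with k=2 on scalar values (Lloyd's iteration).

    Returns one label per value: 0 for the group with the lower centre,
    1 for the group with the higher centre.
    """
    lo, hi = min(values), max(values)  # initialise the two centres at the extremes
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: each value goes to the nearer centre.
        labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        # Update step: each centre moves to the mean of its group.
        new_lo = sum(group0) / len(group0)
        new_hi = sum(group1) / len(group1) if group1 else hi
        if new_lo == lo and new_hi == hi:
            break  # centres stopped moving
        lo, hi = new_lo, new_hi
    return labels


# Toy data (illustrative only): four tightly packed points in the middle
# of a 10 x 10 "city", plus four spread-out points at its corners.
points = [(5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
          (0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]

scores = sum_sq_distances(points)
labels = two_means_1d(scores)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1]
```

Label 0 marks the points with the smaller sums, which is what I would call Cluster 1. The toy data puts the tight group in the middle of the city on purpose, which is exactly the case where this approach works.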
The squaring (or maybe even a higher power) of the distance should help by raising the dimensionality. However, this algorithm will be biased towards points near the center of the city, because a central point has a small sum of distances to everything else whether or not it sits in a dense group. It will struggle to find clusters at the edges of the city.
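Writing the score out seems to confirm the bias. With $q_1, \dots, q_N$ the points and $c = \frac{1}{N}\sum_i q_i$ their centroid (the symbols are my own notation), the sum of squared distances from a point $p$ is

$$
\sum_{i=1}^{N} \lVert p - q_i \rVert^2
= N\,\lVert p - c \rVert^2 + 2\,(p - c)\cdot\sum_{i=1}^{N}(c - q_i) + \sum_{i=1}^{N} \lVert q_i - c \rVert^2
= N\,\lVert p - c \rVert^2 + \sum_{i=1}^{N} \lVert q_i - c \rVert^2 ,
$$

because $\sum_i (c - q_i) = 0$. The last term is the same for every $p$, so with plain squared distances the score depends only on how far $p$ is from the centroid. If I read this correctly, k-means on these scores just separates "near the centroid" from "far from the centroid" and does not look at local density at all.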
Any suggestions on how to avoid this problem? And any other suggestions to improve this algorithm?