
I have 10^6 points in 2D. Every point has to be connected to at least one other point, and I want to minimize the sum of the distances between connected points. My algorithm is this:

  1. Make a dict 'visited' that marks whether a point has been visited.
  2. For each of the 10^6 points, find its closest point.
  3. Sort the array of 10^6 points by the distance to the closest point.
  4. Loop over all 10^6 points: if p1 is not visited, connect it to its closest point.
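
Steps 2 and 3 above can be sketched with SciPy's k-d tree. This is only a minimal sketch, not the code used in the question: the function name `nearest_neighbour_rows` and the row layout `[x1, y1, x2, y2, d]` are assumptions chosen to match the loop further down, and it assumes that all points are distinct.

```python
import numpy as np
from scipy.spatial import cKDTree


def nearest_neighbour_rows(points):
    """Return rows [x1, y1, x2, y2, d], sorted by ascending distance d.

    (x2, y2) is the closest other point to (x1, y1).
    Assumption: all points are distinct.
    """
    points = np.asarray(points, dtype=float)
    tree = cKDTree(points)
    # k=2 because the closest hit for every point is the point itself (d = 0).
    dist, idx = tree.query(points, k=2)
    nearest = points[idx[:, 1]]
    rows = np.column_stack([points, nearest, dist[:, 1]])
    # A stable sort keeps the input order among equal distances.
    return rows[np.argsort(rows[:, 4], kind="stable")]


rows = nearest_neighbour_rows([(0, 0), (0, 1), (0, 3), (0, 7)])
```

Sorting in descending order instead would only need the row order reversed, for example `rows[::-1]`.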

When I sort the array by distance in reverse order, I get a different answer. How is this possible, and how should I optimize this?

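from collections import defaultdict

visited = defaultdict(int)  # number of connections that touch each point
mid_point = {}              # midpoint of a connection -> its two endpoints

# arr[i] = [x1, y1, x2, y2]: a point and its closest point, sorted by distance
for i in range(len(arr)):
    x1, y1 = arr[i][0], arr[i][1]
    x2, y2 = arr[i][2], arr[i][3]

    # Connect p1 to its closest point only if p1 has no connection yet
    if (x1, y1) not in visited:
        visited[(x1, y1)] += 1
        visited[(x2, y2)] += 1

        xs = (x1 + x2) / 2
        ys = (y1 + y2) / 2

        mid_point[(xs, ys)] = [(x1, y1), (x2, y2)]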

I also calculate the area of the circle on each connection (π·r·r, where r is half the distance). That is not so important in this case, because r·r grows monotonically with r.
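
A small illustration of why the sort direction can change the result: the loop is greedy, so a point that gets marked as visited early (as somebody else's closest point) never adds its own connection later. The sketch below is a pure-Python restatement of that loop under stated assumptions: `greedy_connect` is a hypothetical helper, and the four points sit on the x axis at 0, 2, 3 and 5, with the nearest-neighbour rows written out by hand.

```python
def greedy_connect(rows):
    """rows: (p, q, d) tuples in processing order, q being the closest point to p."""
    visited = set()
    edges = []
    for p, q, d in rows:
        if p not in visited:  # same test as in the loop above: only p is checked
            visited.add(p)
            visited.add(q)
            edges.append((p, q, d))
    return edges


# Points on the x axis at x = 0, 2, 3, 5, each with its closest point.
rows = [
    ((0, 0), (2, 0), 2),
    ((2, 0), (3, 0), 1),
    ((3, 0), (2, 0), 1),
    ((5, 0), (3, 0), 2),
]

ascending = greedy_connect(sorted(rows, key=lambda r: r[2]))
descending = greedy_connect(sorted(rows, key=lambda r: r[2], reverse=True))

total_ascending = sum(d for _, _, d in ascending)    # 1 + 2 + 2 = 5
total_descending = sum(d for _, _, d in descending)  # 2 + 2 = 4
```

In ascending order the pair 2-3 is connected first, and 0 and 5 each still need their own connection. In descending order 0-2 and 5-3 are connected first, which already covers every point, so the total is smaller. Neither order is guaranteed to be optimal in general.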

  • what is suncobrani? – randomer64 Oct 24 '21 at 14:26
  • This is not going to add up to a correct MST algorithm. You should use SciPy to find the Delaunay triangulation and then compute its minimum spanning tree (irritatingly, while SciPy implements the hard parts of both of these algorithms, you'll need to write your own conversion from the Delaunay triangulation to a compressed sparse graph (csgraph)). – David Eisenstat Oct 24 '21 at 14:29
  • Ty David, that was very helpful! – Python Newbie Oct 24 '21 at 14:36
  • This won't result in a spanning tree. For example, consider four points on the line x=0 with y=0, 1, 100, and 101. Then your algorithm will produce 2 connected components: one joining the first two points and one joining the last two. – Alex Li Oct 25 '21 at 02:11
  • It doesn't have to be a connected spanning tree. The only requirement is that each point is connected to something; there can be islands, such as two isolated points joined to each other. – Python Newbie Oct 25 '21 at 09:26
  • What does the distribution of points look like? Maybe all you need is some clustering algorithm, and then for each point you find the minimum distance within its cluster (and optionally the surrounding clusters). – dankal444 Oct 25 '21 at 20:51
  • I used SciPy and found the minimum-distance point for each point. But from there, I just connected each point to its closest one (if the point was not previously connected, first come first served, like in the code). – Python Newbie Oct 25 '21 at 21:09
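
The approach suggested in David Eisenstat's comment (Delaunay triangulation, then a minimum spanning tree over its edges) might look roughly like the sketch below. It is an illustration under assumptions, not a tested solution for the 10^6-point input: the function name `euclidean_mst` is made up here, the points are assumed to be distinct and not all collinear (otherwise the triangulation fails), and the conversion to a sparse graph is just one way to write the step the comment says is missing from SciPy. Note that it returns a single connected tree, which is a stronger requirement than the "islands are allowed" version described in the later comment.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial import Delaunay


def euclidean_mst(points):
    """Return (edges, lengths) of a minimum spanning tree of 2D points.

    edges is an (n - 1, 2) array of point indices.
    Assumption: points are distinct and not all collinear.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    simplices = Delaunay(points).simplices  # (m, 3) vertex indices per triangle

    # Collect the three edges of every triangle, then drop duplicates.
    edges = np.vstack([simplices[:, [0, 1]],
                       simplices[:, [1, 2]],
                       simplices[:, [0, 2]]])
    edges = np.unique(np.sort(edges, axis=1), axis=0)

    # Edge weights are Euclidean lengths.
    lengths = np.linalg.norm(points[edges[:, 0]] - points[edges[:, 1]], axis=1)

    # Conversion to a csgraph: a sparse n x n matrix of edge weights.
    graph = coo_matrix((lengths, (edges[:, 0], edges[:, 1])), shape=(n, n)).tocsr()

    mst = minimum_spanning_tree(graph).tocoo()
    return np.column_stack([mst.row, mst.col]), mst.data


mst_edges, mst_lengths = euclidean_mst([(0, 0), (4, 0), (0, 3), (5, 5), (1, 1)])
total_length = mst_lengths.sum()
```

The Delaunay triangulation has only a linear number of edges, so the spanning-tree step works on a sparse graph rather than on all pairs of points.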

0 Answers