I'm tackling an interesting programming question. We keep adding undirected edges to a graph, in a given order, until the graph (or a given subgraph) is connected, i.e. there is some path from each vertex in that subgraph to every other vertex in it. We stop as soon as it is connected.

For example, say we have vertices 1, 2, 3 and 4 and we want the subgraph 1, 2, 3 to be connected. The edges arrive as (3,4), then (2,3), then (1,4), then (1,3). We only need to add the first 3 edges for the subgraph to be connected (via the path 1-4-3-2), then we stop; edge (1,3) isn't needed.

Obviously I can run a BFS every time an edge is added to see if we can reach the required vertices, but with m edges we would potentially have to run BFS m times, which seems too slow. Are there any better options? Thanks.

TheLeogend

3 Answers

1

You should research the marvelous "Disjoint-set data structure" and the corresponding union-find algorithm. It can seem magical, but the time and space costs are tiny: O(α(n)) amortized time per operation and O(n) space, where α is the inverse Ackermann function.
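As a rough illustration of how union-find could also answer the "when do I stop" part, here is a minimal sketch. It assumes vertices are labeled 1..n and keeps, for each component root, a count of how many required vertices that component contains; the function name `first_connecting_edge` and the `hits` array are made up for this example.

```python
def first_connecting_edge(n, edges, targets):
    """Return the 1-based index of the edge after which every vertex in
    `targets` lies in one component, or None if that never happens."""
    targets = set(targets)
    if len(targets) <= 1:
        return 0  # nothing to connect

    parent = list(range(n + 1))  # assumption: vertices are labeled 1..n
    size = [1] * (n + 1)
    # hits[r]: number of target vertices in the component rooted at r
    hits = [1 if v in targets else 0 for v in range(n + 1)]

    def find(v):
        # Walk to the root, shortening the path as we go (path halving).
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for i, (a, b) in enumerate(edges, start=1):
        ra, rb = find(a), find(b)
        if ra == rb:
            continue  # edge joins vertices already in the same component
        if size[ra] < size[rb]:
            ra, rb = rb, ra  # union by size: attach the smaller tree
        parent[rb] = ra
        size[ra] += size[rb]
        hits[ra] += hits[rb]
        if hits[ra] == len(targets):
            return i  # all required vertices are now in one component
    return None


# The example from the question: the third edge is the one that connects 1, 2, 3.
print(first_connecting_edge(4, [(3, 4), (2, 3), (1, 4), (1, 3)], [1, 2, 3]))
```

Checking the count only at the moment of a merge means no extra `find` calls over the required vertices are needed.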

Doug Currie
  • Thanks. That was what I was thinking about. The issue is, how do I efficiently know exactly when all the vertices in my subgroup are connected. Do I run the find algorithm for each of the vertices and see if the parent is same for all? – TheLeogend Jan 18 '20 at 23:31
  • @TheLeogend Assuming you don't add unnecessary edges which connect vertices that are already in the same component, then the whole graph is connected if and only if you have added n - 1 edges. If you do add unnecessary edges, just maintain a counter of how many edges you've added which weren't unnecessary. – kaya3 Jan 18 '20 at 23:34
  • Sorry, but that is my problem in the first place. I have to add the edges in that specified order, or else the problem will be way easier. And, my example contradicts your statement (my subgraph has 3 vertices and I added 3 edges) – TheLeogend Jan 18 '20 at 23:43
  • If you have to add the edges in that specified order then you do have to add edges which are unnecessary for the graph to be connected. Only two of your edges are necessary, and the other one is unnecessary. You can tell when an edge is unnecessary because the two vertices are already in the same component (as reported by the `find` operation on your union-find data structure). If you count the number of edges added where the two vertices were in different components (i.e. the necessary ones), then the graph is connected if and only if that counter reaches n - 1. – kaya3 Jan 19 '20 at 00:04
  • Thanks. So which of my edges wasn't necessary? – TheLeogend Jan 19 '20 at 01:05
0

You can run BFS just once to find the connected components. Then, each time you add an edge between vertices of two different components, you can merge the components by reference. The complexity of this algorithm is then O(|V| + |E|).

Note that this method should be implemented with some reference techniques, especially for updating the component number of the vertices.
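For comparison, here is a sketch of a label-based merge. It is not the shared-reference idea described above but a different, swapped-in technique: every vertex stores its component number directly, and on a merge the smaller component is relabeled. That costs O(|V| log |V|) relabelings in total rather than constant time per merge. The sketch assumes the graph starts with no edges (so no initial BFS is needed) and that vertices are labeled 1..n; all names are made up for this example.

```python
def first_connecting_edge_by_labels(n, edges, targets):
    """Return the 1-based index of the edge after which all `targets`
    share one component label, or None if that never happens."""
    targets = set(targets)
    if len(targets) <= 1:
        return 0  # nothing to connect

    label = list(range(n + 1))             # label[v]: component number of v
    members = [[v] for v in range(n + 1)]  # members[c]: vertices labeled c
    # hits[c]: number of target vertices in component c
    hits = [1 if v in targets else 0 for v in range(n + 1)]

    for i, (a, b) in enumerate(edges, start=1):
        ca, cb = label[a], label[b]
        if ca == cb:
            continue  # already in the same component
        if len(members[ca]) < len(members[cb]):
            ca, cb = cb, ca  # always relabel the smaller component
        for v in members[cb]:
            label[v] = ca
        members[ca].extend(members[cb])
        members[cb] = []
        hits[ca] += hits[cb]
        if hits[ca] == len(targets):
            return i
    return None


print(first_connecting_edge_by_labels(4, [(3, 4), (2, 3), (1, 4), (1, 3)], [1, 2, 3]))
```

Because a vertex is only relabeled when its component at least doubles in size, no vertex is relabeled more than about log2(|V|) times.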

OmG
  • Thanks. What do you mean by reference techniques? I can easily take the union of two connected sets, say, using disjoint union sets. But how do I know when to stop, i.e. all the vertices in the subgraph are contained within a connected component? – TheLeogend Jan 18 '20 at 23:26
  • @TheLeogend I mean, all vertices of a component have a reference to the memory of the variable that stores the component number. Then, we can update the component number of the related vertices just by changing the value of the variable. – OmG Jan 18 '20 at 23:29
  • @OmG Updating all the vertices to point to the new component is significantly slower than the standard disjoint-set data structure Doug Currie mentioned. – Sneftel Jan 18 '20 at 23:35
  • @Sneftel No. It's not! I've explained it! It depends on the implementation technique. And it can be done very fast as I've mentioned. – OmG Jan 19 '20 at 01:25
  • It really is. I know it sounds like it's not, with the idea of updating the pointed-to set number rather than individually updating all the merged vertices... But once you've done that, not all members of the set reference the same memory location, so you can't do it in constant time again. Try it out on paper, recursively merging pairs, and you'll see. It would be nice if your idea worked, since it would give exactly the same capabilities as the disjoint-set data structure but be simpler and faster, but it's also theoretically impossible. – Sneftel Jan 19 '20 at 07:12
0

I would normally do this using a disjoint set structure, as Doug suggests. It would be like Kruskal's algorithm for finding the minimum spanning tree, except you process edges in the given order.

If you don't need a spanning tree as output, though, then you can do this with an incremental BFS or DFS:

  1. Pick any vertex, and find the vertices connected to it with BFS or DFS. Color these vertices red. If you start with no edges, of course, then there will be only one red vertex at this stage.
  2. As you add edges, don't do anything else until you add an edge that connects a red vertex to a non-red vertex. Then run BFS or DFS, excluding the new edge, to find all the new vertices that will connect to the red set. Color them all red.
  3. Stop when all vertices are red.

This is a little simpler in practice than using disjoint set, and takes O(|V|+|E|) time, since each vertex will be traversed by exactly one BFS/DFS search.

It does the work in chunks, though, so if you need each edge test to be fast individually, then disjoint set is better.
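The incremental search described in the steps above might look roughly like this. It is a sketch under a few assumptions: vertices are labeled 1..n, the graph starts with no edges, and, to match the question, it stops when a required subset `targets` is red rather than every vertex. The function name and the `remaining` counter are made up for this example.

```python
from collections import deque


def first_connecting_edge_bfs(n, edges, targets):
    """Return the 1-based index of the edge after which all `targets`
    are red (reachable from the start vertex), or None if never."""
    targets = set(targets)
    if len(targets) <= 1:
        return 0  # nothing to connect

    adj = [[] for _ in range(n + 1)]  # adjacency lists, built as edges arrive
    red = [False] * (n + 1)
    red[min(targets)] = True          # step 1: start from any required vertex
    remaining = len(targets) - 1      # required vertices that are not yet red

    for i, (a, b) in enumerate(edges, start=1):
        adj[a].append(b)
        adj[b].append(a)
        if red[a] == red[b]:
            continue  # both red or both non-red: nothing to do yet
        # Step 2: exactly one endpoint is red, so flood from the other one.
        src = b if red[a] else a
        red[src] = True
        queue = deque([src])
        while queue:
            v = queue.popleft()
            if v in targets:
                remaining -= 1
            for w in adj[v]:
                if not red[w]:
                    red[w] = True
                    queue.append(w)
        # Step 3: stop once every required vertex is red.
        if remaining == 0:
            return i
    return None


print(first_connecting_edge_bfs(4, [(3, 4), (2, 3), (1, 4), (1, 3)], [1, 2, 3]))
```

The new edge is stored before the search runs, but its red endpoint is skipped by the `red` check, so the search only expands into vertices that were not red before.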

Matt Timmermans