As you say, adding new edges and nodes is still efficient thanks to the union operation. The problematic case is edge removal (removing a node amounts to removing all of its edges).
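For reference, the cheap insertion path could look like the minimal union-find sketch below. The class name `DisjointSet` and its method names are made up for illustration, and union by size with path halving is just one common choice, not necessarily what your structure uses.

```python
class DisjointSet:
    """Minimal union-find sketch (hypothetical names, for illustration only)."""

    def __init__(self):
        self.parent = {}  # node -> parent node (a root is its own parent)
        self.size = {}    # root -> number of nodes in its tree

    def add(self, x):
        # Adding a node creates a singleton set.
        if x not in self.parent:
            self.parent[x] = x
            self.size[x] = 1

    def find(self, x):
        # Walk up to the root, halving the path along the way.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        # Adding the edge (a, b) merges the two sets, smaller under larger.
        self.add(a)
        self.add(b)
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return ra
```

Nothing in this structure records which edges were added, which is why a removal needs the graph itself and a search over it.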
When you remove an edge, there are two possible outcomes: the connected component remains the same, or a new one is created (the edge was joining two subsets that now are disconnected). That already gives us some good news: we only have to worry about the subset where the edge belonged.
So, after you remove the edge (A,B), start two DFS, one from A and the other from B, and interleave them.
- If either search finds the other node, the structure remains the same, because the subset is still connected.
- If one of the searches finishes without finding the other node, you have split the set into two subsets, and the search that finished has just enumerated the smaller one. In that case, you can create a new tree for the subset that doesn't contain the leader of the original set (by the end of the search, you should know which one that is).
Here is some Python-like pseudocode for this algorithm:
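def delete_edge(A, B):
    # step_DFS(X) is assumed to return the next node visited by the search
    # started at X (X itself first), or None once that search is exhausted.
    head = find_head(A)  # or B, it's the same
    who_contains_head = None  # set once one of the searches reaches the head
    start_DFS(A)
    start_DFS(B)
    while True:
        found_from_A = step_DFS(A)
        found_from_B = step_DFS(B)
        if found_from_A == B or found_from_B == A:
            return  # subset is still connected
        if found_from_A == head:
            who_contains_head = A
        if found_from_B == head:
            who_contains_head = B
        if found_from_A is None:
            # A finished: its side is now a separate subset
            if who_contains_head == A:
                spawn_new_tree(B)
            else:
                spawn_new_tree(A)
            return
        if found_from_B is None:
            # B finished: its side is now a separate subset
            if who_contains_head == B:
                spawn_new_tree(A)
            else:
                spawn_new_tree(B)
            return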
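If you want something you can actually run, here is a rough sketch of the same interleaving using generators. It assumes the graph is stored as a dict `adj` mapping each node to a set of neighbours, and that no node is `None`; `dfs_steps` and this `delete_edge` signature are made up for the example. It only detects the split and returns the side whose search finished first; updating the leaders is left to the caller.

```python
def dfs_steps(adj, start):
    """Yield the nodes reachable from start, one per step (iterative DFS)."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        yield node
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)


def delete_edge(adj, a, b):
    """Remove edge (a, b) from adj.

    Returns None if a and b are still connected, otherwise the set of nodes
    on the side whose search ran out first.
    """
    adj[a].discard(b)
    adj[b].discard(a)
    searches = {a: dfs_steps(adj, a), b: dfs_steps(adj, b)}
    visited = {a: set(), b: set()}
    target = {a: b, b: a}
    while True:
        # Advance each search by one node per round.
        for start in (a, b):
            node = next(searches[start], None)
            if node is None:
                # Search exhausted without reaching the other endpoint.
                return visited[start]
            if node == target[start]:
                return None  # still connected, nothing to update
            visited[start].add(node)


# Path 1-2-3-4-5: cutting (3, 2) separates {1, 2} from {3, 4, 5}.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
small_side = delete_edge(path, 3, 2)
```

If the old leader is in the returned set, the other side needs a new leader; otherwise the returned set does. Because both searches advance one node per round, the one that runs out first has visited no more nodes than the other.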
You'll find other alternatives on the Wikipedia page about dynamic connectivity, but this one seemed simple and fast enough.