The time complexity of Prim's algorithm is O(VlogV + ElogV). It seems like you understand how the VlogV came to be, so let's skip over that. So where does ElogV come from? Let's start by looking at Prim's algorithm's source code:
| MST-PRIM(Graph, weights, r)
1 | for each u ∈ Graph.V
2 | u.key ← ∞
3 | u.π ← NIL
4 | r.key ← 0
5 | Q ← Graph.V
6 | while Q ≠ Ø
7 | u ← EXTRACT-MIN(Q)
8 | for each v ∈ Graph.Adj[u]
9 | if v ∈ Q and weights(u, v) < v.key
10| v.π ← u
11| v.key ← weights(u, v)
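The pseudocode above can be sketched in runnable Python. This is a minimal illustration, not a reference implementation: it uses `heapq` (a binary heap) with the common lazy-deletion workaround, since `heapq` has no true DECREASE-KEY. Comments map back to the pseudocode line numbers.

```python
import heapq

def mst_prim(adj, weights, r):
    """adj: dict vertex -> list of neighbors; weights: dict (u, v) -> weight;
    r: root vertex. Returns pi, mapping each vertex to its MST parent."""
    key = {u: float('inf') for u in adj}      # lines 1-2: every key starts at infinity
    pi = {u: None for u in adj}               # line 3
    key[r] = 0                                # line 4
    in_q = set(adj)                           # line 5: Q holds all vertices
    heap = [(0, r)]                           # heapq stands in for the min-priority queue
    while heap:                               # line 6
        k, u = heapq.heappop(heap)            # line 7: EXTRACT-MIN
        if u not in in_q or k > key[u]:
            continue                          # stale heap entry (lazy deletion); skip it
        in_q.remove(u)
        for v in adj[u]:                      # line 8
            w = weights[(u, v)]
            if v in in_q and w < key[v]:      # line 9
                pi[v] = u                     # line 10
                key[v] = w                    # line 11
                heapq.heappush(heap, (w, v))  # DECREASE-KEY, via re-insertion

    return pi

# Example: the 4-vertex graph discussed below, with made-up weights
# (the figure gives no weights, so these are purely illustrative).
adj = {1: [2, 3, 4], 2: [1, 4], 3: [1, 4], 4: [1, 2, 3]}
weights = {}
for (u, v, w) in [(1, 2, 1), (1, 3, 2), (1, 4, 5), (2, 4, 3), (3, 4, 4)]:
    weights[(u, v)] = w
    weights[(v, u)] = w
parent = mst_prim(adj, weights, 1)
```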
Lines 8-11 are executed once for every element extracted from Q, and we know that there are V elements in Q (representing the set of all vertices). Line 8's loop iterates through all the neighbors of the currently extracted vertex; we will do the same for the next extracted vertex, and for the one after that. Prim's algorithm never extracts a vertex twice (EXTRACT-MIN removes it from Q), so eventually we extract every connected vertex and explore all of its neighbors. In other words, this loop will end up going through all the edges of the graph twice at some point (2E).
Why twice? Because at some point we come back to a previously explored edge from the other direction, and we can't rule it out until we've actually checked it. Fortunately, that constant 2 is dropped during our time complexity analysis, so the loop is really just doing O(E) work.
Why wasn't it V*V? You might reach that term if you just consider that we have to check each vertex and its neighbors, and that in the worst-case graph the number of neighbors approaches V. Indeed, in a dense graph V*V ≈ E (a complete graph has V(V-1)/2 edges, which is Θ(V²)). But the more accurate description of the work done by these two loops is "going through all the edges twice", so we refer to E instead. It's up to the reader to connect how sparse their graph is with this term's time complexity.
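To make the dense-graph case concrete, here is a quick check (for an arbitrary illustrative V) that a complete graph's edge count grows as V²:

```python
# A complete graph is the densest simple graph: every pair of vertices
# is joined by an edge, so E = V*(V-1)/2, which grows as Theta(V^2).
V = 100
E_complete = V * (V - 1) // 2  # 4950 for V = 100, roughly V*V/2
```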
Let's look at a small example graph with 4 vertices:
1--2
|\ |
| \|
3--4
Assume that Q will give us the nodes in the order 1, 2, 3, and then 4.
- In the first iteration of the outer loop, the inner loop will run 3 times (for 2, 3, and 4).
- In the second iteration of the outer loop, the inner loop runs 2 times (for 1 and 4).
- In the third iteration of the outer loop, the inner loop runs 2 times (for 1 and 4).
- In the last iteration of the outer loop, the inner loop runs 3 times (for 1, 2, and 3).
The total was 10 iterations, which is twice the number of edges (2*5).
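The tally above can be verified with a few lines of Python, using adjacency lists transcribed from the figure:

```python
# Adjacency lists for the 4-vertex example (edges: 1-2, 1-3, 1-4, 2-4, 3-4).
adj = {1: [2, 3, 4], 2: [1, 4], 3: [1, 4], 4: [1, 2, 3]}

# The inner loop scans each extracted vertex's whole adjacency list, so the
# total number of inner iterations is the sum of all degrees; the handshake
# lemma says that sum equals 2E.
total_inner_iterations = sum(len(neighbors) for neighbors in adj.values())
num_edges = total_inner_iterations // 2
```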
Extracting the minimum and updating a vertex's key (usually done with a binary min-heap, where each EXTRACT-MIN and DECREASE-KEY costs log(V)) happens inside the loops: the key update on line 11 sits inside the inner loop, so it can run up to once per edge visit and is therefore controlled by the time complexity of both loops. The complete time complexity for this phase of the algorithm is thus O(2*E*log(V)). Dropping the constant yields O(E*log(V)). (A Fibonacci heap improves DECREASE-KEY to O(1) amortized, giving O(E + VlogV) overall; the familiar O(ElogV) bound comes from the binary-heap analysis.)
Given that the total time complexity of the algorithm is O(VlogV + ElogV), we can simplify to O((V+E)logV). In a connected graph E ≥ V-1, so V = O(E) and we can conclude O(ElogV).