
I'm trying to find out the complexity of the cycle detection that Kruskal's algorithm performs with disjoint sets.

In Kruskal's algorithm, as far as I understand, we put the vertices into disjoint sets and unite two sets whenever we add an edge whose endpoints lie in different sets. That is how cycles are detected: if both endpoints are already in the same set, we don't add the edge.

How do we check which set a vertex is in? In my opinion this takes O(n), so Kruskal's would be at least O(n^2) plus the O(n log n) time for sorting the edges.

For example, in this SO answer, which covers my problem too, they say that Kruskal's can run in O(V+E) when there are only two edge weights. I get that part, but I don't get the disjoint sets; I think they make the complexity worse.

Milano
  • Have you tried http://google.com/?#q=kruskal? – Gassa May 12 '16 at 07:37
  • Yes, of course. I have a problem with disjoint sets. – Milano May 12 '16 at 07:42
  • If you follow the googled link to Wikipedia, there is a link to the disjoint-set data structure in the article. Please read that article. – Gassa May 12 '16 at 07:53
  • The problem is, your question is currently ambiguous. It is possible to implement disjoint sets trivially, then uniting two sets will be O(n). It is possible to use the aforementioned data structure and get as low as inverse Ackermann function of n as the complexity. It is even possible to implement it worse than O(n). You do not specify which disjoint-set implementation you are talking about, so we can't guess your complexity. – Gassa May 12 '16 at 07:55
  • In fact, my problem is that I'm computing the time complexity of Kruskal's algorithm where the edge weights are from {1,2}. This means that we can sort the edges in linear time. I've read that in this case Kruskal's complexity can be O(|E|+|V|), which I don't get, because |E| is spent on sorting the edges and |V| should be the time of adding edges. If I am right, then all the cycle checking has to be done in linear time. That's why I'm asking about disjoint sets. – Milano May 12 '16 at 08:05
  • O(|E|+|V|) is not just one |E| plus one |V|. It just means that an upper bound is some constant multiplied by |E|+|V|. It can be read as "|E|+|E|+|V|" if there are two |E| steps. – Gassa May 12 '16 at 08:09
  • Yes, but checking for a loop has to be done |V| times, so if each check is not constant time, the complexity should be at least |V|*|V| or |V|*log(|V|) – Milano May 12 '16 at 08:11
  • I see now. I'd say the complexity with Disjoint-Set data structure is O(|E| + |V|α(|V|)). Perhaps, since α(|V|) is very small for foreseeable inputs, it was considered constant... – Gassa May 12 '16 at 08:17
  • Yes, probably; that's what I'm afraid of. It's not linear in the exact sense. However, I've found another way to find a minimum spanning tree in O(E+V), but I think there is a mistake there too. They say that they just keep choosing the smallest edges, but that's not correct, since you have to maintain a single component while building the tree. http://stackoverflow.com/a/17496326/3371056 So you would have to search those two lists in linear time. – Milano May 12 '16 at 08:20
  • Hmm, right, there is actually an O(|E|+|V|) solution. For example, it can be expressed as 1-2 BFS, similar to 0-1 BFS. Start from any vertex and maintain a deque of edges and a boolean array to track which vertices are visited. When you visit a new vertex, add all weight-1 edges from it to the top of the deque and all weight-2 edges to the bottom of the deque. When you take an edge from the top of the deque, if it is to a non-visited vertex, visit that vertex and add the edge to the tree. Otherwise, ignore the edge. Unfortunately, this algorithm is hardly an answer to your original question. – Gassa May 12 '16 at 08:59
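
For reference, the deque idea from the last comment can be sketched roughly as follows. This is only a minimal sketch, not code from the thread: the function name `mst_weights_1_2`, the adjacency-list format, and the assumption that the graph is connected and undirected are all mine. It is essentially Prim's algorithm with a deque standing in for the priority queue, which works because weight-1 edges always sit in front of weight-2 edges.

```python
from collections import deque

def mst_weights_1_2(n, adj, start=0):
    """Hypothetical helper: adj[u] is a list of (v, w) pairs with w in {1, 2}.

    Assumes an undirected, connected graph on vertices 0..n-1.
    Returns the chosen tree edges as (u, v, w) tuples.
    """
    visited = [False] * n
    dq = deque()   # weight-1 edges live at the front, weight-2 edges at the back
    tree = []

    def visit(u):
        visited[u] = True
        for v, w in adj[u]:
            if visited[v]:
                continue
            if w == 1:
                dq.appendleft((u, v, w))  # cheap edges go to the front
            else:
                dq.append((u, v, w))      # expensive edges go to the back

    visit(start)
    while dq:
        u, v, w = dq.popleft()
        if visited[v]:
            continue                      # stale edge: both endpoints already in the tree
        tree.append((u, v, w))
        visit(v)
    return tree
```

Each edge is pushed and popped at most twice (once per direction), so the whole run is O(|V| + |E|) with no disjoint-set structure involved.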

1 Answer


You can check which set a vertex belongs to by querying the disjoint-set structure: two vertices are in the same set exactly when a find on each of them returns the same root.

With union by rank and path compression, disjoint-set operations take O(α(n)) amortized time each. Here α is the inverse Ackermann function, which is less than 5 for any input you, or anyone else, will ever use.

A disjoint-set structure is a forest of trees. The root of each tree uniquely identifies a set. Disjoint sets achieve their black magic in two ways. When two sets are joined, the root of the shallower tree is attached under the root of the deeper one (union by rank), which keeps paths to the root short. Additionally, whenever a membership check (a find) is performed on a node, every node on the path to the root is re-pointed directly at the root (path compression), so that a following check on any of those nodes takes O(1) time.
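
To make the forest picture concrete, here is a minimal sketch of such a structure. The class name `DisjointSet` and its method names are my own choices for illustration, and elements are assumed to be the integers 0..n-1.

```python
class DisjointSet:
    """Disjoint-set forest with union by rank and path compression (a sketch)."""

    def __init__(self, n):
        self.parent = list(range(n))  # every element starts as its own root
        self.rank = [0] * n           # upper bound on the height of each tree

    def find(self, x):
        # First pass: walk up to the root of x's tree.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every node on the path directly at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False              # same set already: the edge would close a cycle
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra           # make ra the root of the deeper tree
        self.parent[rb] = ra          # attach the shallower tree under the deeper one
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True
```

In Kruskal's algorithm, `union(u, v)` returning `False` is exactly the "this edge would create a cycle" signal, so no separate scan over a set's members is ever needed.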

The upshot is that Kruskal's algorithm spends O(α(n)) amortized time on each set operation, which is, practically speaking, constant.
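
For the case from the question, where every edge weight is 1 or 2, the sort can be replaced by two buckets. The following is a rough sketch under that assumption; the function name `kruskal_two_weights` and the edge format `(u, v, w)` are hypothetical, and it uses union by size with path halving, a common variant with the same near-constant amortized bound.

```python
def kruskal_two_weights(n, edges):
    """Sketch: edges is an iterable of (u, v, w) with w in {1, 2}; vertices are 0..n-1."""
    parent = list(range(n))
    size = [1] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving: skip one level on the way up
            x = parent[x]
        return x

    # Bucket the edges by weight instead of sorting: O(|E|).
    buckets = {1: [], 2: []}
    for u, v, w in edges:
        buckets[w].append((u, v, w))

    tree = []
    for w in (1, 2):                       # process all weight-1 edges first
        for u, v, _ in buckets[w]:
            ru, rv = find(u), find(v)
            if ru == rv:
                continue                   # endpoints already connected: skip, would form a cycle
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru                # attach the smaller tree under the larger one
            size[ru] += size[rv]
            tree.append((u, v, w))
    return tree
```

The bucketing is linear, and the set operations cost O((|V| + |E|) α(|V|)) in total, which is why this is usually described as linear in practice even though it is not strictly O(|V| + |E|).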

parvin
Richard