Time Complexity of Array based Disjoint-Set data structure

Question

I was solving this question on CodeChef and going through the editorial.

Here's the pseudo-code for the implemented disjoint-set algorithm :

Initialize parent[i] = i  
Let S[i] denote the initial array.

int find(int i)
    int j
    if(parent[i]==i)
        return i
    else
        j=find(parent[i])
        //Path Compression Heuristics
        parent[i]=j
        return j

set_union(int i,int j)
    int x1,y1
    x1=find(i)
    y1=find(j)
    //parent of both of them will be the one with the highest score
    if(S[x1]>S[y1])
        parent[y1]=x1
    else if ( S[x1] < S[y1])
        parent[x1]=y1

solve()
    if(query == 0)
        Input x and y
        px = find(x)
        py = find(y)
        if(px == py)
            print "Invalid query!"
        else
            set_union(px,py)
    else
        Input x.
        print find(x)

What is the time complexity of union and find?

IMO, the time complexity of find is O(depth), so in the worst case, if I am not using path compression, it is O(n). Since union also calls find, it is O(n) as well. If we instead drop the find calls from union and pass it the roots of the two sets directly, union is O(1). Please correct me if I am wrong.
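
To make the O(depth) claim concrete, here is a minimal Python sketch (not the editorial's code) of find without path compression. The chain-shaped `parent` array is a hypothetical worst case; it assumes scores that increase with the index, so that each union hangs the old root under the next element. The helper name `find_no_compression` is made up for illustration.

```python
def find_no_compression(parent, i):
    """Follow parent pointers to the root; return (root, steps walked)."""
    steps = 0
    while parent[i] != i:
        i = parent[i]
        steps += 1
    return i, steps


n = 1000
# Hypothetical worst case: a chain 0 -> 1 -> ... -> n-1, with n-1 as the root.
parent = [min(i + 1, n - 1) for i in range(n)]

root, steps = find_no_compression(parent, 0)
print(root, steps)  # 999 999
```

Every find on the deepest element walks n - 1 links here, which is where the O(n) bound per operation comes from.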

If path compression is applied, then what is the time complexity?

score 1 · Answer 1 · answered Apr 28 '19 at 14:36

Without path compression: when we use the linked-list representation of disjoint sets and the weighted-union heuristic, a sequence of m MAKE-SET, UNION, and FIND-SET operations, n of which are MAKE-SET operations, takes $O(m + n \log n)$ time.

With only path compression: the worst-case running time is $\Theta(n + f \cdot (1 + \log_{2 + f/n} n))$, where f is the number of FIND-SET operations and n is the number of MAKE-SET operations.

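With both union by rank and path compression: $O(m \, \alpha(n))$, where $\alpha(n)$ is the inverse Ackermann function, which grows so slowly that $\alpha(n) \le 4$ for all practical values of n.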
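
The third case can be sketched in Python as follows. This is an illustrative array-based version that assumes linking by rank; the pseudo-code in the question links by the score array S instead, so this is not the editorial's method. The class name `DisjointSet` and its method names are made up for the example.

```python
class DisjointSet:
    """Array-based disjoint set with union by rank and path compression."""

    def __init__(self, n):
        self.parent = list(range(n))  # every element starts as its own root
        self.rank = [0] * n           # upper bound on the height of each tree

    def find(self, i):
        # First pass: locate the root.
        root = i
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: path compression, point every visited node at the root.
        while self.parent[i] != root:
            nxt = self.parent[i]
            self.parent[i] = root
            i = nxt
        return root

    def union(self, a, b):
        """Merge the sets of a and b; return False if already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra           # make ra the root of higher rank
        self.parent[rb] = ra          # hang the shallower tree under ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1        # height can only grow on a tie
        return True


ds = DisjointSet(6)
ds.union(0, 1)
ds.union(2, 3)
ds.union(1, 3)
print(ds.find(0) == ds.find(2))  # True
print(ds.find(4) == ds.find(0))  # False
```

The iterative two-pass find avoids deep recursion on long chains; a recursive find as in the question behaves the same way on small inputs.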

Where can I find the proof of the complexity $\Theta(n + f \cdot (1 + \log_{2 + f/n} n))$? – Ander, Aug 09 '22 at 13:04

score 0 · Answer 2 · answered Dec 25 '15 at 07:02


The time complexity of both union and find would be linear if you use neither ranks nor path compression, because in the worst case, it would be necessary to iterate through the entire tree in every query.

If you use only union by rank, without path compression, the complexity would be logarithmic.
The detailed proof is quite difficult to understand, but basically you wouldn't traverse the entire tree, because the depth of a tree only increases when the ranks of the two sets being merged are equal, so a tree of depth d has at least 2^d elements. So each query would take O(log n).
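
As a rough check of the logarithmic bound, the Python sketch below uses union by rank only (no path compression) and merges equal-sized sets pairwise, which is the pattern that makes the depth grow as fast as it can. The function names and the choice of n = 1024 are assumptions made for the example.

```python
import math


def find(parent, i):
    # Plain find: walk up to the root, no path compression.
    while parent[i] != i:
        i = parent[i]
    return i


def union_by_rank(parent, rank, a, b):
    ra, rb = find(parent, a), find(parent, b)
    if ra == rb:
        return
    if rank[ra] < rank[rb]:
        ra, rb = rb, ra
    parent[rb] = ra              # lower-rank root goes under higher-rank root
    if rank[ra] == rank[rb]:
        rank[ra] += 1            # depth grows only when ranks are equal


def depth(parent, i):
    d = 0
    while parent[i] != i:
        i = parent[i]
        d += 1
    return d


n = 1024
parent = list(range(n))
rank = [0] * n

# Merge sets of size 1, then 2, then 4, ... so every union is a rank tie.
step = 1
while step < n:
    for i in range(0, n, 2 * step):
        union_by_rank(parent, rank, i, i + step)
    step *= 2

max_depth = max(depth(parent, i) for i in range(n))
print(max_depth, math.log2(n))  # 10 10.0
```

Even with every union forced to be a tie, the deepest element sits log2(n) links below the root, so a find never walks more than that.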

If you use the path compression optimization, the complexity would be even lower, because it "flattens" the tree, thus shortening later traversals. Combined with union by rank, its amortized time per operation is even faster than O(log n), as you can read here.
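
The flattening effect can be seen in a small Python sketch that mirrors the recursive find from the question. The chain-shaped starting array is a hypothetical input chosen to show the effect clearly.

```python
def find(parent, i):
    """Recursive find with path compression, as in the question's pseudo-code."""
    if parent[i] == i:
        return i
    parent[i] = find(parent, parent[i])  # re-point i directly at the root
    return parent[i]


n = 8
# Hypothetical starting shape: a chain 0 -> 1 -> ... -> 7, with 7 as the root.
parent = [min(i + 1, n - 1) for i in range(n)]

find(parent, 0)
print(parent)  # [7, 7, 7, 7, 7, 7, 7, 7]
```

The first find still pays for the whole chain, but afterwards every element on that path is one step from the root, which is why the cost is stated as amortized rather than per call.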


Emerson Leonardo Lucena


Can you please explain this in terms of the array-based approach, with reference to the pseudo-code posted? I want to know the time complexity of that particular approach. – Naveen Dec 25 '15 at 07:33
Ok, so basically you want a union-find _with_ path compression _but without_ union by rank. Its complexity is O(log n) amortized per query, because even in the worst case, the new parent of a dish is the root of the tree, so in the next query the path would already be compressed. You can see it [here](https://books.google.ru/books?id=oK3UWxg_UhsC&pg=PA224&lpg=PA224&dq=%22without+union+by+rank%22&source=bl&ots=iGGO7XKp6L&sig=U76thEmxJZ4oadMQPRbW8sA9tDI&hl=ru&ei=PDeFS5OFNJPw-QbQxIyeAQ&sa=X&oi=book_result&ct=result#v=onepage&q=%22without%20union%20by%20rank%22&f=false) – Emerson Leonardo Lucena Dec 25 '15 at 08:05
