How do I use the union-find data structure to group strings?

Question

I have used Union-Find (disjoint set) for a lot of graph problems and know how it works, but almost always with integers. While solving this LeetCode problem I need to group strings, and I am thinking of using Union-Find for it. I do not know how to use it with strings, though. Looking for suggestions.

Suggestion: Instead of doing union-find on strings itself, do it on the index of the strings. — Shridhar R Kulkarni, Jan 21 '20 at 13:23
I eventually ended up doing that, but I am wondering if I had to use this structure with Strings, what should be the approach ? — Ezio, Jan 21 '20 at 15:26
[Here](https://codereview.stackexchange.com/questions/49186/generic-implementation-of-the-quick-union-algorithm-with-path-compression) is a generic implementation. — jrook, Jan 21 '20 at 16:42
You can implement the standard array-based union-find data structures using a hashtable instead of an array to map the nodes to their parents. The keys in the hashtable can then be any hashable type. — kaya3, Jan 21 '20 at 19:32

score 0 · Answer 1 · answered Nov 14 '21 at 08:33

TL;DR: Use the same union-find code you would use for integers, but store each element's parent in a hash map instead of an array. This approach generalizes to any data type that can be used as a hash map key, not just strings; in the code below, the two unordered maps could have something other than strings or ints as keys.

#include <string>
#include <unordered_map>
using namespace std;

class UnionFind {
public:
    // Returns the representative (root) of the set containing s.
    // A string seen for the first time is registered as its own singleton set;
    // without this step, parent[s] would default to "" and sz[s] to 0.
    string find(string s) {
        if (!parent.count(s)) {
            parent[s] = s;
            sz[s] = 1;
            return s;
        }
        string stringForPathCompression = s;
        while (parent[s] != s) s = parent[s];

        // Path compression: point every node on the walked path directly at the root,
        // which reduces the time complexity of later lookups.
        while (stringForPathCompression != s) {
            string temp = parent[stringForPathCompression];
            parent[stringForPathCompression] = s;
            stringForPathCompression = temp;
        }
        return s;
    }
    void unify(const string& s1, const string& s2) {
        string rootS1 = find(s1), rootS2 = find(s2);
        if (rootS1 == rootS2) return;

        // Union by size: attach the smaller component to the bigger one,
        // which keeps the trees shallow and reduces the time complexity.
        if (sz[rootS1] < sz[rootS2]) {
            parent[rootS1] = rootS2;
            sz[rootS2] += sz[rootS1];
        } else {
            parent[rootS2] = rootS1;
            sz[rootS1] += sz[rootS2];
        }
    }
private:
    // If we were storing numbers in our union find, both of the hash maps below could be arrays.
    unordered_map<string, int> sz; // Size of the component rooted at each key.
    unordered_map<string, string> parent;
};
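
To illustrate the claim that the approach is not specific to strings, here is a minimal sketch of the same idea as a template. The names `GenericUnionFind`, `unite`, `connected`, `componentSize`, and `usageExample` are made up for this example, and it assumes the key type `T` has a `std::hash` specialization and supports `==` and `!=`.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical generic variant: T is any type usable as an unordered_map key.
template <typename T>
class GenericUnionFind {
public:
    // Returns the root of x's set; an unseen x becomes its own singleton set.
    T find(const T& x) {
        if (parent_.find(x) == parent_.end()) {
            parent_.emplace(x, x);
            size_.emplace(x, 1);
            return x;
        }
        T root = x;
        while (parent_.at(root) != root) root = parent_.at(root);
        // Path compression: repoint every node on the path at the root.
        T cur = x;
        while (cur != root) {
            T next = parent_.at(cur);
            parent_[cur] = root;
            cur = next;
        }
        return root;
    }

    // Merges the sets of a and b; returns false if they were already merged.
    bool unite(const T& a, const T& b) {
        T ra = find(a), rb = find(b);
        if (ra == rb) return false;
        if (size_[ra] < size_[rb]) std::swap(ra, rb);  // union by size
        parent_[rb] = ra;
        size_[ra] += size_[rb];
        return true;
    }

    bool connected(const T& a, const T& b) { return find(a) == find(b); }
    int componentSize(const T& x) { return size_[find(x)]; }

private:
    std::unordered_map<T, T> parent_;
    std::unordered_map<T, int> size_;
};

// Usage sketch with strings as elements.
inline void usageExample() {
    GenericUnionFind<std::string> words;
    words.unite("apple", "apricot");
    words.unite("apricot", "avocado");
    assert(words.connected("apple", "avocado"));
    assert(!words.connected("apple", "banana"));
    assert(words.componentSize("apple") == 3);
}
```

Registering elements lazily inside `find` means callers never need a separate "add" step, at the cost of one extra hash lookup per call.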

score 0 · Answer 2 · answered Apr 09 '22 at 05:20

Union-Find doesn't really care what kind of data the elements hold. Decide in your main code which strings should be merged, map each string to a representative value (for example, its index in the input), and run union-find on those values.
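
One way to read this suggestion, sketched below: map each string to its index, run an ordinary integer union-find on the indices, and translate back to strings at the end. `IndexUnionFind` and `groupStrings` are hypothetical names for this example, and the `related` pairs stand in for whatever rule your problem uses to decide that two strings belong together. The sketch assumes the input strings are distinct, that every string in `related` also appears in `words`, and a C++17 compiler.

```cpp
#include <cassert>
#include <numeric>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Plain integer union-find over indices 0..n-1.
struct IndexUnionFind {
    std::vector<int> parent, size;

    explicit IndexUnionFind(int n) : parent(n), size(n, 1) {
        std::iota(parent.begin(), parent.end(), 0);  // each index is its own root
    }
    int find(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving
            x = parent[x];
        }
        return x;
    }
    void unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return;
        if (size[a] < size[b]) std::swap(a, b);  // union by size
        parent[b] = a;
        size[a] += size[b];
    }
};

// Hypothetical helper: groups `words` given pairs of strings that belong together.
std::vector<std::vector<std::string>> groupStrings(
    const std::vector<std::string>& words,
    const std::vector<std::pair<std::string, std::string>>& related) {
    const int n = static_cast<int>(words.size());

    std::unordered_map<std::string, int> indexOf;
    for (int i = 0; i < n; ++i) indexOf.emplace(words[i], i);
    assert(static_cast<int>(indexOf.size()) == n && "words are assumed distinct");

    IndexUnionFind uf(n);
    // at() throws std::out_of_range if a related string is missing from words.
    for (const auto& [a, b] : related) uf.unite(indexOf.at(a), indexOf.at(b));

    // Bucket the strings by root, keeping groups in order of first appearance.
    std::unordered_map<int, int> groupOfRoot;
    std::vector<std::vector<std::string>> groups;
    for (int i = 0; i < n; ++i) {
        auto [it, inserted] =
            groupOfRoot.emplace(uf.find(i), static_cast<int>(groups.size()));
        if (inserted) groups.emplace_back();
        groups[it->second].push_back(words[i]);
    }
    return groups;
}
```

Compared with keying the parent map by string, this hashes each string only once while building `indexOf`, and all union-find operations then work on plain integers.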

JamesGreen31