I have been using Union-Find (disjoint set) for a lot of graph problems and know how it works, but I have almost always used it with integers. While solving this LeetCode problem I need to group strings, and I am thinking of using Union-Find for that, but I do not know how to use it with strings. Looking for suggestions.
- Suggestion: Instead of doing union-find on the strings themselves, do it on the indices of the strings. – Shridhar R Kulkarni Jan 21 '20 at 13:23
- I eventually ended up doing that, but I am wondering: if I had to use this structure with strings, what should the approach be? – Ezio Jan 21 '20 at 15:26
- [Here](https://codereview.stackexchange.com/questions/49186/generic-implementation-of-the-quick-union-algorithm-with-path-compression) is a generic implementation. – jrook Jan 21 '20 at 16:42
- You can implement the standard array-based union-find data structure using a hashtable instead of an array to map the nodes to their parents. The keys in the hashtable can then be any hashable type. – kaya3 Jan 21 '20 at 19:32
2 Answers
TLDR: Use the same union-find code you would use for integers, but store the parent of each element in a hash map instead of an array. This approach generalizes to any data type that can be used as a hash map key, not just strings; in the code below, the two unordered maps could have keys of some type other than string or int.
    #include <string>
    #include <unordered_map>
    using namespace std;

    class UnionFind {
    public:
        // Returns the representative (root) of the set containing s.
        string find(string s) {
            // Register unseen strings as singleton sets. Without this, operator[]
            // default-inserts an empty parent and the lookup ends at "".
            if (!parent.count(s)) { parent[s] = s; sz[s] = 1; }
            string stringForPathCompression = s;
            while (parent[s] != s) s = parent[s];
            // Path compression: point every node on the walked path directly at the
            // root, which reduces the time complexity of later lookups.
            while (stringForPathCompression != s) {
                string temp = parent[stringForPathCompression];
                parent[stringForPathCompression] = s;
                stringForPathCompression = temp;
            }
            return s;
        }
        void unify(string s1, string s2) {
            string rootS1 = find(s1), rootS2 = find(s2);
            if (rootS1 == rootS2) return;
            // Union by size: attach the smaller component to the bigger one,
            // which keeps the trees shallow and reduces the time complexity.
            if (sz[rootS1] < sz[rootS2]) { parent[rootS1] = rootS2; sz[rootS2] += sz[rootS1]; }
            else { parent[rootS2] = rootS1; sz[rootS1] += sz[rootS2]; }
        }
    private:
        // If we were storing numbers in our union find, both of the hash maps below could be arrays.
        unordered_map<string, int> sz;        // size of the set rooted at each root
        unordered_map<string, string> parent; // parent of each element; a root is its own parent
    };
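
To illustrate the claim that the key type does not have to be a string, here is a minimal sketch of the same idea written as a template. The names `GenericUnionFind` and `connected` are made up for this example, and it assumes `T` is copyable, has a `std::hash<T>` specialization, and supports `operator==`. Elements are registered as singleton sets the first time they are seen.

```cpp
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical generic variant of the hash-map-backed union-find.
// Assumes T is copyable, hashable via std::hash<T>, and comparable with ==.
template <typename T>
class GenericUnionFind {
public:
    // Returns the representative of x's set; registers x as a singleton if unseen.
    T find(const T& x) {
        if (parent_.find(x) == parent_.end()) {
            parent_.emplace(x, x);
            size_.emplace(x, 1);
            return x;
        }
        // Walk up to the root.
        T root = x;
        while (!(parent_.at(root) == root)) root = parent_.at(root);
        // Path compression: repoint every node on the path at the root.
        T cur = x;
        while (!(cur == root)) {
            T next = parent_.at(cur);
            parent_.at(cur) = root;
            cur = next;
        }
        return root;
    }

    // Merges the sets containing a and b (union by size).
    // Returns false if they were already in the same set.
    bool unify(const T& a, const T& b) {
        T ra = find(a), rb = find(b);
        if (ra == rb) return false;
        if (size_.at(ra) < size_.at(rb)) std::swap(ra, rb);
        parent_.at(rb) = ra;               // attach smaller tree under larger
        size_.at(ra) += size_.at(rb);
        return true;
    }

    // True if a and b currently belong to the same set.
    bool connected(const T& a, const T& b) { return find(a) == find(b); }

private:
    std::unordered_map<T, T> parent_;  // a root maps to itself
    std::unordered_map<T, int> size_;  // meaningful only for roots
};
```

With `GenericUnionFind<std::string> uf;`, calling `uf.unify("a", "b")` and then `uf.connected("a", "b")` should report the two strings as grouped; a type without a standard hash would need its own `std::hash` specialization first.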

Hisham Hijjawi
Union-Find doesn't really care what kind of data the objects hold. You can decide in your main code which strings to union, and then run union-find on values that represent them (for example, an integer id assigned to each string).
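
One way to read this, sketched below under my own assumptions, is to give every distinct string an integer id and run an ordinary integer union-find over those ids. `IntUnionFind` and `groupStrings` are hypothetical names, and the `links` parameter (pairs of strings that belong together) stands in for whatever grouping rule your main code applies.

```cpp
#include <map>
#include <numeric>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Ordinary integer union-find over ids 0..n-1.
class IntUnionFind {
public:
    explicit IntUnionFind(int n) : parent_(n), size_(n, 1) {
        std::iota(parent_.begin(), parent_.end(), 0);  // each id is its own root
    }
    int find(int x) {
        while (parent_[x] != x) {
            parent_[x] = parent_[parent_[x]];  // path halving
            x = parent_[x];
        }
        return x;
    }
    void unify(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return;
        if (size_[a] < size_[b]) std::swap(a, b);  // union by size
        parent_[b] = a;
        size_[a] += size_[b];
    }
private:
    std::vector<int> parent_, size_;
};

// Hypothetical helper: groups strings given pairs that must share a group.
std::vector<std::vector<std::string>> groupStrings(
    const std::vector<std::string>& words,
    const std::vector<std::pair<std::string, std::string>>& links) {
    std::unordered_map<std::string, int> id;  // string -> dense integer id
    std::vector<std::string> names;           // id -> string
    auto idOf = [&](const std::string& s) {
        auto res = id.emplace(s, static_cast<int>(names.size()));
        if (res.second) names.push_back(s);   // first time we see s
        return res.first->second;
    };
    for (const auto& w : words) idOf(w);
    for (const auto& l : links) { idOf(l.first); idOf(l.second); }

    IntUnionFind uf(static_cast<int>(names.size()));
    for (const auto& l : links) uf.unify(id.at(l.first), id.at(l.second));

    // Collect members per root; std::map keeps the group order deterministic.
    std::map<int, std::vector<std::string>> byRoot;
    for (int i = 0; i < static_cast<int>(names.size()); ++i)
        byRoot[uf.find(i)].push_back(names[i]);

    std::vector<std::vector<std::string>> groups;
    for (auto& kv : byRoot) groups.push_back(std::move(kv.second));
    return groups;
}
```

The union-find itself never touches a string here; only the id table does, which keeps the hot loops on plain integer arrays.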

JamesGreen31