I've implemented a disjoint set data structure for my program and I realized I need to iterate over all equivalence classes.

Searching the web, I didn't find any useful information on the best way to implement that or how it influences complexity. I'm quite surprised since it seems like something that would be needed quite often.

Is there a standard way of doing this? I'm thinking about using a linked list (I use C so I plan to store some pointers in the top element of each equivalence class) and updating it on each union operation. Is there a better way?

martinkunev
  • Are you aiming to iterate over the *representatives* from the equivalence classes, or the *elements* of the equivalence classes? – templatetypedef Mar 16 '20 at 20:37
  • @templatetypedef Currently I need just 1 representative from each class. I think in the future I may need to iterate over all elements in a class (given a representative). – martinkunev Mar 16 '20 at 20:54
  • If you only have to do it at the end, and just iterating over all the sets and picking out the roots won't do, the usual procedure is to use your disjoint set structure to build something else that you can easily iterate, like a map from set -> list. – Matt Timmermans Mar 16 '20 at 22:19
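
To make the last comment concrete, here is a minimal sketch in C of both ideas: scanning for roots, and building a "root -> list of members" map in one pass at the end. It assumes an array-based forest rather than the pointer-based one from the question, and all names (`N`, `parent`, `head`, `next_in_class`, `collect_roots`, `build_class_lists`) are made up for illustration.

```c
#include <assert.h>

#define N 6                    /* hypothetical number of elements */

static int parent[N];          /* parent[x] == x means x is a root */
static int head[N];            /* head[r]: first member of the class rooted at r, or -1 */
static int next_in_class[N];   /* next member of the same class, or -1 */

static void init(void) {
    for (int i = 0; i < N; i++)
        parent[i] = i;
}

/* Find with path halving. */
static int find(int x) {
    assert(x >= 0 && x < N);
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

/* Plain union (no rank or size heuristic, to keep the sketch short). */
static void unite(int a, int b) {
    a = find(a);
    b = find(b);
    if (a != b)
        parent[b] = a;
}

/* One representative per class: O(N) scan for self-parented elements. */
static int collect_roots(int roots[]) {
    int count = 0;
    for (int i = 0; i < N; i++)
        if (parent[i] == i)
            roots[count++] = i;
    return count;
}

/* Build the "root -> list of members" map as singly-linked lists.
 * Walking i downwards leaves every list in ascending order. */
static void build_class_lists(void) {
    for (int i = 0; i < N; i++)
        head[i] = -1;
    for (int i = N - 1; i >= 0; i--) {
        int r = find(i);
        next_in_class[i] = head[r];
        head[r] = i;
    }
}
```

After `build_class_lists()`, the members of the class with root `r` can be walked with `for (int i = head[r]; i != -1; i = next_in_class[i])`. The lists go stale as soon as another union happens, so this only fits the "iterate at the end" case.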

2 Answers


You can store pointers to the top elements in a hash-based set or in any balanced binary search tree. You only need to delete and add elements: both operations run in expected O(1) in a hash set and in O(log N) in a balanced BST. In a separate linked list, deleting takes O(N) because you first have to search for the element.

ardenit
  • I was thinking more along the lines of storing a pointer to the next class in the top element of each class. This way union should still be inverse_ackermann(N). – martinkunev Mar 16 '20 at 20:56
  • I think the OP's idea is to thread the linked list through the elements rather than to have a separate linked list of the representatives. That would allow for O(1) splice-outs rather than O(n) search-and-removes. – templatetypedef Mar 16 '20 at 20:58

Your proposal seems very reasonable. If you thread a doubly-linked list through the representatives, then each union, which turns exactly one representative into a non-root, can splice that element out of the representatives list in time O(1). You then walk the list each time you need to list the representatives.
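
A minimal sketch of what that could look like in C, assuming pointer-based nodes with union by rank and path halving. The struct layout and all names (`struct forest`, `prev_rep`, `next_rep`, `unlink_rep`, `count_classes`) are hypothetical, not taken from the question's code.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *parent;               /* parent == self means this is a root */
    int rank;
    struct node *prev_rep, *next_rep;  /* list links, meaningful only for roots */
};

struct forest {
    struct node *reps;                 /* head of the list of representatives */
};

/* A new singleton is its own representative: push it onto the list. */
static void make_set(struct forest *f, struct node *n) {
    n->parent = n;
    n->rank = 0;
    n->prev_rep = NULL;
    n->next_rep = f->reps;
    if (f->reps)
        f->reps->prev_rep = n;
    f->reps = n;
}

/* Find with path halving. */
static struct node *find(struct node *n) {
    while (n->parent != n) {
        n->parent = n->parent->parent;
        n = n->parent;
    }
    return n;
}

/* O(1) splice-out of a node that is about to stop being a root. */
static void unlink_rep(struct forest *f, struct node *n) {
    if (n->prev_rep)
        n->prev_rep->next_rep = n->next_rep;
    else
        f->reps = n->next_rep;
    if (n->next_rep)
        n->next_rep->prev_rep = n->prev_rep;
    n->prev_rep = n->next_rep = NULL;
}

/* Union by rank; the losing root leaves the representatives list. */
static struct node *unite(struct forest *f, struct node *a, struct node *b) {
    a = find(a);
    b = find(b);
    if (a == b)
        return a;
    if (a->rank < b->rank) {
        struct node *t = a;
        a = b;
        b = t;
    }
    if (a->rank == b->rank)
        a->rank++;
    b->parent = a;
    unlink_rep(f, b);
    return a;
}

/* Walk the list: one visit per equivalence class. */
static size_t count_classes(const struct forest *f) {
    size_t c = 0;
    for (const struct node *r = f->reps; r; r = r->next_rep)
        c++;
    return c;
}
```

The splice adds only constant work to each union, so the amortized bounds of the underlying structure are unchanged, and iterating costs time proportional to the number of classes rather than the number of elements.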

@ardenit has mentioned that you can also use an external hash table or BST to store the representatives. That's certainly simpler to code up, though I suspect it won't be as fast as just threading a linked list through the items.

templatetypedef