
Say we have a family of (not necessarily disjoint) sets S = {S_{1}, S_{2}, ..., S_{n}}, where each set has size O(k). I need a hash function h : S -> P (P could be anything) such that h(S_{i}) = h(S_{j}) if and only if S_{i} and S_{j} intersect.

I want this hash function to be fast, i.e. computing h(S_{i}) from S_{i} should take O(1), O(log k), or similar time.

Can you tell me any of the following:

  1. Lower bound on complexity of such a function?
  2. Why such a function above cannot be achieved?
  3. Precise definition of such a hash function?
  4. Advice on why a hash function might not be the best tool here, and whether something else would be better?
  • What is it that you actually want to check from a business logic point of view? Maybe a [union-find data structure](https://en.wikipedia.org/wiki/Disjoint-set_data_structure) is what you're looking for? – Mushroomator Apr 24 '23 at 21:54
  • I'm aware of Union-Find, thank you for bringing that up. But like I said, what I want is a quick way to check, given a family of sets, whether two query sets intersect. My guess is that there's a hash-function-based method to do this. The purpose of the question is therefore to learn that method, or a better approach. – user472374 Apr 25 '23 at 02:04
  • If you thought of your family of sets as a graph whose vertices were the sets and whose edges were the intersection relationship, then your hash function would basically say for each set (vertex) which transitive closure (connected component) it belonged to. Technically speaking, though, since you do not need the converse to be true, a constant function - h(S) = c - satisfies the other criteria (it just isn't very useful). – Patrick87 Apr 25 '23 at 13:31
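
To make the graph view from the comment above concrete, here is a minimal Python sketch. The function name `component_labels` and the sample family are made up for illustration and are not part of the question. It assumes the elements are hashable and labels each set with its connected component in the intersection graph, using a small union-find. It also shows why the "if and only if" version cannot exist in general: intersection is not transitive, so {1, 2} and {3, 4} get the same label through {2, 3} even though they are disjoint.

```python
from itertools import combinations


def component_labels(sets):
    """Label each set with its connected component in the intersection graph."""
    parent = list(range(len(sets)))

    def find(i):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    owner = {}  # element -> index of the first set seen that contains it
    for i, s in enumerate(sets):
        for x in s:
            if x in owner:
                # Sets i and owner[x] share x, so merge their components.
                parent[find(i)] = find(owner[x])
            else:
                owner[x] = i
    return [find(i) for i in range(len(sets))]


# Hypothetical family: the first three sets form a chain, the last is isolated.
family = [{1, 2}, {2, 3}, {3, 4}, {7, 8}]
labels = component_labels(family)

for i, j in combinations(range(len(family)), 2):
    same_label = labels[i] == labels[j]
    intersect = not family[i].isdisjoint(family[j])
    print(family[i], family[j], "same label:", same_label, "intersect:", intersect)
```

With labels like these, equal labels are necessary for two sets to intersect but not sufficient. So a differing label rules out an intersection in O(1) after preprocessing, while an equal label still needs a direct check such as `isdisjoint`.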
