Suppose I have two sets of items and a function that checks whether two items are equivalent. Equivalence is not strict equality, so one item may be equivalent to several items in the other set. I want to determine whether there is a one-to-one correspondence between the two sets such that the equivalence holds for every pair.

Is there any established/optimal solution for this problem?


This problem originally comes from determining whether two C union types are compatible, for which the standard requires such a correspondence. Things get tricky because union members can be anonymous, so an item can have several possible equivalent items. Currently I'm using a naive approach, but I wonder whether there is any established discussion or solution for it.
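
For illustration, one possible naive approach (not necessarily the one used here) is a backtracking search over pairings. This is only a sketch: the names `has_one_to_one` and `equivalent` are hypothetical, and no transitivity is assumed, so a tentative pairing is undone whenever it leads to a dead end.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def has_one_to_one(a: Sequence[T], b: Sequence[T],
                   equivalent: Callable[[T, T], bool]) -> bool:
    """Return True if each item of `a` can be paired with a distinct
    equivalent item of `b`, with no item of `b` left over."""
    if len(a) != len(b):
        return False
    used = [False] * len(b)  # used[j] is True once b[j] has a partner

    def place(i: int) -> bool:
        # Try to find partners for a[i], a[i + 1], ...
        if i == len(a):
            return True
        for j in range(len(b)):
            if not used[j] and equivalent(a[i], b[j]):
                used[j] = True
                if place(i + 1):
                    return True
                used[j] = False  # backtrack and try another partner
        return False

    return place(0)
```

The worst case is exponential in the number of items, which is why it only counts as a naive baseline.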

Hai Zhang
  • Can we assume equivalence (defined by say "≡") is transitive, i.e. if `a ≡ b` and `b ≡ c` then `a ≡ c`? – Bernhard Barker May 14 '17 at 19:52
  • @Dukeling In this C case, sadly, no. Types A, B and C may be declared with the same tag, and if A and C are complete types but B is not, the transitive property is broken. But I still wonder: if it really were transitive instead, is there any established discussion of it? – Hai Zhang May 14 '17 at 20:31
  • If it's transitive, the answers posted thus far should solve the problem just fine (I wouldn't know of any "established discussion"). If it's not transitive, I'm pretty sure it would be NP-complete (i.e. by assumption really, really slow to solve for the generic case), although I don't currently have a proof for that in mind. – Bernhard Barker May 14 '17 at 20:41

1 Answer

One solution is to implement a hash function that has two properties:

  1. items that are equivalent have the same hash value
  2. items that are not equivalent rarely have the same hash value

Note that a perfect hash function would never generate the same hash value for items that are not equivalent.

Once you have a hash function, you can sort the lists by hash value. If your hash is perfect, it's trivial to check for one-to-one correspondence. If the hash function is less than perfect, then whenever n items in one list share a hash value with n items in the other list, the code will need to fall back to the brute force O(n^2) equivalence check for those n items.
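
A minimal sketch of that idea, assuming the equivalence really is an equivalence relation (symmetric and transitive). The names `corresponds`, `key` and `equivalent` are made up for illustration, and the sketch groups items into dictionary buckets instead of sorting, which serves the same purpose of bringing equal hash values together.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, Sequence, TypeVar

T = TypeVar("T")


def corresponds(a: Sequence[T], b: Sequence[T],
                key: Callable[[T], Hashable],
                equivalent: Callable[[T, T], bool]) -> bool:
    """Check for a one-to-one correspondence, assuming `equivalent` is a true
    equivalence relation and equivalent items always share the same `key`."""
    if len(a) != len(b):
        return False

    # Group both lists by hash value.
    groups_a: Dict[Hashable, List[T]] = defaultdict(list)
    groups_b: Dict[Hashable, List[T]] = defaultdict(list)
    for x in a:
        groups_a[key(x)].append(x)
    for y in b:
        groups_b[key(y)].append(y)
    if groups_a.keys() != groups_b.keys():
        return False

    for h, xs in groups_a.items():
        ys = list(groups_b[h])
        if len(xs) != len(ys):
            return False
        # Brute-force fallback inside one bucket (O(n^2) for n items).
        # Greedy pairing suffices only because the relation is transitive.
        for x in xs:
            for j, y in enumerate(ys):
                if equivalent(x, y):
                    del ys[j]
                    break
            else:
                return False  # no partner left for x
    return True
```

For example, with case-insensitive string comparison as the equivalence and `len` as a deliberately imperfect hash, `corresponds(["Foo", "bar"], ["BAR", "foo"], len, lambda x, y: x.lower() == y.lower())` puts all four strings in one bucket and resolves them with the fallback check.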

Running time is the sum of the following tasks

  • O(N) to generate hash values
  • O(N log N) to sort the list
  • M * O(n^2) for brute force checks, taking M as the number of groups of n colliding items (only if the hash function is not perfect)

So overall running time with a perfect hash function is O(N log N), compared to a running time of O(N^2) for a brute force comparison.

user3386109
  • Such a hash function is only possible if the OP's "equivalence" concept is truly an equivalence relation; but from his/her follow-up comment, it apparently is not. Furthermore, even if it *is* an equivalence relation, it may not be tractable to compute a meaningful hash code over it; see https://cstheory.stackexchange.com/questions/10702. – ruakh May 14 '17 at 23:36