Suppose we have minhash signatures for two sets and we want to calculate the Jaccard similarity of the two sets. We have:

          S1   S2
    h1    0    1
    h2    1    2
    h3    2    0
    h4    3    3

S1 and S2 contain the same signature values, but in a different order. Is the estimated Jaccard similarity 1/8, or approximately 1?

haky_nash

1 Answer

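These are different hash functions, so h2(S1) == h1(S2) means nothing. There is no sense in comparing values produced by different hash functions. Signatures are compared row by row, and the estimate is the fraction of hash functions on which the two sets agree. So to answer directly: in the table as posted, only h4 gives the same value for both sets (3 and 3), so the estimated similarity is 1/4, neither 1/8 nor 1.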
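A minimal sketch of that row-by-row comparison, in case it helps. The function name `estimate_jaccard` is hypothetical, and the two lists are just the S1 and S2 columns from the question, ordered h1 to h4. On those values only the h4 row agrees, so the sketch reports 1 match out of 4.

```python
def estimate_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of hash functions
    (positions) on which two minhash signatures agree."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must come from the same hash functions")
    if not sig_a:
        raise ValueError("signatures must be non-empty")
    # Compare position i only with position i: same hash function on both sides.
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


# Columns from the question, in the order h1, h2, h3, h4.
s1 = [0, 1, 2, 3]
s2 = [1, 2, 0, 3]

estimate = estimate_jaccard(s1, s2)

# The mistaken approach: treating the signatures as unordered sets of values.
set_overlap = len(set(s1) & set(s2)) / len(set(s1) | set(s2))

print(estimate)     # 0.25
print(set_overlap)  # 1.0
```

The second number is where the "approximately 1" intuition comes from: it ignores which hash function produced each value, so it does not estimate the Jaccard similarity of the underlying sets.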

lejlot