So far I have a list of how many times each hash value is repeated. However, when I sum all the repeated values together it ends up just being the total number of values in the dataset. So how would I count the number of collisions in the data set?

I tried summing all the repeated values together.

Marcus Müller
  • Think about this: what is the definition of a collision? For which values of individual hash occurrence is there *not* a collision? This is easier than you probably think. – Marcus Müller Oct 28 '22 at 00:24
  • _However, when I sum all the repeated values together_ I don't understand the use of sum here. If I have two colliding values, say 45 and 45, the sum is 90, but there are still only two collisions. (Or perhaps only one collision, depending on how you count.) – John Gordon Oct 28 '22 at 00:27
  • For example, I know there are 23 values at hash key 10. Would this mean there are 23 collisions at hash key 10? – Rajesh Kumar Oct 28 '22 at 00:35
  • I would say 22 collisions, since it feels wrong to count the very first one as a collision. But then I don't do much work in this area, so maybe it's an accepted convention to count them all? Not sure. – John Gordon Oct 28 '22 at 00:46
  • So let's say I have a dictionary of hash keys and the number of values mapped to each key, like so: {10: 90, 11: 77, 12: 59, 13: 51, 9: 45, 14: 38, 16: 33, 15: 32, 18: 25, 19: 25, 8: 24, 17: 23, 7: 14, 20: 12, 22: 10, 21: 9, 23: 3, 6: 2, 30: 1, 24: 1, 5: 1}. How would I calculate the total number of collisions from this dictionary? – Rajesh Kumar Oct 28 '22 at 00:50
  • Assuming you want to count `10: 90` as 90 collisions, it would simply be the sum of all the values in that dict. i.e. `sum(mydict.values())` – John Gordon Oct 28 '22 at 00:53
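
The comments leave the counting convention open, so here is a minimal sketch of two possibilities, using the dictionary posted above. The function names `extra_entries` and `entries_in_shared_buckets` are made up for illustration. The first follows the "23 values means 22 collisions" reading; the second counts every value that shares its hash key with at least one other value. Neither is claimed to be the accepted definition.

```python
# Bucket sizes copied from the comment above: hash key -> number of values.
counts = {10: 90, 11: 77, 12: 59, 13: 51, 9: 45, 14: 38, 16: 33, 15: 32,
          18: 25, 19: 25, 8: 24, 17: 23, 7: 14, 20: 12, 22: 10, 21: 9,
          23: 3, 6: 2, 30: 1, 24: 1, 5: 1}


def extra_entries(bucket_sizes):
    """Count every value after the first in each bucket (n values -> n - 1)."""
    return sum(n - 1 for n in bucket_sizes.values() if n > 1)


def entries_in_shared_buckets(bucket_sizes):
    """Count all values that share their hash key with at least one other value."""
    return sum(n for n in bucket_sizes.values() if n > 1)


print(sum(counts.values()))               # 575, the size of the data set
print(extra_entries(counts))              # 554
print(entries_in_shared_buckets(counts))  # 572
```

A plain `sum(counts.values())` also includes the three keys that hold a single value, which is why it just returns the size of the data set. Under either convention above, buckets of size 1 contribute nothing.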

0 Answers