1

Is

public override int GetHashCode()
{
    return Word.GetHashCode();
}

Really the same to

public override int GetHashCode()
{
    return (int) Word.GetHashCode() * 7;
}

regarding the uniqueness?

Word is of type String

EDIT: I forgot to say, which one is better to implement in the program, Option 1 or 2?

Hendrik Breezy
  • 128
  • 1
  • 12
  • 3
    Since hash codes are neither required nor could be be unique, the answer to your question is "yes", in the sense that both implementations produce non-unique hash codes. – Sergey Kalinichenko Sep 29 '16 at 16:33
  • 5
    Any collisions with `Word.GetHashCode()` are still going to collide after multiplying by 7. Also the cast is pointless. – juharr Sep 29 '16 at 16:34
  • Extending juharr's comment, if `World.GetHashCode()` produces 6 for worldA and worldB, then `World.GetHashCode() * 7` produces 42 for both worldA and worldB... – D. Ben Knoble Sep 29 '16 at 16:37
  • What do you mean exactly? if you get two different unique results for two Words in the first you will get two different unique results from the second. Likewise if you get two identical results from two Words in the first then the second will also produce two identical results. This seems somewhat obvious from looking at the code though so it feels like there is something more to you question than this that I think could do with being elaborated on. – Chris Sep 29 '16 at 16:42
  • Is `(3 * 7) == (3 * 7)` really the same as `3 == 3`? – 15ee8f99-57ff-4f92-890c-b56153 Sep 29 '16 at 17:00
  • @Chris It's pretty trivial to prove that identical results will stay identical. It's slighly less trivial to prove whether or not different results have an increased chance of collision with the second approach (if you also make the operation `unchecked`). See Das' answer for details. – Servy Sep 29 '16 at 17:01
  • @Servy: True I guess. I have a maths degree so I forget that what is obvious for me may not be obvious for others. :) – Chris Sep 29 '16 at 19:02
  • @Chris No, you said it's obvious that equal results will stay equal. That's not the hard part. The hard part is determining if different values stay different. It's actually a non-trivial proof. – Servy Sep 29 '16 at 20:11

2 Answers2

2

It is clear that any collisions in the Word.GetHashCode() implementation would result in a collision in (int) Word.GetHashCode() * 7 implementation, because multiplying identical numbers produce identical results.

A more interesting question is if non-colliding hash codes from the first implementation would result in collisions within the second implementation. It turns out that the answer is "no", because the range of int and 7 are mutually prime numbers. Hence, multiplication produces a unique mapping after dropping the overflow.

You can run a small test with two-byte hash codes to see what happens:

const int Max = 1<<16;
var count = new int[Max];
for (int i = 0 ; i != Max ; i++) {
    count[(i * 7) & (Max-1)]++;
}
var notOne = 0;
for (int i = 0 ; i != Max ; i++) {
    if (count[i] != 1) {
        notOne++;
    }
}
Console.WriteLine("Count of duplicate mappings found: {0}", notOne);

This program maps i, the hash code value, to 7 * i modulo 216, and verifies that each number from the range is produced exactly once.

Count of duplicate mappings found: 0

Demo.

If you replace 7 with an even number, the result is going to be very different. Now multiple hash codes from the original set would be mapped to a single hash code in the target set. You can understand this intuitively if you recall that multiplying by an even number always makes the least significant bit to be zero. Hence, some information is lost, depending on how many times the even number can be divided by two.

which one is better?

There is no difference.

Note: The above assumes that you are ignoring integer overflow.

Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523
0

Since you aren't running the code in an unchecked context, then the latter will throw an exception any time there is overflow, which is reasonably likely (6/7 of the hash range will throw, so a generally evenly distributed hash code has a ~6/7 chance of throwing an exception).

Servy
  • 202,030
  • 26
  • 332
  • 449
  • Looking at https://msdn.microsoft.com/en-gb/library/a569z7k8.aspx it says "Expressions that contain non-constant terms are unchecked by default at compile time and run time" so does that not mean it would be unchecked if not explicitly checked? I'll admit I've not really played around with anything where I've needed to worry about checked/unchecked so I may quite easily be wrong... – Chris Sep 29 '16 at 18:46
  • 1
    @Chris C# compiler and VS default to unchecked. – Cory Nelson Sep 29 '16 at 18:51