
As far as I know (or thought I knew), a Dictionary is implemented as a hash table, where the hash code is used to identify a bucket, which is then searched for the key.

In my opinion, this implies that the hash code of an object remains stable during a single run of my program (loosely speaking).

Now, here

http://msdn.microsoft.com/en-us/library/system.object.gethashcode.aspx

I read

"A hash code is intended for efficient insertion and lookup in collections that are based on a hash table. A hash code is not a permanent value. For this reason: [...] Do not use the hash code as the key to retrieve an object from a keyed collection."

Can anybody explain to me what that is supposed to mean?

JohnB
  • Well, this is not an answer so much as it is an observation, but `obj1.GetHashCode() == obj2.GetHashCode()` does not imply that `obj1.Equals(obj2)` (unless you implemented your `Equals` method to compare hash codes...). – Tung Jun 03 '14 at 23:19
  • @Blam Perhaps I did not explain myself well enough. I'm not suggesting that HashCode is equivalent to Equals. I'm saying the opposite -- that they are different unless someone implemented their Equals to compare hashcodes. – Tung Jun 03 '14 at 23:23
  • @Tung I fail to see how your observation adds any value. – paparazzo Jun 03 '14 at 23:30
  • @Blam Maybe because his "observation" is what the documentation is trying to say? – Casey Jun 04 '14 at 00:59

5 Answers


When the documentation talks about a "keyed collection", they do not mean the same thing as a Dictionary. For insight into what it actually means, note that there is actually a KeyedCollection base class: http://msdn.microsoft.com/en-us/library/ms132438%28v=vs.110%29.aspx

The key paragraph is this:

Unlike dictionaries, an element of KeyedCollection<TKey, TItem> is not a key/value pair; instead, the entire element is the value and the key is embedded within the value. For example, an element of a collection derived from KeyedCollection<String,String> (KeyedCollection(Of String, String) in Visual Basic) might be "John Doe Jr." where the value is "John Doe Jr." and the key is "Doe"; or a collection of employee records containing integer keys could be derived from KeyedCollection<int,Employee>. The abstract GetKeyForItem method extracts the key from the element.

So a keyed collection is a collection of objects along with a way of extracting a key from each one. Conceptually this is similar to a table in a database, where you can define a primary key which is a subset of the entire record.

So with this in mind, the answer becomes relatively clear. As others have said, equality of hash codes does not imply equality of the objects. But keys in a keyed collection, like primary keys in a database table, should uniquely identify the exact object. So the possibility of hash collisions makes hash codes inappropriate for this purpose.

Also, even in a Dictionary, there's an important difference between using objects as keys and using the same objects' hash codes as the key. If two objects have a hash collision but do not compare as equal, the Dictionary will still store them as two separate keys. That's why overriding GetHashCode to just return 1 is always valid (though obviously not good for performance). As a demonstration:

var dict = new Dictionary<MyClass, string>();
var hashDict = new Dictionary<int, string>();

dict[myObj1] = "One";
hashDict[myObj1.GetHashCode()] = "One";
dict[myObj2] = "Two";
hashDict[myObj2.GetHashCode()] = "Two";

Console.Out.WriteLine(dict[myObj1]);  //Outputs "One"
Console.Out.WriteLine(hashDict[myObj1.GetHashCode()]); //Outputs "Two"

(myObj1 and myObj2 are instances of MyClass which have the same hash code but do not compare as equal)
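For anyone who wants to run the demonstration, here is a minimal self-contained sketch. The `MyClass` definition is an assumption made up for illustration (a `Name` property compared in `Equals`, and a `GetHashCode` that always returns 1 so that every instance collides); the `HashKeyDemo` wrapper is likewise hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for MyClass: every instance reports the same hash
// code, but Equals still compares the Name property.
public sealed class MyClass
{
    public string Name { get; }

    public MyClass(string name) { Name = name; }

    public override bool Equals(object obj)
    {
        return obj is MyClass other && Name == other.Name;
    }

    // Legal but slow: all instances land in the same bucket.
    public override int GetHashCode()
    {
        return 1;
    }
}

public static class HashKeyDemo
{
    // Returns the value read back through the object key, then through the hash key.
    public static string[] Run()
    {
        var myObj1 = new MyClass("first");
        var myObj2 = new MyClass("second");

        var dict = new Dictionary<MyClass, string>();
        var hashDict = new Dictionary<int, string>();

        dict[myObj1] = "One";
        hashDict[myObj1.GetHashCode()] = "One";
        dict[myObj2] = "Two";                   // separate key: Equals says they differ
        hashDict[myObj2.GetHashCode()] = "Two"; // same int key: overwrites "One"

        return new[] { dict[myObj1], hashDict[myObj1.GetHashCode()] };
    }

    public static void Main()
    {
        string[] result = Run();
        Console.WriteLine(result[0]); // One
        Console.WriteLine(result[1]); // Two
    }
}
```

The Dictionary keyed by object falls back to `Equals` after the hash codes match, so it keeps both entries; the Dictionary keyed by the raw hash code has nothing left to distinguish them.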

Ben Aaronson
  • I undeleted my answer to show that this is what I thought too. But I don't think it is correct. If they meant KeyedCollection they would have said KeyedCollection and not keyed collection. HashSet and Dictionary are each examples of keyed collections. What I think they are saying is that, in general, the key of a keyed collection should not be a hash. A key must be unique. A hash is not guaranteed to be unique. – paparazzo Jun 04 '14 at 02:51
  • @Blam I would hope that MS would keep their terminology consistent. The way I understand it "keyed collection" is a concept of a particular type of collection, and `KeyedCollection` is a class implementing that concept. You can see them using the phrase "keyed collection" here, for example: http://msdn.microsoft.com/en-us/library/dn169389%28v=vs.110%29.aspx http://msdn.microsoft.com/en-us/library/5z658b67%28v=vs.110%29.aspx . In both, it's being used specifically to refer to collections of items with embedded keys, rather than e.g. Dictionaries or HashSets. – Ben Aaronson Jun 04 '14 at 08:31
  • Yes, I should have stuck by my answer. – paparazzo Jun 04 '14 at 16:39

They might be talking about KeyedCollection.
In that case there is no purpose in using a hash as the key.
The key is supposed to be a real value used by the class.


Like in the example

public class SimpleOrder : KeyedCollection<int, OrderItem>
{
    // The parameterless constructor of the base class creates a  
    // KeyedCollection with an internal dictionary. For this code  
    // example, no other constructors are exposed. 
    // 
    public SimpleOrder() : base() {}

    // This is the only method that absolutely must be overridden, 
    // because without it the KeyedCollection cannot extract the 
    // keys from the items. The input parameter type is the  
    // second generic type argument, in this case OrderItem, and  
    // the return value type is the first generic type argument, 
    // in this case int. 
    // 
    protected override int GetKeyForItem(OrderItem item)
    {
        // In this example, the key is the part number. 
        return item.PartNumber;
    }
}

PartNumber is a property of OrderItem (that happens to be an int).
You should never return the hash code of the OrderItem from GetKeyForItem.
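To show how such a collection is then used, here is a small sketch built around the class above. The `OrderItem` type, its `Description` property, the part numbers, and the `KeyedCollectionDemo` wrapper are all assumptions for illustration; only `PartNumber` comes from the example.

```csharp
using System;
using System.Collections.ObjectModel;

// Hypothetical item type: only PartNumber is taken from the example above.
public class OrderItem
{
    public int PartNumber { get; }
    public string Description { get; }

    public OrderItem(int partNumber, string description)
    {
        PartNumber = partNumber;
        Description = description;
    }
}

public class SimpleOrder : KeyedCollection<int, OrderItem>
{
    // The key is real data carried by the item, not its hash code.
    protected override int GetKeyForItem(OrderItem item)
    {
        return item.PartNumber;
    }
}

public static class KeyedCollectionDemo
{
    // Looks up an item by its embedded key and returns its description.
    public static string Lookup(int partNumber)
    {
        var order = new SimpleOrder();
        order.Add(new OrderItem(110072, "Widget"));
        order.Add(new OrderItem(110073, "Gadget"));

        // Contains(TKey) tests for a key; the indexer here takes the key,
        // not a positional index, because TKey is int.
        return order.Contains(partNumber)
            ? order[partNumber].Description
            : "(not found)";
    }

    public static void Main()
    {
        Console.WriteLine(Lookup(110073)); // Gadget
    }
}
```

Because the part number identifies exactly one item, the lookup is unambiguous, which is the property a hash code cannot promise.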

paparazzo

I think what that particular item is saying is not to use the hash code as a key. For example, don't have a Dictionary<int, MyObject>, where the integer key is a hash code.

The primary reason for this is that two different items could have identical hash codes.

The safe way to use hash codes is ... not to use them directly. That is, it's very rare that you would write code that calls GetHashCode. If your code doesn't call GetHashCode, then your code can't save the values and you can't get into trouble depending on something you shouldn't depend on.

Jim Mischel

This explains it:

the .NET Framework does not guarantee the default implementation of the GetHashCode method, and the value this method returns may differ between .NET Framework versions and platforms, such as 32-bit and 64-bit platforms.

Every time you run your program in the same environment, you may well get the same hash codes, but if you run the same program on a different platform or a different version of the .NET Framework, there is no guarantee that the hash codes will be the same.

LVBen

The documentation means that the hash code is not guaranteed (or even likely) to be the same between successive runs of your program. Therefore, if you try to use it as a key to an external data source such as a database or key-value store, this will not be reliable. However, using it as the base for an index into a table of buckets (in memory, as in a dictionary) is exactly what it's designed for.

Kirk Woll
  • "the base for an index into a table of buckets" But that's not the same as using it as a key, is it? – Ben Aaronson Jun 04 '14 at 01:09
  • @BenAaronson, right, it's not the same as using it as an external key. I thought I made that clear, so I'm not quite following? – Kirk Woll Jun 04 '14 at 01:41
  • I don't mean a key to an external collection, I mean it's not the same as using it as a key to, say, a dictionary in the program. – Ben Aaronson Jun 04 '14 at 01:44