
It is well known that Python's built-in mutable types, such as list, cannot be used as dictionary keys.

However, if you were using, say, C++, regular maps let you use vectors and arrays as map keys because regular maps are implemented as trees.

C++ also lets you use an array as the key of an unordered map, which is closer in spirit to a Python dictionary because it hashes the keys, as long as you provide a hash function for types it doesn't know how to hash.

So I wanted to know whether Python would let you do the same as long as you provide a __hash__ method.

In [1]: b = {}

In [2]: class hlist(list):
   ...:     def __hash__(self):
   ...:         temp = []
   ...:         for item in self:
   ...:             print(item)
   ...:             temp.append(item)
   ...:         return hash(tuple(temp))
   ...:

In [3]: a = hlist([1,2,3,4])

In [4]: c = hlist([1,2,3,4])

In [5]: b[a] = "car"
1
2
3
4

In [6]: b[c]
1
2
3
4
Out[6]: 'car'

In [7]: a.append(5)

In [8]: b[c]
1
2
3
4
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-8-013e994efe63> in <module>()
----> 1 b[c]

KeyError: [1, 2, 3, 4]

I added the print inside the __hash__ to figure out what is being hashed and when the function is invoked.

Right before the KeyError is thrown, the contents of c are printed, indicating that c was just hashed. Shouldn't it then just check whether this hash value is the hash value of one of the keys? Why does it throw a KeyError?

If it is also hashing all the keys one by one to figure out whether one of them has the same hash value as the query, shouldn't the code below work?

In [11]: b[hlist([1,2,3,4,5])]
1
2
3
4
5
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-11-09593553a69b> in <module>()
----> 1 b[hlist([1,2,3,4,5])]

KeyError: [1, 2, 3, 4, 5]

If you were determined to have a mutable key with a semi-robust hashing function, similar to C++, is it possible?

martineau
Srini
  • C++ has `const`. The keys of a C++ map are `const`. – user2357112 Mar 27 '18 at 23:09
  • [`__hash__()` isn't enough.](https://docs.python.org/3/reference/datamodel.html#object.__hash__) – Ignacio Vazquez-Abrams Mar 27 '18 at 23:11
  • @user2357112 that makes sense! @IgnacioVazquez-Abrams aren't the other things like `__eq__` defined as part of being a subclass of `list` ? – Srini Mar 27 '18 at 23:13
  • Yes, `__eq__` is defined for your `hlist` class. But `c` doesn't compare equal to `a`, so that's why you get a KeyError. (Keep in mind that the hlist stored in the dict is `a`, so every key you use is checked for equality against `a`.) – Aran-Fey Mar 27 '18 at 23:18
  • Does [this](https://stackoverflow.com/questions/327311/how-are-pythons-built-in-dictionaries-implemented) answer your question? I'm not really sure what you mean by "is it possible?". It's pretty clear that you _can_ have mutable keys with an unstable hash, and that it's a bad idea, isn't it? – Aran-Fey Mar 27 '18 at 23:23
  • Yes, I think I understand now, thanks :) – Srini Mar 27 '18 at 23:25

1 Answer


How are dicts stored in memory? (simplified version)

  • for each key in the dict, its hash is calculated, and the key and value are then stored together in a place determined by that hash
  • if multiple keys have the same hash (or have hashes pointing to the same storage destination), you can think of that place as holding a list of key-value pairs (CPython actually probes other slots instead of keeping a list, but the effect on lookups is the same)

How are dict values read from memory? (simplified version)

  • the hash of the sought key is calculated, and the location in memory is derived from that hash
  • key-value pairs are read from that location one by one, and each stored key is compared to the sought key using the == operator
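
The two procedures above can be sketched as a toy hash table. This is only an illustration of the simplified model, not CPython's actual layout; `ToyDict`, `n_buckets` and `_bucket` are made-up names for this sketch.

```python
class ToyDict:
    """Toy chained hash table following the simplified description above."""

    def __init__(self, n_buckets=8):
        # One list of [hash, key, value] entries per storage place.
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, h):
        # The storage place is derived from the hash.
        return self.buckets[h % len(self.buckets)]

    def __setitem__(self, key, value):
        h = hash(key)  # computed once, when the key is written
        bucket = self._bucket(h)
        for entry in bucket:
            if entry[0] == h and entry[1] == key:
                entry[2] = value  # key already present: overwrite
                return
        bucket.append([h, key, value])

    def __getitem__(self, key):
        h = hash(key)
        # Read the pairs in that place one by one and compare with ==.
        for stored_hash, stored_key, value in self._bucket(h):
            if stored_hash == h and stored_key == key:
                return value
        raise KeyError(key)


d = ToyDict()
d["a"] = 1
d["a"] = 2          # same key, value is replaced
d[(1, 2)] = "pair"
```

Note that the hash is stored next to the key and never recomputed, which is the detail that matters for mutable keys.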

Conclusion

For a key (call it key1) to be found in the dict, the dict must contain a key (call it key2) for which hash(key1) == hash(key2) and key1 == key2.
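
A small sketch of that rule with a real dict, using a hypothetical `Key` class (not from the question) whose hash is passed in explicitly, so each condition can be broken on its own:

```python
class Key:
    """Hypothetical key type: equality by name, hash chosen by the caller."""

    def __init__(self, name, forced_hash):
        self.name = name
        self.forced_hash = forced_hash

    def __hash__(self):
        return self.forced_hash

    def __eq__(self, other):
        return isinstance(other, Key) and self.name == other.name


d = {Key("x", 1): "value"}

same_hash_and_equal = Key("x", 1) in d   # both conditions hold
same_hash_not_equal = Key("y", 1) in d   # hash matches, == fails
equal_but_other_hash = Key("x", 2) in d  # == would match, hash does not

print(same_hash_and_equal, same_hash_not_equal, equal_but_other_hash)
```

Only the first lookup succeeds; either condition failing on its own is enough to produce a miss.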

So why are mutable keys a bad idea?

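Because hash(key) is calculated when the key is written into the dict, and it reflects the value of the key at that point in time. If the key is mutable and you mutate it while it is in the dict, the dict will not recalculate hash(key). A lookup by the new contents then no longer matches the stored hash, and a lookup by the old contents matches the hash but fails the == comparison, so it is no longer possible to find the key.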
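
Here is a sketch of that failure with the question's `hlist` (without the print), followed by one possible workaround: keying by an immutable snapshot such as a tuple. The names `safe` and `items` are only for illustration.

```python
class hlist(list):
    """The question's list subclass, hashed by its current contents."""

    def __hash__(self):
        return hash(tuple(self))


b = {}
a = hlist([1, 2, 3, 4])
b[a] = "car"
a.append(5)  # mutated while stored; the dict keeps the old hash

# Old contents: the hash matches the stored one, but == against a fails.
old_contents_found = hlist([1, 2, 3, 4]) in b
# New contents: == against a would succeed, but the stored hash differs.
new_contents_found = hlist([1, 2, 3, 4, 5]) in b
# The entry is still in the dict, just unreachable by equal keys.
entry_still_there = len(b) == 1

# Workaround sketch: use an immutable snapshot of the list as the key.
safe = {}
items = [1, 2, 3, 4]
safe[tuple(items)] = "car"
items.append(5)  # does not affect the tuple already used as the key
snapshot_found = (1, 2, 3, 4) in safe
```

The snapshot approach gives up the idea of a key that follows later mutations, which is the same trade-off `const` keys make in C++.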

zvone