Dict or WeakKeyDictionary with identity equality -- wrap unhashable objects to check identity

Question

I want to use objects as dict keys that are either unhashable, or hashable but with an __eq__/__hash__ that I want to override with the default object.__eq__/object.__hash__, i.e. a == b iff a is b.

(These objects could e.g. be numpy.ndarray, torch.Tensor, or other things, but I want to ask in general now.)

E.g.:

x = numpy.array([2,3,4])
d = {x: 5}

That would give the exception TypeError: unhashable type: 'numpy.ndarray'.

Or:

x = torch.Tensor([2,3,4])
d = weakref.WeakKeyDictionary()
d[x] = 5
print(d[x])

That would give the exception RuntimeError: Boolean value of Tensor with more than one value is ambiguous. (Which is quite misleading or unexpected. But I assume this is because it will do bool(__eq__(...)) internally. However, strangely, there is no such exception when I use a normal dict here. Why?)
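
A likely explanation, as far as I can tell from CPython's behavior: a plain dict checks key identity before falling back to ==, so looking up the very same object never calls __eq__. WeakKeyDictionary stores weakref.ref objects as keys and builds a fresh ref for each lookup; two distinct refs compare by calling == on their referents, and the dict then needs the truth value of that result. The sketch below reproduces this without torch. Elementwise and Ambiguous are hypothetical stand-ins, not real library classes.

```python
import weakref


class Ambiguous:
    """Result object whose truth value is undefined, like a multi-element tensor."""

    def __bool__(self):
        raise RuntimeError("ambiguous truth value")


class Elementwise:
    """Hypothetical stand-in for a tensor: == returns a non-boolean result."""

    def __eq__(self, other):
        return Ambiguous()

    # Identity-based hash, so the object is usable as a dict key.
    __hash__ = object.__hash__


x = Elementwise()

# Plain dict: the lookup key is the stored key itself, so __eq__ is never called.
plain = {x: 5}
plain_value = plain[x]

# WeakKeyDictionary: the stored ref and the lookup ref are different objects,
# so they are compared via x == x, and bool() of that result raises.
weak = weakref.WeakKeyDictionary()
weak[x] = 5
try:
    weak[x]
    weak_error = None
except RuntimeError as exc:
    weak_error = str(exc)
```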

I could write a custom object wrapper to solve this, like:

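class WrappedObject:
  def __init__(self, orig):
    self.orig = orig

  def __eq__(self, other):
    # Compare wrapped objects by identity only.
    if not isinstance(other, WrappedObject):
      return NotImplemented
    return self.orig is other.orig

  def __ne__(self, other):
    if not isinstance(other, WrappedObject):
      return NotImplemented
    return self.orig is not other.orig

  def __hash__(self):
    # Identity-based hash; works even if type(orig) is unhashable.
    return object.__hash__(self.orig)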

That would solve the first case. Now I can write:

x = numpy.array([2,3,4])
d = {WrappedObject(x): 5}
print(d[WrappedObject(x)])  # 5

Is there something like WrappedObject in the standard library?

The id function has similar behavior, although it just returns an int, and doesn't have a reference back to the original object. So for this example, I could write:

x = numpy.array([2,3,4])
d = {id(x): 5}
print(d[id(x)])  # 5

Note that this might be problematic! If x gets freed later on, another object could be created with the same id, because an id is only guaranteed to be unique during the object's lifetime, not afterwards. (Related question here, although the accepted answer has exactly this problem.)

This problem would not happen with WrappedObject, as the reference is always kept alive.

Is there something that wraps dict and automatically uses something like WrappedObject under the hood? Specifically, I want all keys to be compared by identity only.
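
For illustration, such a wrapper might look like the following minimal sketch. IdentityDict is a hypothetical name, not an existing std lib class. It keys an internal dict on id(key) and stores the key next to the value, so the key stays alive and its id cannot be reused while the entry exists.

```python
from collections.abc import MutableMapping


class IdentityDict(MutableMapping):
    """Mapping that compares keys by identity (a is b), never by __eq__/__hash__."""

    def __init__(self):
        # id(key) -> (key, value); holding the key keeps its id valid.
        self._items = {}

    def __getitem__(self, key):
        try:
            return self._items[id(key)][1]
        except KeyError:
            raise KeyError(key) from None

    def __setitem__(self, key, value):
        self._items[id(key)] = (key, value)

    def __delitem__(self, key):
        del self._items[id(key)]

    def __iter__(self):
        return (key for key, _ in self._items.values())

    def __len__(self):
        return len(self._items)


key_a = [2, 3, 4]  # lists are unhashable, like numpy arrays
key_b = [2, 3, 4]  # equal to key_a, but a different object
d = IdentityDict()
d[key_a] = 5
d[key_b] = 6
```

Here d[key_a] and d[key_b] are separate entries even though key_a == key_b. Anything that hashes the keys directly, such as dict(d), would still fail for unhashable keys.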

Now consider my second case, specifically WeakKeyDictionary. I cannot use WrappedObject, because the WrappedObject itself is not kept alive, so all keys would immediately vanish:

x = torch.Tensor([2,3,4])
d = weakref.WeakKeyDictionary()
d[WrappedObject(x)] = 5
print(list(d.items()))  # prints []

The only real solution I currently see is to reimplement WeakKeyDictionary myself, using something like WrappedRefObject. Is there a better solution? Does this already exist in the standard library or elsewhere?

Related: https://github.com/python/cpython/issues/88306 (comment by Richard Hansen, Nov 02 '22 at 21:17)

Richard Hansen · Answer 1 · 2022-11-04T01:24:18.373

It's probably not too difficult to create a wrapper class using __getattr__, though it might be more robust to create your own mapping class like the following (untested):

import collections.abc
import weakref

class WeakKeyIdMap(collections.abc.MutableMapping):
    """Like weakref.WeakKeyDictionary except identity is used, not hash."""

    def __init__(self):
        self._values = {}
        self._keyrefs = {}

    def pop_id(self, idk):
        # Tolerate entries that were already removed.
        self._keyrefs.pop(idk, None)
        return self._values.pop(idk, None)

    def __getitem__(self, key):
        return self._values[id(key)]

    def __setitem__(self, key, value):
        idk = id(key)
        if idk not in self._keyrefs:
            # The weakref callback is called with the ref object as its only argument.
            self._keyrefs[idk] = weakref.ref(key, lambda _ref: self.pop_id(idk))
        self._values[idk] = value

    def __delitem__(self, key):
        idk = id(key)
        self._keyrefs.pop(idk)
        del self._values[idk]

    def __iter__(self):
        for idk in self._values:  # Use self._values to preserve iteration order.
            key = self._keyrefs[idk]()
            if key is not None:  # Skip keys that have already been collected.
                yield key

    def __len__(self):
        return len(self._values)
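
One detail worth checking when adapting a class like this: a callback passed to weakref.ref is called with one positional argument, the weak reference object itself, once the referent is about to be finalized. The small sketch below shows this; Key and calls are hypothetical names used only for illustration.

```python
import gc
import weakref


class Key:
    """Plain class; its instances support weak references."""


calls = []
key = Key()

# The callback receives the weak reference object as its single argument.
ref = weakref.ref(key, lambda r: calls.append(r))

del key        # drop the only strong reference
gc.collect()   # not needed on CPython, but harmless on other implementations
```

After this runs, calls holds exactly one item, the ref object, and ref() returns None. A callback that accepts no arguments would raise a TypeError instead, which the interpreter reports without propagating, so the cleanup would silently not happen.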
