Preface

I understand that dicts and sets can only be created or updated with hashable objects because of how they are implemented, so when code like this fails

>>> {{}}  # empty dict of empty dict
Traceback (most recent call last):
  File "<input>", line 1, in <module>
TypeError: unhashable type: 'dict'

that's fine, and I've seen plenty of messages like it.

But if I want to check whether some unhashable object is in a set or dict

>>> {} in {}  # empty dict not in empty dict

I get an error as well

Traceback (most recent call last):
  File "<input>", line 1, in <module>
TypeError: unhashable type: 'dict'

Problem

What is the rationale behind this behavior? I understand that lookup and update may be logically connected (as in the dict.setdefault method), but shouldn't it fail at the modification step rather than at lookup? Maybe I have some hashable "special" values that I handle one way, and other (possibly unhashable) values that I handle another way:

SPECIAL_CASES = frozenset(range(10)) | frozenset(range(100, 200))
...
def process_json(obj):
    if obj in SPECIAL_CASES:
        ...  # handle special cases
    else:
        ...  # do something else

so with the given lookup behavior I'm forced to use one of these options:

  • LBYL way: check whether obj is hashable and only then check whether it is one of SPECIAL_CASES (not great, since it depends on the structure of SPECIAL_CASES and on the restrictions of the lookup mechanism, but it can be encapsulated in a separate predicate),
  • EAFP way: use some sort of utility for "safe lookup" like

    def safe_contains(dict_or_set, obj):
        try:
            return obj in dict_or_set
        except TypeError:
            # obj is unhashable, so it cannot be a key/element
            return False

  • use list/tuple for SPECIAL_CASES (which is not O(1) on lookups).
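
For illustration, the LBYL option might look like the sketch below. The is_special name is hypothetical, and the check relies on collections.abc.Hashable, which only inspects the type: a tuple that holds a list passes the isinstance check but still raises TypeError on lookup.

```python
from collections.abc import Hashable

SPECIAL_CASES = frozenset(range(10)) | frozenset(range(100, 200))


def is_special(obj):
    """Hypothetical LBYL predicate: consult the set only for hashable-looking objects."""
    # isinstance(obj, Hashable) only checks that type(obj).__hash__ is not None,
    # so nested cases such as ([],) slip through and still fail inside hash().
    return isinstance(obj, Hashable) and obj in SPECIAL_CASES


results = [is_special(5), is_special(50), is_special({}), is_special([1, 2])]

# The caveat in action: the tuple type is hashable, this particular tuple is not.
try:
    is_special(([],))
except TypeError:
    nested_case_raised = True
else:
    nested_case_raised = False
```

Because of that caveat, a predicate like this is only as reliable as the shapes of the objects it receives.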

Or am I missing something trivial?

Azat Ibrakov
  • Possible duplicate of [Asking "is hashable" about a Python value](https://stackoverflow.com/questions/3460650/asking-is-hashable-about-a-python-value) – quamrana Feb 22 '19 at 15:55
  • @quamrana: this post is not about "how to find if object is hashable", I know how to do that, it's about dict/set lookup mechanism quirks – Azat Ibrakov Feb 22 '19 at 15:57
  • It sounds like when python sees: `obj in dict_or_set`, then the first thing it tries is `hash(obj)`. – quamrana Feb 22 '19 at 16:02

2 Answers

As you have no doubt realized, sets and dicts are very similar in their inner workings. The basic concept is that you have key-value pairs (or just keys, in a set), and a key's hash must never change while it is stored. If a key could be mutated in a way that changes its hash, the hash would lose its meaning as an identifier of the underlying data. If you can't tell whether an object is already present, a set of unique keys loses its key property of uniqueness. This is why the built-in mutable types (list, dict, set) are unhashable and are disallowed in sets and as the keys of a dict.

With your example, {} in {}  # empty dict not in empty dict, I think you have a slight misunderstanding: dict.__contains__ only checks the keys of the dict, not the values. Since you can never have a dict as a key (because it's mutable, and therefore unhashable), the lookup is invalid.
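
To make the point about changing hashes concrete, here is a minimal sketch with a hypothetical MutableKey class whose hash depends on mutable state. It is an illustration under CPython's dict behavior, not something taken from the question.

```python
class MutableKey:
    """Hypothetical key type whose hash depends on a mutable attribute."""

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        return isinstance(other, MutableKey) and self.value == other.value


key = MutableKey(1)
table = {key: "payload"}

found_before = key in table  # the key is found under its original hash

key.value = 2                # mutate the key after it was stored
found_after = key in table   # the lookup now probes with a different hash

# The entry is still stored, it just cannot be reached by lookup any more.
still_stored = len(table) == 1
```

After the mutation the dict still holds one entry, but a lookup with the same object no longer finds it, which is the kind of inconsistency the hashability requirement is meant to prevent.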

Aaron
  • I know everything that you've written, that `key in dct` checks if the `key` is in `dct`, not in values, my question is why it raises `TypeError`, not `False` (since unhashable objects could not be in `dict` keys/`set` elements) and how to deal with that in general case – Azat Ibrakov Feb 22 '19 at 16:07
  • @AzatIbrakov given that logic, should this be valid? `a = [1,2,3]; print(a['key'])`. The issue I see here is one that's totally moot in a statically typed language, and a common one in dynamic languages. There are some operations that require the type to be known. I see this as a normal part of programming, and not an error with the language. – Aaron Feb 22 '19 at 16:10
  • @Aaron: moot not mute. – quamrana Feb 22 '19 at 16:12
  • @quamrana ty.. haven't digested my morning coffee yet :P – Aaron Feb 22 '19 at 16:14
  • `__getitem__` and `__contains__` are totally different methods, so I don't think your example is somehow relevant – Azat Ibrakov Feb 22 '19 at 16:14
  • @AzatIbrakov I disagree, as `dict.__contains__` could be written quite accurately as `def contains(dict, key): try: dict[key]; except KeyError: return False; else: return True` By the principle of only catching the specific exception you need, this won't catch the error raised from not being able to get the hash of `key` – Aaron Feb 22 '19 at 16:23
  • This type of question has actually been asked many times before [example](https://stackoverflow.com/questions/175532/should-a-retrieval-method-return-null-or-throw-an-exception-when-it-cant-prod), and there are many opinions on the matter. Python chooses to throw an exception which favors the "don't fail silently" principle, while Lua will return `nil` for things of this nature (Lua throws very few exceptions in fact beyond syntax errors). – Aaron Feb 22 '19 at 16:27

I've found this issue on the Python bug tracker. Long story short:

if

>>> set([1,2]) in {frozenset([1,2]): 'a'}

returned False, it would be counter-intuitive in some way, since the values are equal

>>> set([1,2]) == frozenset([1,2])
True
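
A small sketch of that situation, with hypothetical variable names: the two objects compare equal, yet the dict lookup raises for the set and works once it is converted to a frozenset.

```python
key = frozenset([1, 2])
table = {key: "a"}
probe = set([1, 2])

values_equal = probe == key  # set and frozenset compare by contents

# The dict has to hash the probe before it can compare anything.
try:
    probe in table
except TypeError:
    lookup_raised = True
else:
    lookup_raised = False

# Converting to a hashable equivalent makes the lookup possible.
found_after_conversion = frozenset(probe) in table
```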

So I think I'll write and use proper utilities wherever a situation like this can occur.


About the root of the error: in the CPython repo, the dict___contains__ function (the implementation of the dict.__contains__ method) calls the PyObject_Hash function (which corresponds to the built-in hash function). For unhashable objects (like {} in our first case), that in turn calls the PyObject_HashNotImplemented function, which raises this error.
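
The same order of operations can be observed from Python. The sketch below uses a hypothetical CountingKey class to show that a membership test hashes the key even when the dict is empty, and that calling hash on a dict fails on its own.

```python
class CountingKey:
    """Hypothetical key that counts how often it gets hashed."""

    def __init__(self):
        self.hash_calls = 0

    def __hash__(self):
        self.hash_calls += 1
        return 42


key = CountingKey()
found = key in {}                  # membership test against an empty dict
calls_after_lookup = key.hash_calls

# dict opts out of hashing at the type level.
dict_has_no_hash = dict.__hash__ is None

# Hashing a dict directly raises the same kind of error as the lookup.
try:
    hash({})
except TypeError:
    hash_raised = True
else:
    hash_raised = False
```

So the TypeError comes from the hashing step, before any comparison with stored keys takes place.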

Azat Ibrakov