I have this dataclass:

from dataclasses import dataclass, field
from typing import List

@dataclass
class Person:
    name: str
    dob: str
    friends: List['Person'] = field(default_factory=list, init=False)

name and dob are immutable and friends is mutable. I want to generate a hash of each Person object. Can I somehow specify which fields are included in, and which are excluded from, the generated __hash__ method? In this case, name and dob should be included in the hash and friends shouldn't. This is my attempt, but it doesn't work:

@dataclass
class Person:
    name: str = field(hash=True)
    dob: str = field(hash=True)
    friends: List['Person'] = field(default_factory=list, init=False, hash=False)

>>> hash(Person("Mike", "01/01/1900"))
Traceback (most recent call last):
  File "<pyshell#43>", line 1, in <module>
    hash(Person("Mike", "01/01/1900"))
TypeError: unhashable type: 'Person'

I also can't find a way to set name and dob to be frozen. And I'd refrain from setting unsafe_hash to True, just by the sound of it. Any suggestions?

Also, is what I'm doing considered good practice? If not, can you suggest some alternatives?

Thank you

Edit: This is just a toy example and we can assume that the name and dob fields are unique.

Edit: I gave an example to demonstrate the error.

Mike Pham
  • It doesn't sound like these objects should be hashable at all. – user2357112 May 11 '20 at 22:49
  • Well if you remove friends, then they should be hashable, so that's what I'm asking. How do I exclude friends from the hash? And should I? – Mike Pham May 11 '20 at 22:50
  • Say I've got my friend Mike here, born March 3rd, 1972, and my friend Other Mike, coincidentally also born March 3rd, 1972. Does it make any sense to say that these two people are equal? If not, what good is this proposed hash? – user2357112 May 11 '20 at 22:51
  • Really, it probably doesn't make sense to generate either `__eq__` or `__hash__`. You can pass `eq=False` to the `dataclass` decorator to avoid generating either method, and let the class inherit the default identity-based `__eq__` and `__hash__` from `object`. – user2357112 May 11 '20 at 22:58
  • I added an edit to respond to your question. We can assume that those fields are unique for all instances. this is not my real dataclass in my code and is just a toy example made up purely for this question. Sorry for the confusion – Mike Pham May 11 '20 at 23:07
  • This still sounds like you should inherit `__hash__` and `__eq__` from `object`. – user2357112 May 11 '20 at 23:18
  • @juanpa.arrivillaga 1. I added an edit to show the error. Sorry for assuming the error is obvious. 2. Yes I've read and understood the doc, and Guido himself says that using unsafe hash is 'fishy' https://bugs.python.org/issue32929. However I feel like this doesn't apply to my case. – Mike Pham May 11 '20 at 23:18
  • Here is the documentation: "Although not recommended, you can force `dataclass()` to create a `__hash__()` method with `unsafe_hash=True`. This might be the case if your class is logically immutable but can nonetheless be mutated. This is a specialized use case and should be considered carefully." Does this apply to your use-case? Do you plan on mutating your objects? If *so* then your object should not be hashable, unless you are using identity to give you equality. – juanpa.arrivillaga May 11 '20 at 23:21
  • Because you are asking questions *that are answered in the documentation*. Because *as stated in the docs* to get the behavior you seem to desire, hashing on those two fields, while not freezing your class, requires using `unsafe_hash=True`. Now, this may or may not be an issue. The biggest issue is that `__eq__` will not be consistent with `__hash__`, which it must be to be useful for hashing. This is because you are making `friends` a part of equality; you could pass `compare=False` to the `field` constructor to prevent that. And at least from a hashing perspective, this should be ok – juanpa.arrivillaga May 11 '20 at 23:38
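
As a rough sketch of the `eq=False` suggestion from the comments above: with `eq=False`, `dataclass` generates neither `__eq__` nor `__hash__`, so the class keeps the identity-based versions inherited from `object`. The variable names `a` and `b` are only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)  # no generated __eq__ / __hash__; object's versions are inherited
class Person:
    name: str
    dob: str
    friends: List['Person'] = field(default_factory=list, init=False)


a = Person("Mike", "01/01/1900")
b = Person("Mike", "01/01/1900")

print(a == a)       # True: an instance is equal to itself
print(a == b)       # False: distinct objects, even with identical fields
print(len({a, b}))  # 2: both are hashable and kept as separate set members
```

This sidesteps the mutability question entirely, at the cost of two people with the same name and dob no longer comparing equal.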

1 Answer

Just indicate that the friends field should not be taken into account when comparing instances with __eq__, and pass hash=True to the field() calls for the desired fields.

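Then, pass the unsafe_hash=True argument to the dataclass decorator itself. Your attempt fails because, with the defaults (eq=True, frozen=False), dataclass sets __hash__ to None no matter what the individual fields say. With unsafe_hash=True it will work (mostly) as you intend.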

For hashing, the language requires that if two instances compare equal (via __eq__), their hashes must be equal as well. The implication in this case is that two instances of the "same" person, with the same "name" and "dob" fields, will be considered equal even if they have different friends lists.

Other than that, this should work:


from dataclasses import dataclass, field
from typing import List

@dataclass(unsafe_hash=True)
class Person:
    name: str = field(hash=True)
    dob: str = field(hash=True)
    friends: List['Person'] = field(default_factory=list, init=False, compare=False, hash=False)
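
A quick check of that behavior, as a minimal sketch: the class is repeated so the snippet runs on its own, and the instances `mike`, `same_mike`, and the friend "Ann" are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(unsafe_hash=True)
class Person:
    name: str = field(hash=True)
    dob: str = field(hash=True)
    friends: List['Person'] = field(default_factory=list, init=False, compare=False, hash=False)


mike = Person("Mike", "01/01/1900")
same_mike = Person("Mike", "01/01/1900")
same_mike.friends.append(Person("Ann", "02/02/1902"))  # friends lists now differ

print(hash(mike) == hash(same_mike))  # True: only name and dob feed the hash
print(mike == same_mike)              # True: friends is excluded from __eq__
print(len({mike, same_mike}))         # 1: a set treats them as the same person
```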

Then, remember to behave like a "consenting adult" and never change the "name" and "dob" fields of a Person instance after it has been created, and you are set.
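
To see why that discipline matters, here is a small sketch of what can go wrong when a hashed field is changed after the instance is already in a set. The class is repeated so the snippet is self-contained; the set `people` and the new name "Michael" are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(unsafe_hash=True)
class Person:
    name: str = field(hash=True)
    dob: str = field(hash=True)
    friends: List['Person'] = field(default_factory=list, init=False, compare=False, hash=False)


mike = Person("Mike", "01/01/1900")
people = {mike}
old_hash = hash(mike)

mike.friends.append(Person("Ann", "02/02/1902"))  # harmless: friends is not hashed
print(hash(mike) == old_hash)  # True
print(mike in people)          # True

mike.name = "Michael"          # mutates a hashed field while mike sits in the set
print(hash(mike) == old_hash)  # expected False: the hash follows the new name
print(mike in people)          # expected False: the set still files mike under the old hash
```

Nothing raises an error here; the instance simply becomes unfindable in the set or dict that holds it, which is the risk the "unsafe" in unsafe_hash refers to.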

jsbueno