Deepcopy somehow changes set.difference behaviour

Question

Lately I noticed that copy.deepcopy changes the behavior of set.difference.

I have a class called Foo that implements __hash__ by returning its id attribute. I create two instances of this class, each with a different id, add them to a list, and make a deep copy of that list. I then take the set difference between the original list and its deep copy. Surprisingly, the result is not empty.

Here's the code I used to reproduce the issue:

import copy


class Foo:
    def __init__(self, id: int) -> None:
        self.id = id

    def __hash__(self) -> int:
        return self.id

    def __repr__(self) -> str:
        return f"<Foo id={self.id}>"


arr = [Foo(1), Foo(2)]
arr_copy = copy.deepcopy(arr)

print(f"hash(arr) -> {', '.join(f'{hash(x)}' for x in arr)}")
print(f"hash(arr_copy) -> {', '.join(f'{hash(x)}' for x in arr_copy)}")

arr_diff = set(arr) - set(arr)  # Empty set
arr_copy_diff = set(arr) - set(arr_copy)  # Set with both Foo's???

print(f"{arr_diff=}")
print(f"{arr_copy_diff=}")

assert arr_diff == arr_copy_diff

~~Why does copy.deepcopy affect the behavior of set.difference in this way?~~

EDIT:

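The solution is to implement __eq__ in Foo as well. Without it, Foo inherits identity-based equality from object, so the deep-copied instances hash the same as the originals but never compare equal to them.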
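For anyone landing here later, this is a minimal sketch of the fixed class, assuming that two Foo objects should count as equal exactly when their id values match. The isinstance check and the NotImplemented return are my own choices and not part of the original code.

```python
import copy


class Foo:
    def __init__(self, id: int) -> None:
        self.id = id

    def __hash__(self) -> int:
        # Must agree with __eq__: equal objects need equal hashes.
        return hash(self.id)

    def __eq__(self, other: object) -> bool:
        # Assumption: equality is defined by id alone.
        if not isinstance(other, Foo):
            return NotImplemented
        return self.id == other.id

    def __repr__(self) -> str:
        return f"<Foo id={self.id}>"


arr = [Foo(1), Foo(2)]
arr_copy = copy.deepcopy(arr)  # new objects, same id values

arr_diff = set(arr) - set(arr)
arr_copy_diff = set(arr) - set(arr_copy)

print(f"{arr_diff=}")
print(f"{arr_copy_diff=}")

# Both differences are now empty, so the original assertion passes.
assert arr_diff == arr_copy_diff
```

With __eq__ in place, the set treats each copy as the same element as its original, even though they are distinct objects in memory.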

Define __eq__. The set checks the hash, but in case of a hash collision it also checks whether the two elements are actually equal. — Kenny Ostrom, Apr 03 '23 at 14:42
Oh I didn't know that, thanks for the help. Now it works as expected. — Martim Martins, Apr 03 '23 at 15:34
