Cyclic dependencies are a problem for deepcopy
when you:
- Have classes that must be hashed and contain reference cycles, and
- Don't ensure hash-related (and equality related) invariants are established at object construction, not just initialization
The problem is unpickling an object (deepcopy
, by default, copies custom objects by pickling and unpickling, unless a special __deepcopy__
method is defined) creates the empty object without initializing it, then tries to fill in its attributes one by one. When it tries to fill in node1
's attributes, it needs to initialize node2
, which in turn relies on the partially created node1
(in both cases due to the edge_dict
). At the time it's trying to fill in the edge_dict
for one Node
, the Node
it's adding to edge_dict
doesn't have its id
attribute set yet, so the attempt to hash it fails.
You can correct this by using __new__
to ensure invariants are established prior to initializing mutable, possibly recursive attributes, and defining the pickle
helper __getnewargs__
(or __getnewargs_ex__
) to make it use them properly. Specifically, change you class definition to:
class Node:
# __new__ instead of __init__ to establish necessary id invariant
# You could use both __new__ and __init__, but that's usually more complicated
# than you really need
def __new__(cls, p_id):
self = super().__new__(cls) # Must explicitly create the new object
# Aside from explicit construction and return, rest of __new__
# is same as __init__
self.id = p_id
self.edge_dict = {}
self.degree = 0
return self # __new__ returns the new object
def __getnewargs__(self):
# Return the arguments that *must* be passed to __new__
return (self.id,)
# ... rest of class is unchanged ...
Note: If this is Python 2 code, make sure to explicitly inherit from object
and change super()
to super(Node, cls)
in __new__
; the code given is the simpler Python 3 code.
An alternate solution that handles only copy.deepcopy
, without supporting pickling or requiring the use of __new__
/__getnewargs__
(which require new-style classes) would be to override deepcopying only. You'd define the following extra method on your original class (and make sure the module imports copy
), and otherwise leave it untouched:
def __deepcopy__(self, memo):
# Deepcopy only the id attribute, then construct the new instance and map
# the id() of the existing copy to the new instance in the memo dictionary
memo[id(self)] = newself = self.__class__(copy.deepcopy(self.id, memo))
# Now that memo is populated with a hashable instance, copy the other attributes:
newself.degree = copy.deepcopy(self.degree, memo)
# Safe to deepcopy edge_dict now, because backreferences to self will
# be remapped to newself automatically
newself.edge_dict = copy.deepcopy(self.edge_dict, memo)
return newself