In the code below, the astuple function deep-copies a field of the dataclass whose value is an instance of another class. Why does it not produce the same result as my_tuple?

import copy
import dataclasses


@dataclasses.dataclass
class Demo:
    a_number: int
    a_bool: bool
    classy: 'YOhY'

    def my_tuple(self):
        return self.a_number, self.a_bool, self.classy

class YOhY:
    def __repr__(self):
        return (self.__class__.__qualname__ + f" id={id(self)}")


why = YOhY()
print(why)  # YOhY id=4369078368

demo = Demo(1, True, why)
print(demo)  # Demo(a_number=1, a_bool=True, classy=YOhY id=4369078368)

untrupled = demo.my_tuple()
print(untrupled)  # (1, True, YOhY id=4369078368)

trupled = dataclasses.astuple(demo)
print(trupled)  # (1, True, YOhY id=4374460064)

trupled2 = trupled
print(trupled2)  # (1, True, YOhY id=4374460064)

trupled3 = copy.copy(trupled)
print(trupled3)  # (1, True, YOhY id=4374460064)

trupled4 = copy.deepcopy(trupled)
print(trupled4)  # (1, True, YOhY id=4374460176)

Footnote

As Anthony Sottile's excellent response makes clear, this is the behavior coded into Python 3.7. Anyone expecting astuple to unpack the same way as collections.namedtuple will need to replace it with a method similar to Demo.my_tuple. The unsafe_astuple function below is less fragile than my_tuple because it needs no modification if the fields of the dataclass change. On the other hand, it won't work if __slots__ are in use.

Both versions return the original field objects rather than copies, which poses a risk whenever a __hash__ method is present in the class or its superclasses. See the Python 3.7 documentation for unsafe_hash, in particular the two paragraphs beginning 'Here are the rules governing implicit creation of a __hash__() method'.

def unsafe_astuple(self):
    return tuple([self.__dict__[field.name] for field in dataclasses.fields(self)])
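
If __slots__ support matters, a variant that reads each field with getattr instead of going through __dict__ should cover both cases. The following is only a minimal sketch and not part of the original post; shallow_astuple and the Point class are hypothetical names used for illustration.

```python
import dataclasses


def shallow_astuple(instance):
    """Return the field values of a dataclass instance without copying them."""
    # getattr works whether the values live in __dict__ or in __slots__.
    return tuple(getattr(instance, f.name) for f in dataclasses.fields(instance))


@dataclasses.dataclass
class Point:
    # Hypothetical slotted dataclass; the fields have no defaults,
    # so declaring __slots__ by hand is allowed.
    __slots__ = ("x", "payload")
    x: int
    payload: list


payload = [1, 2]
point = Point(3, payload)
values = shallow_astuple(point)
print(values[1] is payload)  # True: the list is shared, not copied
```

Like my_tuple, this returns the very objects stored on the instance, so mutating values[1] also changes point.payload.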
lemi57ssss
  • At a guess, this is to avoid aliasing issues, so if some code mutates the values in the copied object, the change doesn't reflect in the 'original' object in the dataclass. – snakecharmerb Aug 12 '18 at 07:51
  • I wonder whether this API will provide an argument so that we don't always deep copy. – Shihao Xu Jan 23 '21 at 20:49

1 Answer

This seems to be undocumented behaviour of astuple (and, it seems, of asdict as well). The documentation says only this:

dataclasses.astuple(instance, *, tuple_factory=tuple)

Converts the dataclass instance to a tuple (by using the factory function tuple_factory). Each dataclass is converted to a tuple of its field values. dataclasses, dicts, lists, and tuples are recursed into.
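
To make "recursed into" concrete, here is a small sketch of what that sentence implies in practice. Inner and Outer are hypothetical classes made up for this illustration; only dataclasses.astuple itself comes from the standard library.

```python
import dataclasses


@dataclasses.dataclass
class Inner:
    value: int


@dataclasses.dataclass
class Outer:
    inner: Inner
    items: list
    mapping: dict


outer = Outer(Inner(1), [Inner(2), 3], {"k": Inner(4)})
result = dataclasses.astuple(outer)

# Nested dataclass instances become tuples, wherever they sit;
# the list and dict keep their container type but are rebuilt.
print(result)  # ((1,), [(2,), 3], {'k': (4,)})
print(result[1] is outer.items)  # False: a new list was created
```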

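Here's the source of the helper behind asdict (the helper behind astuple, _astuple_inner, is structured the same way):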

def _asdict_inner(obj, dict_factory):
    if _is_dataclass_instance(obj):
        result = []
        for f in fields(obj):
            value = _asdict_inner(getattr(obj, f.name), dict_factory)
            result.append((f.name, value))
        return dict_factory(result)
    elif isinstance(obj, (list, tuple)):
        return type(obj)(_asdict_inner(v, dict_factory) for v in obj)
    elif isinstance(obj, dict):
        return type(obj)((_asdict_inner(k, dict_factory), _asdict_inner(v, dict_factory))
                          for k, v in obj.items())
    else:
        return copy.deepcopy(obj)

The deepcopy in the final else branch seems intentional, though it probably should be documented.
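
Because the fallback is a plain copy.deepcopy call, a field's class can influence the result through the standard __deepcopy__ hook. The sketch below illustrates that under the assumption that sharing the object is acceptable for the class in question; Plain, Shared, and Holder are hypothetical names, and this is a workaround rather than something the dataclasses documentation recommends.

```python
import copy
import dataclasses


class Plain:
    """Ordinary object: copy.deepcopy builds a new instance."""


class Shared:
    """Hypothetical class that opts out of deep copying."""

    def __deepcopy__(self, memo):
        # Returning self makes copy.deepcopy reuse this object.
        return self


@dataclasses.dataclass
class Holder:
    plain: Plain
    shared: Shared


plain, shared = Plain(), Shared()
as_tuple = dataclasses.astuple(Holder(plain, shared))

print(as_tuple[0] is plain)   # False: astuple deep-copied it
print(as_tuple[1] is shared)  # True: __deepcopy__ returned the original
```

Note that defining __deepcopy__ this way affects every deep copy of the object, not only those made by astuple or asdict.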

anthony sottile
  • The behavior *is* documented; it's what the line "dataclasses, dicts, lists, and tuples are recursed into" is referring to. – jwodder Aug 12 '18 at 17:34
  • @jwodder it would make sense for an implementation from a consistency standpoint to deep copy all of the attributes -- that said it doesn't explicitly state that non-list/non-dict/non-tuples are deep copied. – anthony sottile Aug 12 '18 at 18:13