Replace attributes in Data Class objects

Question

I'd like to replace the attributes of a dataclass instance, analogous to namedtuple._replace(), i.e. making an altered copy of the original object:

from dataclasses import dataclass
from collections import namedtuple

U = namedtuple("U", "x")

@dataclass
class V:
    x: int

u = U(x=1)
u_ = u._replace(x=-1)
v = V(x=1)

print(u)
print(u_)
print(v)

This returns:

U(x=1)
U(x=-1)
V(x=1)

How can I mimic this functionality in dataclass objects?

wim · Accepted Answer · 2022-05-11T18:18:06.210

53

The dataclasses module has a helper function, replace, for field replacement on instances (docs):

from dataclasses import replace

Usage differs from collections.namedtuple, where the functionality is provided by a method on the generated type. (Side note: namedtuple._replace is documented, public API. The author called the leading underscore in the name a "regret"; see link at end of answer.)

>>> from dataclasses import dataclass, replace
>>> @dataclass
... class V:
...     x: int
...     y: int
...     
>>> v = V(1, 2)
>>> v_ = replace(v, y=42)
>>> v
V(x=1, y=2)
>>> v_
V(x=1, y=42)

For more background of the design, see the PyCon 2018 talk - Dataclasses: The code generator to end all code generators. The replace API is discussed in depth, along with other design differences between namedtuple and dataclasses, and some performance comparisons are shown.
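As a small supplementary sketch (the Point class and variable names below are illustrative, not from the answer), replace also works on frozen dataclasses, accepts several fields at once, and rejects unknown field names because it builds the new object through __init__:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Point:  # hypothetical example class
    x: int
    y: int


p = Point(1, 2)
q = replace(p, x=10, y=20)  # several fields can be changed in one call
same = replace(p)           # no changes: an equal but distinct instance

# replace() constructs the new object via __init__, so an unknown field
# name is rejected the same way a bad constructor argument would be.
error = None
try:
    replace(p, z=0)
except TypeError as exc:
    error = exc

print(p, q, same, type(error).__name__)
```

The original instance is never modified, which is what makes this usable with frozen=True.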

edited May 11 '22 at 18:18

answered May 13 '18 at 19:21

wim


1

It seems like someone discovered issues with `init` and post-init hooks in dataclasses, and instead of revisiting the design and resolving complexity, they chose to solve it just by adding complexity. The real story is that if you are leveraging dataclasses in some way where they aren't treated as completely logic-free containers, you're using them wrong and you need a different tool. `deepcopy` of a dataclass, for example, should have absolutely zero risk of doing anything besides simplistic deepcopy of each member attribute, so there is no least surprise issue for the user. – ely Apr 30 '21 at 13:01
8

`replace` is pretty useful when having (pseudo-)immutable objects, such as frozen dataclasses. They are very common in functional programming where you don't mutate the original object, but instead return a new object with all fields equal except the ones you `replace`. – hugovdberg Jul 30 '21 at 12:12
2

I sort of want to have `replace` as a method rather than a function, because using the function seems to embed the assumption that something is a dataclass in the calling code. – Att Righ May 11 '22 at 11:03
@ely I disagree with your claim that dataclasses must be completely logic-free containers. Consider a frozen dataclass modeling a 2D triangle, for example, where you might want a method for computing the area, and validation that the triangle is not degenerate (no collinear vertices). What better place to put the validation than in a post-init hook? – wim Jun 20 '22 at 21:00
Is there anything wrong with this? @AttRigh `def replace(self, **kwargs): return dataclasses.replace(self, **kwargs)` – Geoffrey Negiar Sep 21 '22 at 02:14
1

@wim and @hugovdberg coming back to this a long time later. Wim, validation logic like that ought never be internal to the container itself. Make a helper function, `validate_triangle` - don't bloat a simple record object with responsibilities for self processing. If you need it for some (dubious) OO reason, then use a class. Almost _any_ place would be better for that validation logic than as part of obtuse instance creation of something any user of such an object is likely to assume is just a basic data record. It reminds me of `@property`, which is also frequently a bad choice & overused. – ely Oct 07 '22 at 18:13
In other words, if you need a "smart object" or are worried about invalid states being unrepresentable through hidden magic transparent to a user of the structures, that's exactly a situation where dataclass is not for you. In fact, seeing `@dataclass` would be like a bright billboard advertising "No funny business going on behind the scenes here! Just a super simple record type!" – ely Oct 07 '22 at 18:14
@ely That is how you see it, but I don't read a reason. For me, dataclasses are a powerful and beautiful way to avoid boilerplate code around initialization, access and contracts of attributes. – matheburg Feb 06 '23 at 02:17
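To make the two ideas from this comment thread concrete, here is a minimal sketch, assuming a hypothetical Triangle class: validation lives in __post_init__, and a thin replace method forwards to dataclasses.replace as suggested above. Because dataclasses.replace goes through __init__, the validation runs again on the altered copy. Whether this design is desirable is exactly what the commenters disagree on.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class Triangle:  # hypothetical class, not from the thread
    a: tuple
    b: tuple
    c: tuple

    def __post_init__(self):
        # Runs on construction and again on every dataclasses.replace().
        if self.area() == 0:
            raise ValueError("degenerate triangle: vertices are collinear")

    def area(self):
        (ax, ay), (bx, by), (cx, cy) = self.a, self.b, self.c
        return abs((bx - ax) * (cy - ay) - (cx - ax) * (by - ay)) / 2

    def replace(self, **changes):
        # Method wrapper so callers need not import dataclasses.
        return dataclasses.replace(self, **changes)


t = Triangle((0, 0), (4, 0), (0, 3))
t2 = t.replace(c=(0, 6))  # valid: a new, larger triangle

rejected = False
try:
    t.replace(c=(8, 0))   # all three vertices on the x-axis
except ValueError:
    rejected = True

print(t.area(), t2.area(), rejected)
```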

score 0 · Answer 2 · answered Dec 16 '20 at 06:59

I know the question is about dataclass, but if you're using attr.s, you can use attr.evolve in place of dataclasses.replace:

import attr

@attr.s(frozen=True)
class Foo:
    x = attr.ib()
    y = attr.ib()

foo = Foo(1, 2)
bar = attr.evolve(foo, y=3)

score 0 · Answer 3 · answered Mar 23 '23 at 21:01

On its own, replace makes a shallow copy: the new instance holds references to the same mutable objects as the original, so the two instances of the dataclass share state.

So try something like this:

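import dataclasses
from copy import deepcopy

@dataclasses.dataclass(frozen=True)
class MyDataClass:
    mutable_object: list
    val: int

    def copy(self, **changes):
        # Deep-copy first so the new instance does not share mutable fields.
        return dataclasses.replace(deepcopy(self), **changes)

data = MyDataClass([], 1)
data2 = data.copy(val=2)
# Identity check: two empty lists compare equal, so != would fail here.
assert data.mutable_object is not data2.mutable_object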
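The sharing described above can be seen directly in a short sketch (the Config class and its fields are made up for illustration). It also shows one possible alternative to deep-copying the whole object: pass a fresh copy of just the mutable field to replace.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class Config:  # hypothetical example class
    tags: list = field(default_factory=list)
    version: int = 1


original = Config(["a"], 1)

# Plain replace(): unchanged fields are passed along by reference.
shallow = replace(original, version=2)
original.tags.append("b")  # the change is visible through both instances

# Alternative: copy only the mutable field that must be independent.
independent = replace(original, tags=list(original.tags), version=3)
original.tags.append("c")  # does not affect `independent`

print(shallow.tags, independent.tags)
```

Copying a single field is cheaper than deepcopy when the instance is large, but it only helps if you know which fields are mutable; deepcopy is the safer default when you do not.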

score -1 · Answer 4 · answered Apr 18 '18 at 20:34

dataclass is just syntactic sugar for the automatic creation of a special __init__ method and a host of other "boilerplate" methods based on type-annotated attributes.

Once the class is created, it is like any other, and its attributes can be overwritten and instances can be copied, e.g.

import copy

v_ = copy.deepcopy(v)
v_.x = -1

Depending on what the attributes are, you may only require copy.copy.

ely


3

–1 It is incorrect to use a copy/deepcopy for field replacement on dataclasses. In some complex use cases (e.g. init/post_init hooks), data may not be handled correctly. The better way is to use `dataclasses.replace()` function. – wim May 13 '18 at 19:23
@wim revisiting this a bit later I think my disagreement about `replace` is even stronger after having dealt with this feature in production systems for a while. I added some comments to your answer for a different take. I totally respect your POV is different, but I wanted to highlight a dissenting opinion because some users may feel like I do, and it could inform them on ways to use convention based restrictions of `dataclass` that allow for avoiding the bad code smell of `replace`. – ely Apr 30 '21 at 13:04
The suggested approach of making a copy and then setting attributes **does not work at all in the case of frozen dataclasses**, which are pretty common when you want hashable instances that can be stored inside sets or used as dictionary keys. – wim Jun 20 '22 at 20:57
A frozen dataclass in Python is just a fundamentally confused concept. It could still have mutable attributes like lists and so on. Using such a thing for dict keys is a hugely bad idea. – ely Apr 14 '23 at 14:46
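The two objections raised in these comments can be reproduced with a small sketch. The Rect and FrozenV classes are hypothetical; the first has a derived field computed in __post_init__, the second is frozen.

```python
import copy
from dataclasses import FrozenInstanceError, dataclass, field, replace


@dataclass
class Rect:  # hypothetical class with a derived field
    w: int
    h: int
    area: int = field(init=False)

    def __post_init__(self):
        self.area = self.w * self.h


r = Rect(2, 3)

stale = copy.copy(r)
stale.w = 10              # __post_init__ does not rerun, area is out of date

fresh = replace(r, w=10)  # goes through __init__, so area is recomputed


@dataclass(frozen=True)
class FrozenV:  # hypothetical frozen counterpart of V
    x: int


v = FrozenV(1)
v_copy = copy.deepcopy(v)  # copying itself works

frozen_blocked = False
try:
    v_copy.x = -1          # assignment on a frozen instance is refused
except FrozenInstanceError:
    frozen_blocked = True

print(stale.area, fresh.area, frozen_blocked, replace(v, x=-1))
```

For plain, unfrozen records without hooks, copy-then-assign and replace give the same result; the difference only shows up in cases like these.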

score -1 · Answer 5 · edited Mar 31 '21 at 15:35

import dataclasses
from dataclasses import dataclass

@dataclass
class Point:
    # Use dataclasses.field() (lowercase), the public helper. default and
    # default_factory cannot both be given, so only default is kept.
    x: float = dataclasses.field(default=0.0, repr=True, init=True, hash=True, compare=True,
                                 metadata={'x_axis': "X Axis", 'ext_name': "Point X Axis"})
    y: float = dataclasses.field(default=0.0, repr=True, init=True, hash=True, compare=True,
                                 metadata={'y_axis': "Y Axis", 'ext_name': "Point Y Axis"})

Point1 = Point(13.5, 455.25)
Point2 = dataclasses.replace(Point1, y=255.25)  # copy of Point1 with a new y

print(Point1, Point2)

Tomerikoo


answered Mar 31 '21 at 15:23

adg08101

2

Welcome to StackOverflow! Can you add some text to your answer to explain how it solves the problem, and maybe also point out how it adds to the other answers already provided? – joanis Mar 31 '21 at 19:07
