Extend dataclass' __repr__ programmatically

Question

Suppose I have a dataclass with a set method. How do I extend the __repr__ method so that its output also includes attributes added through the set method?

from dataclasses import dataclass
@dataclass
class State:
    A: int = 1
    B: int = 2
    def set(self, var, val):
        setattr(self, var, val)

Ex:

In [2]: x = State()

In [3]: x
Out[3]: State(A=1, B=2)

In [4]: x.set("C", 3)

In [5]: x
Out[5]: State(A=1, B=2)

In [6]: x.C
Out[6]: 3

The outcome I would like

In [7]: x
Out[7]: State(A=1, B=2, C=3)

Do you also need to extend the `__init__` method so that `State(A=1, B=2, C=3)` is a valid constructor call? Or is it OK if your `__repr__` method gives you a string you can't `eval` to an identical object? — Blckknght, Apr 30 '21 at 02:14
This should not be a `dataclass`; all sorts of things will break. Are you aware of `types.SimpleNamespace`? — o11c, Apr 30 '21 at 02:19
I think the latter should be fine. The way I'm thinking is that I init another class, Model, with several attributes that are initted by different State calls. Then I run some method from Model which I want to create some new attribute within each attribute that was created by State. — badbayesian, Apr 30 '21 at 02:22
@o11c: Why should it not be a dataclass? I'm using this to store parameters for some model that may be updated later. I guess one approach would be to define the parameters at init rather than to set them later. — badbayesian, Apr 30 '21 at 02:29
@badbayesian because you basically want a container mapping strings to some value, ints in this case. Not a dataclass, which represents record-like data. — juanpa.arrivillaga, Apr 30 '21 at 03:37
@juanpa.arrivillaga perhaps my example is too simple. the repr extension was more to debug and for a user to quickly check what parameters they are using. I will be storing record-like data. — badbayesian, Apr 30 '21 at 03:58
@badbayesian well, no, because you can dynamically change what that data would be. Records are fixed. You might as well just use a `dict` or, as mentioned above, a `types.SimpleNamespace`. In any case, you can just use something like `return f"State<{repr(vars(self))}>"` as a `__repr__` for something quick-and-dirty — juanpa.arrivillaga, Apr 30 '21 at 04:26
I think the confusion is that if you're extending the fields at runtime (e.g. adding a `C` attribute after the class is defined) you're *not* dealing with record-like data (which has fixed fields). Instead, your data is dynamic. That's OK, it's just not what `dataclasses` are for. — Blckknght, Apr 30 '21 at 04:27
So strictly speaking, state would be dynamic in that I am adding attributes after init. However, I do know all the attributes from the beginning, I could just init them all and then deal with typing/ignore values later. Currently I was just adding the intersection of attributes in all my states and then adding the remaining attributes through another method in model. After that call, the number of attributes for each state would not change for the rest of the program. — badbayesian, Apr 30 '21 at 05:47

score 5 · Answer 1 · answered Apr 30 '21 at 05:11

The dataclass decorator lets you quickly and easily build classes that have specific fields that are predetermined when you define the class. The way you're intending to use your class, however, doesn't match up very well with what dataclasses are good for. You want to be able to dynamically add new fields after the class already exists, and have them work with various methods (like __init__, __repr__ and presumably __eq__). That removes almost all of the benefits of using dataclass. You should instead just write your own class that does what you want it to do.
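To see why the generated __repr__ never picks up C, it can help to compare what the decorator recorded at class-definition time with what the instance actually holds. The following is a minimal sketch that reuses the question's State class (without the set method); the variable names are only illustrative.

```python
from dataclasses import dataclass, fields


@dataclass
class State:
    A: int = 1
    B: int = 2


x = State()
setattr(x, "C", 3)  # adds a plain instance attribute, not a dataclass field

# The field list is fixed when the class is created, so C is absent.
field_names = [f.name for f in fields(x)]
# The instance dictionary, however, does contain the new attribute.
instance_attrs = vars(x)

print(field_names)     # ['A', 'B']
print(instance_attrs)  # {'A': 1, 'B': 2, 'C': 3}
print(repr(x))         # State(A=1, B=2)
```

The generated __repr__ is built from the field list, which is why it keeps printing only A and B.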

Here's a quick and dirty version:

class State:
    _defaults = {"A": 1, "B": 2}
    
    def __init__(self, **kwargs):
        self.__dict__.update(self._defaults)
        self.__dict__.update(kwargs)
        
    def __eq__(self, other):
        return self.__dict__ == other.__dict__ # you might want to add some type checking here
        
    def __repr__(self):
        kws = [f"{key}={value!r}" for key, value in self.__dict__.items()]
        return "{}({})".format(type(self).__name__, ", ".join(kws))

This is pretty similar to what you get from types.SimpleNamespace, so you might just be able to use that instead (it doesn't do default values though).
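If default values are the only missing piece, one possible workaround is a small subclass of types.SimpleNamespace that merges defaults in __init__. This is only a sketch: the _defaults mapping is a hypothetical name chosen here to mirror the question's A and B, not something SimpleNamespace provides.

```python
from types import SimpleNamespace


class State(SimpleNamespace):
    # Hypothetical class-level defaults (an assumption for this sketch).
    _defaults = {"A": 1, "B": 2}

    def __init__(self, **kwargs):
        # Keyword arguments override the defaults.
        super().__init__(**{**self._defaults, **kwargs})


s = State()
s.C = 3             # normal attribute assignment
setattr(s, "D", 4)  # dynamic attribute name

print(s)  # State(A=1, B=2, C=3, D=4)  (insertion order on Python 3.9+)
```

SimpleNamespace supplies __repr__ and __eq__ based on the instance's attributes, so no extra methods are needed for this use.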

You could add your set method to this framework, though it seems to me like needless duplication of the builtin setattr function you're already using to implement it. If the caller needs to dynamically set an attribute, they can call setattr themselves. If the attribute name is constant, they can use normal attribute assignment syntax instead: s.foo = "bar".

I'll go a step further and suggest that this really should be a `dict`. IMO either you know the full set of attributes in advance, or you don't; even though Python lets you ignore the distinction. — shadowtalker, Apr 30 '21 at 05:21
That's certainly a possibility, though there can be cases where it's convenient to have attribute syntax. Though I guess most of the best cases are the ones where `dataclass` would work in the first place! — Blckknght, Apr 30 '21 at 05:23
So I do know the full set of attributes but they may not be the same type (list vs int) or I may ignore them in other models. If that is the case, then it seems like y'all are suggesting that I just define all the parameters at init and then just ignore the ones I don't need or override the ones that are different types? I guess another option is to always use lists even when all the values are the same in the list. — badbayesian, Apr 30 '21 at 05:40
I think there are a few other options. One is to specify separate dataclasses for each set of attributes you might need together. Another is to be fully dynamic, which is what I showed. Your idea of specifying everything (with defaults, or maybe `None` for unused attributes) is a third option. — Blckknght, Apr 30 '21 at 07:02

score 0 · Answer 2 · answered Jun 12 '23 at 15:22

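I think the cleanest solution is to reuse the internals of dataclasses, which pretty much mimics what a super().__repr__() call would do in other use cases. Note that dataclasses._repr_fn is a private, undocumented helper, so it may change or be missing in other Python versions.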

import dataclasses


@dataclasses.dataclass
class Component:
    id: str
    categories: list[str] = dataclasses.field(default_factory=list)

    @property
    def url(self) -> str:
        """
        Component url in logbook
        """
        return f"http://example.com/components?id={self.id}"

    def __repr__(self):
        """
        Inserting the logbook url into the repr after the 'id' field.
        """
        # Only keep the fields that are meant to appear in the repr
        fields = [f for f in dataclasses.fields(self) if f.repr]
        # Private helper: builds the same __repr__ the decorator would generate
        repr_fn = dataclasses._repr_fn(fields, {})
        parent_repr = repr_fn(self)
        # super().__repr__() would not work because it gives object.__repr__
        # Split on ", "; this assumes the repr of 'id' itself contains no ", "
        parts = parent_repr.split(", ")
        additional_info = f"url={self.url}"  # Additional property information
        parts.insert(
            1, additional_info
        )  # Insert the additional info after the 'id' field
        return ", ".join(parts)  # Join the parts back together
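For comparison, the same output can be produced with only the public dataclasses.fields() function, which avoids both the private helper and the string splitting. This is a sketch under the same assumptions as the Component example above; unlike the generated __repr__, it has no guard against recursive references.

```python
import dataclasses


@dataclasses.dataclass
class Component:
    id: str
    categories: list[str] = dataclasses.field(default_factory=list)

    @property
    def url(self) -> str:
        return f"http://example.com/components?id={self.id}"

    def __repr__(self) -> str:
        parts = []
        for f in dataclasses.fields(self):
            if not f.repr:
                continue  # respect field(repr=False)
            parts.append(f"{f.name}={getattr(self, f.name)!r}")
            if f.name == "id":
                # Insert the property right after the 'id' field.
                parts.append(f"url={self.url!r}")
        return f"{type(self).__name__}({', '.join(parts)})"


c = Component("abc", ["x", "y"])
print(c)
# Component(id='abc', url='http://example.com/components?id=abc', categories=['x', 'y'])
```

Because each field is formatted separately, values that contain commas (such as the categories list) cannot shift the insertion point.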
