18

Let's say I have a class like this:

class C:

    def __init__(self, stuff: int):
        self._stuff = stuff

    @property
    def stuff(self) -> int:
        return self._stuff

then stuff is read-only:

c = C(stuff=10)
print(c.stuff)  # prints 10

and

c.stuff = 2

fails as expected

AttributeError: can't set attribute

How can I get the identical behavior using a dataclass? If I wanted to also have a setter, I could do:

from dataclasses import dataclass, field

@dataclass
class DC:
    stuff: int
    _stuff: int = field(init=False, repr=False)

    @property
    def stuff(self) -> int:
        return self._stuff

    @stuff.setter
    def stuff(self, stuff: int):
        self._stuff = stuff

But how could I do it without the @stuff.setter part?

Cleb
  • Leave out the setter? Why is the data class part relevant here? – jonrsharpe Apr 15 '20 at 19:34
  • @jonrsharpe: leaving out the `setter` does not help; I can then still set `stuff`: `dc = DC(stuff=10); dc.stuff = 2` will work just "fine", but it should also fail. – Cleb Apr 15 '20 at 19:37
  • 1
    Perhaps that would be helpful in the question - did you mean to include a setter implementation that threw an error? – jonrsharpe Apr 15 '20 at 19:38
  • @juanpa.arrivillaga: That seems to freeze it for all attributes. If I have e.g. `stuff` and `more_stuff` but only want `stuff` to be read-only while `more_stuff` should be settable, that does not seem to work. – Cleb Apr 15 '20 at 19:41
  • @jonrsharpe: It should fail once I do `dc = DC(10), dc.stuff=2` like in the first example using the "normal" class. – Cleb Apr 15 '20 at 19:42
  • @juanpa.arrivillaga: Indeed, that's what I will probably end up with, was just curious whether there is a straightforward way of doing it. – Cleb Apr 15 '20 at 19:43
  • 1
    @Cleb yeah, `property` just doesn't play nice, for various reasons – juanpa.arrivillaga Apr 15 '20 at 19:45
  • @juanpa.arrivillaga: After reading the answer you linked to above, I indeed think it is best to not use a dataclass for this. Thanks! – Cleb Apr 15 '20 at 19:50
  • @Cleb well, you can actually still do it, there's a workaround that works for *this particular case* but not for the one in the link, which wanted a default value. Still no workaround for that particular case – juanpa.arrivillaga Apr 15 '20 at 19:51
  • I wrote [a post](https://stackoverflow.com/questions/58532383/dataclass-style-object-with-mutable-and-immutable-properties) a while ago about how to extend data classes to support being only partly frozen. Would that help? – Arne Apr 16 '20 at 10:55
  • @Arne: Thanks, quite an impressive piece of work, but I think I'll just stick to a standard class then... Ideally, one could use `field(frozen=True)`, that would be highly convenient. – Cleb Apr 16 '20 at 12:16
  • 1
    Yeah, it's not exactly a straightforward solution. Seems like dataclasses are just not a good fit here. Maybe I'll play around with `field`; if I get it to work the way you want it, I'll write an answer here. – Arne Apr 16 '20 at 12:32
  • I've added [another post](https://stackoverflow.com/a/74032346/10237506) that might help in this scenario, this uses a descriptor class `Frozen` and would just involve a simple assignment, like `Frozen()` in place of `field(frozen=True)`. – rv.kvetch Oct 19 '22 at 16:58

7 Answers

4
from dataclasses import dataclass

@dataclass(frozen=True)
class YourClass:
    """class definition"""

https://docs.python.org/3/library/dataclasses.html#frozen-instances

Once an instance has been created, any attempt to assign to one of its fields raises dataclasses.FrozenInstanceError.
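To make the behavior concrete, here is a minimal sketch. The class name FrozenDC and its two fields are made up for illustration and are not part of the original answer. It shows that frozen=True rejects assignment to every field, not just a selected one:

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class FrozenDC:  # hypothetical example class
    stuff: int
    more_stuff: str


dc = FrozenDC(stuff=10, more_stuff="a")

try:
    dc.stuff = 2
except FrozenInstanceError as exc:
    # FrozenInstanceError is a subclass of AttributeError
    print(f"rejected: {exc}")

try:
    dc.more_stuff = "b"  # also rejected: frozen=True applies to all fields
except FrozenInstanceError as exc:
    print(f"rejected: {exc}")
```

So this fits when the whole instance should be immutable, but not when only some of the fields should be read-only.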

Michal Vašut
  • 16
    Thanks, but that freezes all attributes, but I would like to freeze only certain ones, not all... – Cleb Jul 28 '21 at 20:51
4

This answer extends directly from my other post on using descriptor classes, which is a convenient and handy way to define properties, more or less.

Since dataclasses does not offer a field(frozen=True) approach, I think this one can instead work for you.

Here is a straightforward usage example:

from dataclasses import dataclass, MISSING
from typing import Generic, TypeVar

_T = TypeVar('_T')


class Frozen(Generic[_T]):
    __slots__ = (
        '_default',
        '_private_name',
    )

    def __init__(self, default: _T = MISSING):
        self._default = default

    def __set_name__(self, owner, name):
        self._private_name = '_' + name

    def __get__(self, obj, objtype=None):
        value = getattr(obj, self._private_name, self._default)
        return value

    def __set__(self, obj, value):
        if hasattr(obj, self._private_name):
            msg = f'Attribute `{self._private_name[1:]}` is immutable!'
            raise TypeError(msg) from None

        setattr(obj, self._private_name, value)


@dataclass
class DC:
    stuff: int = Frozen()
    other_stuff: str = Frozen(default='test')


dc = DC(stuff=10)

# raises a TypeError: Attribute `stuff` is immutable!
# dc.stuff = 2

# raises a TypeError: Attribute `other_stuff` is immutable!
# dc.other_stuff = 'hello'

print(dc)

# raises a TypeError: __init__() missing 1 required positional argument: 'stuff'
# dc = DC()

Another option is to use a metaclass that automatically applies the @dataclass decorator. This has a few advantages, such as being able to use dataclasses.field(...), for example to set a default value or to pass repr=False.

Note that @dataclass_transform, which was added to typing in Python 3.11, could potentially be a good fit here, so that the metaclass plays more nicely with IDEs and type checkers in general.

In any case, here's a working example of this that I was able to put together:

from dataclasses import dataclass, field, fields


class Frozen:
    __slots__ = ('private_name', )

    def __init__(self, name):
        self.private_name = '_' + name

    def __get__(self, obj, objtype=None):
        value = getattr(obj, self.private_name)
        return value

    def __set__(self, obj, value):
        if hasattr(obj, self.private_name):
            msg = f'Attribute `{self.private_name[1:]}` is immutable!'
            raise TypeError(msg) from None

        setattr(obj, self.private_name, value)


def frozen_field(**kwargs):
    return field(**kwargs, metadata={'frozen': True})


def my_meta(name, bases, cls_dict):
    cls = dataclass(type(name, bases, cls_dict))

    for f in fields(cls):
        # if a dataclass field is supposed to be frozen, then set
        # the value to a descriptor object accordingly.
        if 'frozen' in f.metadata:
            setattr(cls, f.name, Frozen(f.name))

    return cls


class DC(metaclass=my_meta):
    other_stuff: str
    stuff: int = frozen_field(default=2)


# DC.stuff = property(lambda self: self._stuff)

dc = DC(other_stuff='test')
print(dc)

# raises TypeError: Attribute `stuff` is immutable!
# dc.stuff = 41

dc.other_stuff = 'hello'
print(dc)
rv.kvetch
  • This suffers from the same problem as mixing descriptors with dataclasses in general: see what happens with just `DC()`, you get `DC(stuff=<__main__.Frozen object at 0x108916250>)` since the descriptor is used as a "default" value! – juanpa.arrivillaga Oct 19 '22 at 19:29
  • @juanpa.arrivillaga actually this is not the case, since we don't use `field(default=Frozen())`. Having it like `DC()` won't work in the first example at all, as no default value is set. In the second example, omitting the `stuff` argument to the constructor should work as expected to set a default value. – rv.kvetch Oct 19 '22 at 19:36
  • That's weird, it must be special casing descriptors or something, because if you just use `class Foo: pass` then use `stuff: int = Foo()` you'll see how `DC()` works by using the `Foo` object as a default value – juanpa.arrivillaga Oct 19 '22 at 19:43
  • Wow, this is really strange, if I create a descriptor with `__get__` or with `__get__` and `__set__` it also just uses it as a default value, I don't understand why your implementation doesn't, what am I missing? – juanpa.arrivillaga Oct 19 '22 at 19:45
  • Oh, interesting... if I make `__get__` raise an `AttributeError`, this doesn't use it as a default value. This is all very strange and, as far as I am aware, not documented. – juanpa.arrivillaga Oct 19 '22 at 19:48
  • Hmm, I think you are right, dataclasses does sort of special case it, but rather unintentionally. I feel it is this line in dataclasses: `default = getattr(cls, a_name, MISSING)`. It uses `MISSING` as the default in case `getattr` raises an error, as we do in this case, since the attribute `self.private_name` doesn't exist initially. I've confirmed with `getattr(DC, 'stuff')` that this results in an AttributeError by default - but since dataclasses handles this by providing a third argument to `getattr`, it never notices this behavior - and treats the field `stuff` as having no default value. – rv.kvetch Oct 19 '22 at 19:58
  • 1
    Yeah, and I think this is a problem with `property` because `__get__` returns `self` in the case where `obj is None` – juanpa.arrivillaga Oct 19 '22 at 20:05
  • 1
    I guess another approach is to be more explicit in the descriptor implementation. Rather than raise an AttributeError, we can also return `MISSING`, and there would not be any noticeable change. For ex. `def __get__(self, obj, objtype=None): return getattr(obj, self.private_name, MISSING)` – rv.kvetch Oct 19 '22 at 20:06
  • I also updated my answer so the descriptor can support a default value, like `Frozen(42)`. – rv.kvetch Oct 19 '22 at 20:14
  • Does this work now, or does it still have problems? – Gabriel Nov 25 '22 at 09:19
3

The only way I found to do this while keeping the boilerplate reduction that dataclass provides is with a descriptor.

In [236]: from dataclasses import dataclass, field
In [237]: class SetOnce:
     ...:     def __init__(self):
     ...:         self.block_set = False
     ...:     def __set_name__(self, owner, attr):
     ...:         self.owner = owner.__name__
     ...:         self.attr = attr
     ...:     def __get__(self, instance, owner):
     ...:         return getattr(instance, f"_{self.attr}")
     ...:     def __set__(self, instance, value):
     ...:         if not self.block_set:
     ...:             self.block_set = True
     ...:             setattr(instance, f"_{self.attr}", value)
     ...:         else:
     ...:             raise AttributeError(f"{self.owner}.{self.attr} cannot be set.")

In [239]: @dataclass
     ...: class Foo:
     ...:     bar:str = field(default=SetOnce())

In [240]: test = Foo("bar")

In [241]: test.bar
Out[241]: 'bar'

In [242]: test.bar = 1
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-242-9cc7975cd08b> in <module>
----> 1 test.bar = 1

<ipython-input-237-bddce9441c9a> in __set__(self, instance, value)
     12             self.value = value
     13         else:
---> 14             raise AttributeError(f"{self.owner}.{self.attr} cannot be set.")
     15

AttributeError: Foo.bar cannot be set.

In [243]: test
Out[247]: Foo(bar='bar')
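
One caveat with the session above: block_set lives on the descriptor, and the descriptor is shared by every instance of the class, so it looks like only the first Foo ever created can be initialized. A possible variant, sketched under my own assumptions (the name SetOncePerInstance is hypothetical), checks the instance's own attributes instead. It also raises AttributeError on class-level access, which, as discussed in the comments on another answer here, appears to keep @dataclass from using the descriptor object as a default value:

```python
from dataclasses import dataclass


class SetOncePerInstance:  # hypothetical name, not from the original answer
    def __set_name__(self, owner, attr):
        self.owner = owner.__name__
        self.attr = attr
        self.private = f"_{attr}"

    def __get__(self, instance, owner=None):
        if instance is None:
            # Class-level access, e.g. by @dataclass while it collects fields.
            # Raising AttributeError makes getattr(cls, name, MISSING) return
            # MISSING, so the field is treated as having no default.
            raise AttributeError(self.attr)
        return getattr(instance, self.private)

    def __set__(self, instance, value):
        # The "already set" state is looked up on the instance,
        # not stored on the shared descriptor.
        if self.private in vars(instance):
            raise AttributeError(f"{self.owner}.{self.attr} cannot be set.")
        setattr(instance, self.private, value)


@dataclass
class Foo:
    bar: str = SetOncePerInstance()


first = Foo("a")
second = Foo("b")  # a second instance can still be initialized

try:
    first.bar = "c"
except AttributeError as exc:
    print(exc)
```

Note that the value is still stored under the private name _bar, so this protects against accidental assignment only.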
Melendowski
  • Please see my edit, when I wrote this I was still learning descriptors. Well I still am but I realized I need to store the value on a private attribute of the instance, rather than on the descriptor instance (which is attached to the class). – Melendowski Nov 09 '20 at 13:53
  • This causes unexpected behavior when value is not passed in to the constructor, for ex. like `test = Foo()`. It still works but the repr is printed weirdly, like `DC(stuff=<__main__.SetOnce object at 0x1032aefa0>)`. – rv.kvetch Oct 19 '22 at 17:30
  • Yeah, coming back to this almost 2 years later, I see what you are talking about. If you don't pass a value to the constructor then the descriptor is assigned. That is problematic. – Melendowski Oct 19 '22 at 23:47
  • 1
    I believe the fix is rather straightforward in this case. Just like `bar:str = SetOnce()` instead of using `field(...)` should work at least with dataclasses. – rv.kvetch Oct 20 '22 at 04:00
2

You can do it by combining three things:

  • Set frozen to False (the default);
  • Use __post_init__, which is called after the auto-generated __init__ finishes, to mark the point at which the initial values have been assigned and the read-only behavior has to start;
  • Create your own version of __setattr__ to enforce the read-only behavior after the initial assignment.

Example Person class with a read-only ID field and a read-write name field:

from dataclasses import dataclass

@dataclass
class Person(object):
    id : str
    name : str

    def __post_init__(self):
        self._initialized = True

    def __setattr__(self, key, value):
        if "_initialized" not in self.__dict__:
            # we are still inside __init__, assign all values
            super().__setattr__(key, value)
        elif key == 'id':
            # __init__ has finished, enforce read-only attributes
            raise AttributeError('Attribute id is read-only')
        else:
            # set read-write attributes normally
            super().__setattr__(key, value)

p = Person(id="1234", name="John Doe")
p.name = "John Wick"                       # succeeds
p.id = "3456"                               # fails

I haven't implemented __delattr__ in this example, but it could follow the same logic we used on __setattr__.
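
For completeness, here is a rough sketch of what that could look like. It is a condensed version of the Person class above with a __delattr__ added; the exact shape is my assumption, not something from a library:

```python
from dataclasses import dataclass


@dataclass
class Person:
    id: str
    name: str

    def __post_init__(self):
        self._initialized = True

    def __setattr__(self, key, value):
        # After __init__ has finished, 'id' can no longer be assigned.
        if "_initialized" in self.__dict__ and key == "id":
            raise AttributeError("Attribute id is read-only")
        super().__setattr__(key, value)

    def __delattr__(self, key):
        # Same rule as __setattr__: after __init__ has finished,
        # the read-only attribute cannot be deleted either.
        if "_initialized" in self.__dict__ and key == "id":
            raise AttributeError("Attribute id is read-only")
        super().__delattr__(key)


p = Person(id="1234", name="John Doe")

try:
    del p.id
except AttributeError as exc:
    print(exc)
```

Deleting a read-write attribute such as name still goes through to the default implementation.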

Using a decorator so you don't need to write this much code for each class:

from typing import Optional, Iterable, Callable, Union
from dataclasses import dataclass

def readonlyattr(attrs : Optional[Union[str, Iterable[str]]] = None):
    # ensure attrs is a set of strings (None means no read-only attributes)
    if attrs is None:
        attrs = set()
    elif isinstance(attrs, str):
        attrs = {attrs}
    elif not isinstance(attrs, set):
        attrs = set(attrs)

    # return decorator
    def wrap_readonly_attributes(cls: type):
        # update post_init method
        def make_post_init(cls: type, method: Callable):
            def post_init(self, *args, **kwargs):
                self._initialized = True
                if method:
                    method(self, *args, **kwargs)
                else:
                    for base in cls.__bases__:
                        try:
                            getattr(base, "__post_init__")(self, *args, **kwargs)
                        except AttributeError:
                            pass
            return post_init
        setattr(cls, "__post_init__", make_post_init(cls, getattr(cls, "__post_init__", None)))

        # update setattr method
        def make_setattr(cls: type, method: Callable):
            def new_setattr(self, key, value):
                if "_initialized" not in self.__dict__:
                    if method:
                        method(self, key, value)
                    else:
                        object.__setattr__(self, key, value)
                elif key in attrs:
                    raise AttributeError(f'Attribute {key} is read-only')
                else:
                    if method:
                        method(self, key, value)
                    else:
                        object.__setattr__(self, key, value)
            return new_setattr
        setattr(cls, "__setattr__", make_setattr(cls, getattr(cls, "__setattr__", None)))

        return cls

    return wrap_readonly_attributes

@dataclass
@readonlyattr(["id", "passport_no"])
class Person(object):
    id : str
    passport_no : str
    name : str

p = Person(id="1234", passport_no="AB12345", name="John Doe")
print(p)
p.name = "John Wick"                       # succeeds
p.id = "3456"                              # fails
asieira
1

Defining the property with the @property decorator inside the class body makes the @dataclass decorator treat the property object as the field's default value, so the two don't play nicely together. You can instead attach the property after the class has been created:

>>> from dataclasses import dataclass, field
>>> @dataclass
... class DC:
...     _stuff: int = field(repr=False)
...     stuff: int = field(init=False)
...
>>> DC.stuff = property(lambda self: self._stuff) # dataclass decorator can't see this
>>> dc = DC(42)
>>> dc
DC(stuff=42)
>>> dc.stuff
42
>>> dc.stuff = 99
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: can't set attribute
juanpa.arrivillaga
  • Mmmh, that works, but I can then not do `dc = DC(stuff=10)`. I really use a normal class then... – Cleb Apr 15 '20 at 19:58
  • 2
    @Cleb you could implement a setter with a flag that records whether it has been set once and then disallows further setting, but yes, it starts to lose its convenience factor. – juanpa.arrivillaga Apr 15 '20 at 20:01
  • 3
    Using this method gives the ugly `repr` and as @Cleb points out, makes the `init` ugly too. Really unfortunate they don't have a `frozen` in the `dataclasses.field`, it just really doesn't make sense. – Melendowski Nov 07 '20 at 04:24
1
import operator
from dataclasses import dataclass

@dataclass
class Enum:
    name: str = property(operator.attrgetter("_name")) 

    def __init__(self, name):
        self._name = name
shaurun
0

I think that using a combination of InitVar and __post_init__ is the best way to go here.

InitVar allows you to denote that a dataclass field is init-only, and will not actually be a member of the class. These fields are also passed to __post_init__, where you can use them to set any private fields. Then you can use getters and setters to publicly expose the field at your discretion.

Applying this to your example:

from dataclasses import field, dataclass, InitVar

@dataclass
class DC:
    initial_stuff: InitVar[int]

    def __post_init__(self, initial_stuff):
        self._stuff = initial_stuff

    @property
    def stuff(self) -> int:
        return self._stuff

dc = DC(8)
dc.stuff       # returns 8
dc.stuff = 12  # raises AttributeError

Note that actually defining the private field _stuff as a dataclass field is not necessary, though it is also not harmful if you want to do so. If you only expose a getter, then attempting to set a value will fail not only at runtime, but it will also fail type checking with the appropriate error (this is from mypy):

error: Property "stuff" defined in "DC" is read-only  [misc]
Found 1 error in 1 file (checked 1 source file)

The only thing I don't like about this is that the init parameter and the actual field cannot have the same name. Since your example uses a positional argument, it doesn't really matter. But if you wanted a keyword argument instead, and you wanted it to also be named stuff, you could do that by defining an __init__ method yourself, instead of letting the dataclass define it for you. You lose a little bit of the benefit of dataclasses that way, but you would still maintain other features like automatic hash and repr functions, and the like.
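
As a rough sketch of that last variant: the version below passes init=False explicitly and declares _stuff as a regular field so that it still takes part in the generated __eq__ and __repr__. The more_stuff field is only there to show a read-write attribute next to the read-only one; both choices are my assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(init=False)
class DC:
    _stuff: int
    more_stuff: str  # hypothetical second, read-write field

    # Hand-written __init__, so the keyword can be called 'stuff'
    def __init__(self, stuff: int, more_stuff: str = "") -> None:
        self._stuff = stuff
        self.more_stuff = more_stuff

    @property
    def stuff(self) -> int:
        return self._stuff


dc = DC(stuff=8, more_stuff="a")
dc.more_stuff = "b"  # regular field, still settable

try:
    dc.stuff = 12  # property without a setter
except AttributeError as exc:
    print(f"rejected: {exc}")
```

As with the plain class in the question, nothing prevents someone from assigning to dc._stuff directly; the read-only guarantee covers the public name only.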

Willie Conrad