
Note: this is similar to "How to get @property methods in asdict?".

I have a frozen, nested data structure like the following, with a few properties that depend purely on the fields.

import copy
import dataclasses
import json
from dataclasses import dataclass

@dataclass(frozen=True)
class Bar:
    x: int
    y: int

    @property
    def z(self):
        return self.x + self.y

@dataclass(frozen=True)
class Foo:
    a: int
    b: Bar

    @property
    def c(self):
        return self.a + self.b.x - self.b.y

I can serialize the data structure as follows:

class CustomEncoder(json.JSONEncoder):
    def default(self, o):
        if dataclasses.is_dataclass(o):
            return dataclasses.asdict(o)
        return json.JSONEncoder.default(self, o)

foo = Foo(1, Bar(2,3))
print(json.dumps(foo, cls=CustomEncoder))

# Outputs {"a": 1, "b": {"x": 2, "y": 3}}

However, I would also like to serialize the properties (@property). Note that I do not want to turn the properties into fields using __post_init__, as I would like to keep the dataclass frozen. I do not want to use obj.__setattr__ to work around the frozen fields. I also do not want to pre-compute the values of the properties outside the class and pass them in as fields.

The current solution I am using is to explicitly write out how each object is serialized as follows:

class CustomEncoder2(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, Foo):
            return {
                "a": o.a,
                "b": o.b,
                "c": o.c
            }
        elif isinstance(o, Bar):
            return {
                "x": o.x,
                "y": o.y,
                "z": o.z
            }
        return json.JSONEncoder.default(self, o)

foo = Foo(1, Bar(2,3))
print(json.dumps(foo, cls=CustomEncoder2))

# Outputs {"a": 1, "b": {"x": 2, "y": 3, "z": 5}, "c": 0} as desired

For a few levels of nesting, this is manageable but I am hoping for a more general solution. For example, here is a (hacky) solution that monkey-patches the _asdict_inner implementation from the dataclasses library.

def custom_asdict_inner(obj, dict_factory):
    if dataclasses._is_dataclass_instance(obj):
        result = []
        for f in dataclasses.fields(obj):
            value = custom_asdict_inner(getattr(obj, f.name), dict_factory)
            result.append((f.name, value))
        # Inject this one-line change
        result += [(prop, custom_asdict_inner(getattr(obj, prop), dict_factory)) for prop in dir(obj) if not prop.startswith('__')]
        return dict_factory(result)
    elif isinstance(obj, tuple) and hasattr(obj, '_fields'):
        return type(obj)(*[custom_asdict_inner(v, dict_factory) for v in obj])
    elif isinstance(obj, (list, tuple)):
        return type(obj)(custom_asdict_inner(v, dict_factory) for v in obj)
    elif isinstance(obj, dict):
        return type(obj)((custom_asdict_inner(k, dict_factory),
                          custom_asdict_inner(v, dict_factory))
                         for k, v in obj.items())
    else:
        return copy.deepcopy(obj)

dataclasses._asdict_inner = custom_asdict_inner

class CustomEncoder3(json.JSONEncoder):
    def default(self, o):
        if dataclasses.is_dataclass(o):
            return dataclasses.asdict(o)
        return json.JSONEncoder.default(self, o)

foo = Foo(1, Bar(2,3))
print(json.dumps(foo, cls=CustomEncoder3))

# Outputs {"a": 1, "b": {"x": 2, "y": 3, "z": 5}, "c": 0} as desired

Is there a recommended way to achieve what I am trying to do?

martineau
Kent Shikama

3 Answers


Including properties in the output of asdict seems to contradict a convenient dataclass feature, the round trip back to an equal object:

Class(**asdict(obj)) == obj  # only for classes w/o nested dataclass attrs
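
To make the trade-off concrete, here is a minimal sketch (the class names Flat and Nested are made up for illustration). It shows the round trip holding for a flat dataclass, failing for a nested one, and a TypeError once the dict carries an extra key such as a property value:

```python
from dataclasses import asdict, dataclass


@dataclass
class Flat:  # hypothetical example class
    x: int
    y: int


@dataclass
class Nested:  # hypothetical example class
    a: int
    b: Flat


flat = Flat(1, 2)
flat_round_trip = Flat(**asdict(flat)) == flat  # holds for flat classes

nested = Nested(1, Flat(2, 3))
rebuilt = Nested(**asdict(nested))  # rebuilt.b is a plain dict, not a Flat
nested_round_trip = rebuilt == nested

# An extra key (e.g. a serialized property) is rejected by the generated __init__.
try:
    Flat(**{**asdict(flat), "z": 3})
    extra_key_rejected = False
except TypeError:
    extra_key_rejected = True

print(flat_round_trip, nested_round_trip, extra_key_rejected)
```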

If you don't find a relevant PyPI package, you can always add a two-liner like this:

from dataclasses import asdict as std_asdict

def asdict(obj):
    return {**std_asdict(obj),
            **{a: getattr(obj, a) for a in getattr(obj, '__add_to_dict__', [])}}

Then you can specify in a custom but short manner which ones you want in dicts:

@dataclass
class A:
    f: str
    __add_to_dict__ = ['f2']

    @property
    def f2(self):
        return self.f + '2'



@dataclass
class B:
    f: str

print(asdict(A('f')))
print(asdict(B('f')))

Output:

{'f': 'f', 'f2': 'f2'}
{'f': 'f'}
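
Note that std_asdict recurses on its own, so the wrapper above only adds the extra attributes of the top-level object. A recursive variant could look roughly like the sketch below. asdict_with_extras is a hypothetical name, it reuses the __add_to_dict__ convention from above, and unlike dataclasses.asdict it does not deep-copy leaf values.

```python
import dataclasses
from dataclasses import dataclass


def asdict_with_extras(obj):
    """Recursively convert dataclasses, adding names listed in __add_to_dict__."""
    if dataclasses.is_dataclass(obj) and not isinstance(obj, type):
        names = [f.name for f in dataclasses.fields(obj)]
        names += getattr(obj, "__add_to_dict__", [])
        return {name: asdict_with_extras(getattr(obj, name)) for name in names}
    if isinstance(obj, tuple) and hasattr(obj, "_fields"):  # namedtuple
        return type(obj)(*[asdict_with_extras(v) for v in obj])
    if isinstance(obj, (list, tuple)):
        return type(obj)(asdict_with_extras(v) for v in obj)
    if isinstance(obj, dict):
        return {k: asdict_with_extras(v) for k, v in obj.items()}
    return obj  # assumption: leaves are returned as-is, not deep-copied


@dataclass(frozen=True)
class Bar:
    x: int
    y: int
    __add_to_dict__ = ["z"]

    @property
    def z(self):
        return self.x + self.y


@dataclass(frozen=True)
class Foo:
    a: int
    b: Bar
    __add_to_dict__ = ["c"]

    @property
    def c(self):
        return self.a + self.b.x - self.b.y


result = asdict_with_extras(Foo(1, Bar(2, 3)))
print(result)
```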
martineau
Kroshka Kartoshka
  • Unfortunately, the call to `std_asdict` would not output any nested dataclass' `@property`'s. – Kent Shikama Nov 11 '20 at 08:02
  • `Class(**asdict(obj)) == obj` doesn't hold for any nested dataclass even with the default implementation as it doesn't automatically convert the inner dictionaries. – Kent Shikama Nov 11 '20 at 08:13
  • That's true. Do you know any nice package with a `fromdict` function? `dataclass-json`? – Kroshka Kartoshka Nov 11 '20 at 12:32
  • @KroshkaKartoshka I know I'm late, but try checking out marshmallow-dataclass, I'm using it to deserialize dicts into my dataclasses (works with nested classes too) – HitLuca Aug 17 '21 at 20:02

If applicable to your solution, you can declare the fields on a base dataclass and have concrete subclasses implement them as properties. This works with asdict, because asdict reads each declared field with getattr.

from dataclasses import asdict, dataclass, field

@dataclass
class Liquid:
    volume: int
    price: int
    total_cost: int = field(init=False)


class Milk(Liquid):
    volume: int
    price: int

    @property
    def total_cost(self):
        return self.volume * self.price


milk = Milk(10, 3)

print(asdict(milk))
# Outputs {'volume': 10, 'price': 3, 'total_cost': 30}
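
The same pattern appears to carry over to the frozen, nested classes from the question, since asdict recurses into nested dataclass instances and picks up the property defined on each subclass. A sketch, where BarFields and FooFields are hypothetical base-class names introduced here:

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class BarFields:  # hypothetical base class that only declares the fields
    x: int
    y: int
    z: int = field(init=False)


class Bar(BarFields):
    @property
    def z(self):
        return self.x + self.y


@dataclass(frozen=True)
class FooFields:  # hypothetical base class that only declares the fields
    a: int
    b: Bar
    c: int = field(init=False)


class Foo(FooFields):
    @property
    def c(self):
        return self.a + self.b.x - self.b.y


foo = Foo(1, Bar(2, 3))
result = asdict(foo)
print(json.dumps(result))
```

One caveat: the base classes are not usable with asdict on their own, because their init=False field is never assigned and reading it raises AttributeError.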
Paulo Freitas
foxyblue

There's no "recommended" way to include them that I know of.

Here's something that seems to work and, I think, meets your numerous requirements. It defines a custom encoder that calls its own _asdict() method when the object is a dataclass, instead of monkey-patching the (private) dataclasses._asdict_inner() function, and it keeps that code encapsulated within the custom encoder that uses it.

Like you, I used the current implementation of dataclasses.asdict() as a guide/template, since what you're asking for is basically just a customized version of it. The current value of each property is obtained by calling the property object's __get__ method.
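
As a small standalone illustration of that lookup (Base and Point are hypothetical classes, not part of the answer's code): vars(type(obj)) lists only the attributes defined directly on the class, so properties inherited from a base class are skipped. If those are wanted too, inspect.getmembers is one option, sketched here as an assumption about what you might need:

```python
import inspect


class Base:  # hypothetical example class
    @property
    def kind(self):
        return type(self).__name__


class Point(Base):  # hypothetical example class
    def __init__(self, x, y):
        self.x = x
        self.y = y

    @property
    def norm1(self):
        return abs(self.x) + abs(self.y)


p = Point(3, -4)

# Properties defined directly on Point; each value comes from property.__get__.
own = {
    name: attr.__get__(p)
    for name, attr in vars(type(p)).items()
    if isinstance(attr, property)
}

# Properties found anywhere in the class hierarchy, sorted by name.
all_props = {
    name: attr.__get__(p)
    for name, attr in inspect.getmembers(type(p), lambda a: isinstance(a, property))
}

print(own)        # {'norm1': 7}
print(all_props)  # {'kind': 'Point', 'norm1': 7}
```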

import copy
import dataclasses
from dataclasses import dataclass, field
import json
import re
from typing import List

class MyCustomEncoder(json.JSONEncoder):
    is_special = re.compile(r'^__[^\d\W]\w*__\Z', re.UNICODE)  # Dunder name.

    def default(self, obj):
        return self._asdict(obj)

    def _asdict(self, obj, *, dict_factory=dict):
        if not dataclasses.is_dataclass(obj):
            raise TypeError("_asdict() should only be called on dataclass instances")
        return self._asdict_inner(obj, dict_factory)

    def _asdict_inner(self, obj, dict_factory):
        if dataclasses.is_dataclass(obj):
            result = []
            # Get values of its fields (recursively).
            for f in dataclasses.fields(obj):
                value = self._asdict_inner(getattr(obj, f.name), dict_factory)
                result.append((f.name, value))
            # Add values of non-special attributes which are properties.
            is_special = self.is_special.match  # Local var to speed access.
            for name, attr in vars(type(obj)).items():
                if not is_special(name) and isinstance(attr, property):
                    result.append((name, attr.__get__(obj)))  # Get property's value.
            return dict_factory(result)
        elif isinstance(obj, tuple) and hasattr(obj, '_fields'):
            return type(obj)(*[self._asdict_inner(v, dict_factory) for v in obj])
        elif isinstance(obj, (list, tuple)):
            return type(obj)(self._asdict_inner(v, dict_factory) for v in obj)
        elif isinstance(obj, dict):
            return type(obj)((self._asdict_inner(k, dict_factory),
                              self._asdict_inner(v, dict_factory)) for k, v in obj.items())
        else:
            return copy.deepcopy(obj)


if __name__ == '__main__':

    @dataclass(frozen=True)
    class Bar():
        x: int
        y: int

        @property
        def z(self):
            return self.x + self.y


    @dataclass(frozen=True)
    class Foo():
        a: int
        b: Bar

        @property
        def c(self):
            return self.a + self.b.x - self.b.y

        # Added for testing.
        d: List = field(default_factory=lambda: [42])  # Field with default value.


    foo = Foo(1, Bar(2,3))
    print(json.dumps(foo, cls=MyCustomEncoder))

Output:

{"a": 1, "b": {"x": 2, "y": 3, "z": 5}, "d": [42], "c": 0}
martineau
  • Thanks for the input. I'm starting to become convinced that this custom method would involve mostly copying `_asdict_inner`. I'm a bit disappointed that there isn't default support for this kind of use. – Kent Shikama Nov 12 '20 at 22:51
  • I wouldn't have even expected there to be a standard way to handle what you want to do, which is fairly unusual IMO. One could argue that the "value" of a `property` is its code, for example. Anyway, I think you should accept my answer even though the gist of it is "there isn't a recommended way" because besides that it also contains a viable workaround that doesn't involve monkey-patching the `dataclasses` module. – martineau Nov 12 '20 at 23:27
  • Are you using `return str(obj)` as a stub because handling `list`, `tuple`, `dict`, deepcopy, etc would make the answer too long? – Kent Shikama Nov 12 '20 at 23:41
  • I'm surprised you think what I am trying to do is fairly unusual. Do people not create `dataclass`es with `@property`'s? Or is it the fact that properties usually aren't serialized with the underlying data? – Kent Shikama Nov 12 '20 at 23:44
  • The `return str(obj)` is an oversight leftover from earlier experiments with the code. One issue with them is that they might contain nested dataclasses with the issues and limitations explained in the source code. I don't think that properties are used in conjunction with dataclasses that often — and when it is done, that there would be any expectation of them being represented in the returned dictionary as anything other than what they actually were (if they weren't ignored). – martineau Nov 13 '20 at 00:00
  • I added the code to handle the omitted types, and I don't think it made the code too long. You might find the blog post [Reconciling Dataclasses And Properties In Python](https://florimond.dev/blog/articles/2018/10/reconciling-dataclasses-and-properties-in-python/) interesting, especially near the end of the **Attempt 5** subsection, which says "...because **dataclasses were designed to be editable data containers**. If you really need read-only fields, you shouldn't be resorting to dataclasses in the first place." – martineau Nov 14 '20 at 22:20
  • I think you also might find the YouTube video of Raymond Hettinger's PyCon 2018 talk [Dataclasses: The code generator to end all code generators](https://youtu.be/T-TwcmT6Rcw) generally worth watching, i.e. not specifically with respect to your question about `asdict` and properties. – martineau Nov 15 '20 at 19:32