Change class into dataclass

Question

I want to refactor a class into a dataclass, here is what I have:

class Tenant:
   def __init__(self, tenant: dict, api_client: str) -> None:
      self.name = tenant["name"]
      self.tenant_id = tenant["id"]
      self.subdomain = tenant["subdomain"]
      self.api_client = api_client

What would the equivalent dataclass look like? I tried something like this, but I don't know how to split the dict into name, tenant_id and subdomain:

@dataclass
class Tenant:
   tenant: dict
   api_client: str

IMO this should be a factory method named something like `from_dict`. The "normal" constructor should take `name`/`id`/`subdomain`/`api_client` as individual args. — 0x5453, Sep 08 '22 at 12:37
Define "the same". Your original class holds 4 attributes but accepts only two arguments, and constructs itself from those arguments. So the question is where to put this destructuring/construction logic. It hardly makes sense to override the dataclass's `__init__`, since generating it is most of the benefit a `dataclass` provides. So, again, in what way should they be "the same", and where is it okay for them to differ? — deceze, Sep 08 '22 at 12:37
Redesign your class - don't accept dict and convert it to instance attributes unless you have a reason to! If you really need that, be very explicit about that and use the helper function suggested below. — Nishant, Sep 08 '22 at 12:38
"I want to refactor a class into a dataclass" - don't bother. What dataclasses give you is not what you want. — user2357112, Sep 09 '22 at 20:13

chepner · Answer 1 · 2022-09-08T13:33:22.320

I would add a class method to the data class to extract the necessary values from a dict.

from dataclasses import dataclass


@dataclass
class Tenant:
    name: str
    tenant_id: int
    subdomain: str
    api_client: str

    @classmethod
    def from_dict(cls, tenant: dict, api_client: str):
        return cls(tenant["name"],
                   tenant["id"],
                   tenant["subdomain"],
                   api_client)


t1 = Tenant("alice", 5, "bar", "client")
t2 = Tenant.from_dict({"name": "bob", "id": 6, "subdomain": "foo"},
                      "client")

I would take the same approach even if Tenant were not a dataclass. An instance of Tenant is only interested in the values to assign to its attributes, not how those values are packaged prior to the instance being created.

If you must preserve the existing API for Tenant, you'll need to use an InitVar and the __post_init__ method.

from dataclasses import dataclass, InitVar, field


@dataclass
class Tenant:
    tenant: InitVar[dict]
    name: str = field(init=False)
    tenant_id: int = field(init=False)
    subdomain: str = field(init=False)
    api_client: str

    # Approximate __init__ method generated
    # def __init__(self, tenant, api_client):
    #     self.api_client = api_client
    #     self.__post_init__(tenant)

    def __post_init__(self, tenant):
        self.name = tenant["name"]
        self.tenant_id = tenant["id"]
        self.subdomain = tenant["subdomain"]

t = Tenant({"name": "bob", "id": 6, "subdomain": "foo"},
           "client")

tenant, as an InitVar, is passed to __init__ and __post_init__, but will not be used as an attribute for the other autogenerated methods. name, tenant_id, and subdomain will not be accepted as arguments to __init__, but will be used by the other autogenerated methods. You, however, are responsible for ensuring they are set correctly in __post_init__.
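
The behaviour described above can be checked with a short sketch. It repeats the InitVar version of Tenant from this answer; the sample values are made up for illustration.

```python
from dataclasses import dataclass, InitVar, field, fields


@dataclass
class Tenant:
    tenant: InitVar[dict]
    name: str = field(init=False)
    tenant_id: int = field(init=False)
    subdomain: str = field(init=False)
    api_client: str

    def __post_init__(self, tenant):
        self.name = tenant["name"]
        self.tenant_id = tenant["id"]
        self.subdomain = tenant["subdomain"]


t = Tenant({"name": "bob", "id": 6, "subdomain": "foo"}, "client")

# fields() reports only the real fields; the InitVar is not among them.
field_names = [f.name for f in fields(t)]
print(field_names)  # ['name', 'tenant_id', 'subdomain', 'api_client']

# The generated __repr__ uses the same four fields.
print(t)  # Tenant(name='bob', tenant_id=6, subdomain='foo', api_client='client')

# The dict itself is not kept on the instance.
print(hasattr(t, "tenant"))  # False
```

Note that if __post_init__ forgot to set one of the init=False fields, the instance would still be created, and the missing attribute would only surface later (for example when printing the instance).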

A possible hybrid approach is to define a "private" class and make the name Tenant refer to its class method.

import dataclasses


def _from_dict(cls, tenant, api_client):
    return cls(tenant["name"],
               tenant["id"],
               tenant["subdomain"],
               api_client)

# Using make_dataclass just to make the class name
# 'Tenant' instead of '_Tenant'. You can use an
# ordinary class statement and patch _Tenant.__name__
# instead.
_Tenant = dataclasses.make_dataclass(
      'Tenant',
      [('name', str),
       ('tenant_id', int),
       ('subdomain', str),
       ('api_client', str)],
      namespace={'from_dict': classmethod(_from_dict)}
     )

Tenant = _Tenant.from_dict

But I need Tenant to accept only 2 arguments, and here it requires 4 — ganjin, Sep 08 '22 at 12:54

score 0 · Answer 2 · answered Sep 08 '22 at 13:02

As @chepner suggested, implementing a .from_dict method is definitely Pythonic and readable, so I'm going to use it in this code too.

Since the OP wants the class to take exactly 2 arguments, I would suggest using a collections.namedtuple.

collections.namedtuple(typename, field_names, *, rename=False, defaults=None, module=None)

Returns a new tuple subclass named typename. The new subclass is used to create tuple-like objects that have fields accessible by attribute lookup as well as being indexable and iterable.

WARNING: I haven't tested the code yet

from collections import namedtuple
from dataclasses import dataclass

TenantnoAPI = namedtuple("Tenant_noAPI", "name tenant_id subdomain")

@dataclass
class Tenant:
    tenantnoapi: TenantnoAPI
    api_client: str

    @classmethod
    def from_dict(cls, tenant: dict, api_client: str):
        return cls(TenantnoAPI(tenant["name"],
                               tenant["id"],
                               tenant["subdomain"]),
                   api_client)
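
A usage sketch for the namedtuple variant, with made-up sample values. It repeats the definitions from this answer so it runs on its own. One thing it makes visible: the tenant fields are reached through t.tenantnoapi rather than directly on t, which differs from the attribute layout of the original class.

```python
from collections import namedtuple
from dataclasses import dataclass

TenantnoAPI = namedtuple("Tenant_noAPI", "name tenant_id subdomain")


@dataclass
class Tenant:
    tenantnoapi: TenantnoAPI
    api_client: str

    @classmethod
    def from_dict(cls, tenant: dict, api_client: str):
        return cls(
            TenantnoAPI(tenant["name"], tenant["id"], tenant["subdomain"]),
            api_client,
        )


# The direct constructor takes exactly two arguments...
direct = Tenant(TenantnoAPI("alice", 5, "bar"), "client")

# ...and from_dict builds the namedtuple from the dict.
t = Tenant.from_dict({"name": "bob", "id": 6, "subdomain": "foo"}, "client")

print(t.tenantnoapi.name)       # bob
print(t.tenantnoapi.tenant_id)  # 6
print(t.api_client)             # client
print(hasattr(t, "name"))       # False: name lives on the nested namedtuple
```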

rv.kvetch · Answer 3 · 2022-09-09T22:58:28.623

I wholly concur with @chepner about using something along the lines of a from_dict constructor class method; at least to me, that aligns with Python best practices in general.

Another option: if (for whatever reason) you don't want to model your data with dataclasses, you could instead use a dot-access dict, such as the one provided by my helper library dotwiz. It can be installed with pip install dotwiz, and is a relatively lightweight option to consider.

As a specific example:

from dotwiz import DotWiz


class Tenant(DotWiz):

    @classmethod
    def from_dict(cls, tenant: dict, api_client: str) -> 'Tenant':
        return cls(tenant_id=tenant.pop('id'), **tenant, api_client=api_client)


t = Tenant.from_dict({"name": "bob", "id": 6, "subdomain": "foo"}, "client")
print(t)  # ✫(tenant_id=6, name='bob', subdomain='foo', api_client='client')

assert t.name == 'bob'
assert t.tenant_id == 6

In case you want type checking and field auto-completion:

from typing import TYPE_CHECKING
from dotwiz import DotWiz


if TYPE_CHECKING:

    class Tenant(DotWiz):
        name: str
        id: int
        subdomain: str
        api_client: str


# noinspection PyTypeChecker
def create_tenant(tenant: dict, api_client: str) -> 'Tenant':
    tenant['api_client'] = api_client
    return DotWiz(tenant)


t = create_tenant({"name": "bob", "id": 6, "subdomain": "foo"}, "client")

Full disclosure: I am the creator and maintainer of this library.
