48

I have a function that accepts an instance of any dataclass. What would be an appropriate type hint for it?

I haven't found anything official in the Python documentation.

This is what I have been doing, but I don't think it's correct:

from typing import Any, NewType

DataClass = NewType('DataClass', Any)
def foo(obj: DataClass):
    ...

Another idea is to use a Protocol with the class attributes __dataclass_fields__ and __dataclass_params__.

moshevi
  • Uh, what? There's no observable difference between a class with an `@dataclass` decorator and a class without. Dataclasses don't implement any special methods and don't have any special attributes. Distinguishing between a "dataclass" and a "regular" class makes no sense whatsoever. – Aran-Fey Feb 13 '19 at 10:39
  • 5
    The function unpacks a dataclass to a dictionary, and dataclasses do have special attributes, `__dataclass_fields__` and `__dataclass_params__`, as stated in the question. The same thing can be said about namedtuples, and they do have a type hint even though they are simply inheriting from `tuple` – moshevi Feb 13 '19 at 10:40
  • Those attributes are undocumented and thus I would advise against relying on their existence. I was wrong about there being no observable difference though; functions like [`dataclasses.astuple`](https://docs.python.org/3/library/dataclasses.html#dataclasses.astuple) only work with dataclasses. – Aran-Fey Feb 13 '19 at 10:46
  • so a `Protocol` with a `astuple` method ? sounds good, but a bit precarious. not sure why they decided to create `dataclass`es with a decorator and not via inheritance and meta classes like `namedtuple`s. – moshevi Feb 13 '19 at 11:35
  • `astuple` is not a method, so that's not gonna work. I don't think this can be done with `typing`, since dataclasses technically aren't a type. They don't expose a base class or a specific public interface. In other words, receiving a non-dataclass instead of a dataclass is closer to a ValueError than a TypeError. – Aran-Fey Feb 13 '19 at 11:42
  • 2
    right you are, I read the source code, and python actually implements a function `_is_dataclass_instance`. It checks if it has the attribute `__dataclass_fields__`, I think this is as good as it gets. – moshevi Feb 13 '19 at 11:45

3 Answers

43

Despite its name, dataclasses.dataclass doesn't expose a class interface. It just allows you to declare a custom class in a convenient way that makes it obvious that it is going to be used as a data container. So, in theory, there is little opportunity to write something that only works on dataclasses, because dataclasses really are just ordinary classes.

In practice, there are a couple of reasons why you would want to declare dataclass-only functions anyway, and something like this is how you should go about it:

from dataclasses import dataclass
from typing import ClassVar, Dict, Protocol


class IsDataclass(Protocol):
    # as already noted in comments, checking for this attribute is currently
    # the most reliable way to ascertain that something is a dataclass
    __dataclass_fields__: ClassVar[Dict] 

def dataclass_only(x: IsDataclass):
    ...  # do something that only makes sense with a dataclass

@dataclass
class Foo:
    pass

class Bar:
    pass

dataclass_only(Foo())  # a static type check should show that this line is fine ..
dataclass_only(Bar())  # .. and this one is not

This approach is also what you alluded to in your question. If you want to go for it, keep in mind that you'll need a third-party tool such as mypy to do the static type checking for you, and if you are on Python 3.7 or earlier, you need to manually install typing_extensions, since Protocol only became part of the standard library in 3.8.
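As a rough sketch of the version-dependent import mentioned above (assuming typing_extensions has been installed with pip on interpreters older than 3.8), the protocol could be declared like this:

```python
import sys
from typing import ClassVar, Dict

# Protocol joined the standard library in Python 3.8; on older
# interpreters it comes from the third-party typing_extensions package.
if sys.version_info >= (3, 8):
    from typing import Protocol
else:
    from typing_extensions import Protocol


class IsDataclass(Protocol):
    __dataclass_fields__: ClassVar[Dict]
```

Branching on sys.version_info rather than catching ImportError is a form that static type checkers can follow.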

Also note that older versions of mypy (below 0.990, for example 0.982) mistakenly expect __dataclass_fields__ to be an instance attribute, so for those versions the protocol should declare just __dataclass_fields__: Dict. [1]
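For mypy releases that predate the ClassVar behavior (the comments under this answer put the cutoff at 0.990), the instance-attribute variant might look like the sketch below. The name IsDataclassLegacy is made up for illustration and is not part of any library.

```python
from typing import Dict, Protocol


class IsDataclassLegacy(Protocol):
    # Declared as a plain (instance) attribute, without ClassVar,
    # which is what the older mypy releases expected.
    __dataclass_fields__: Dict


def dataclass_only(x: IsDataclassLegacy):
    ...  # do something that only makes sense with a dataclass
```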


When I first wrote it, this post also featured The Old Way of Doing Things, back when we had to make do without type checkers. I'm leaving it up, but it's not recommended to handle this kind of feature with runtime-only failures any more:

from dataclasses import is_dataclass

def dataclass_only(x):
    """Do something that only makes sense with a dataclass.
    
    Raises:
        ValueError if something that is not a dataclass is passed.
        
    ... more documentation ...
    """
    if not is_dataclass(x):
        raise ValueError(f"'{x.__class__.__name__}' is not a dataclass!")
    ...
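Note that is_dataclass also returns True when it is handed the dataclass itself rather than an instance. If only instances should be accepted, for example because the function unpacks the object into a dictionary as mentioned in the comments on the question, a stricter check could look like this sketch. The names unpack and Point are hypothetical.

```python
from dataclasses import asdict, dataclass, is_dataclass


def unpack(x):
    """Hypothetical helper: turn a dataclass instance into a dict."""
    # is_dataclass() is also True for the dataclass *class* itself,
    # so classes are excluded explicitly to accept instances only.
    if not is_dataclass(x) or isinstance(x, type):
        raise ValueError(f"{x!r} is not a dataclass instance!")
    return asdict(x)


@dataclass
class Point:
    x: int
    y: int


print(unpack(Point(1, 2)))  # {'x': 1, 'y': 2}
```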

[1] Kudos to @Kound for updating and testing the ClassVar behavior.

Arne
  • would it be possible to monkeypatch inheriting from a dummy type onto the `@dataclass` annotator, hidden behind `if TYPE_CHECKING` so that it only impacts type checking? (i.e. `IAmADataclass = type('IAmADataclass', (), {}` and when you use `@dataclass Class Foo` it effectively replaces it with `@dataclass Class Foo(IAmADataclass)` – user3534080 Oct 28 '19 at 07:19
  • @user3534080 That's a complicated question you managed to fit into your comment. It deserves an answer of its own, but the short answer is that for practical purposes what you want is not very useful. Patching inheritance at runtime means that static analysis tools like mypy or the ones used by IDEs to give suggestions will not pick them up correctly. – Arne Oct 29 '19 at 09:41
  • Looks like the linked github issue has been resolved and this answer might need a quick update – Gricey Oct 19 '21 at 01:44
  • 3
    @Gricey Thanks for the heads-up. The linked fix is planned to be released with version 0.920 of mypy, which isn't out as of yet. I'll keep an eye on it and update my post once that is the case, though. – Arne Oct 19 '21 at 12:13
  • @Arne Looks like mypy 0.920 is out as of now. – Cnoor0171 Feb 14 '22 at 21:05
  • Just submitted another edit: MyPy (0.991) needs the `__dataclass_fields__` within the Protocol to be a `ClassVar`, which it actually is. Adding this comment for other people who might also wonder why their mypy is now failing. – Kound Jan 30 '23 at 15:01
  • Ok just checked, this works only if Mypy is >=0.990, versions below don't accept this. See also Example on [MyPy Playground](https://mypy-play.net/?mypy=0.990&python=3.10&gist=e0a039b290f8bac0656f8342dc3177b6) – Kound Jan 30 '23 at 15:19
5

There is a helper function called is_dataclass that can be used; it's exported from dataclasses.

Basically what it does is this:

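# name of the attribute that @dataclass sets on the decorated class
_FIELDS = "__dataclass_fields__"

def is_dataclass(obj):
    """Returns True if obj is a dataclass or an instance of a
    dataclass."""
    cls = obj if isinstance(obj, type) else type(obj)
    return hasattr(cls, _FIELDS)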

It takes the object itself if it is already a class (an instance of type), and otherwise the type of the instance.

It then checks whether that class has the attribute whose name is stored in _FIELDS, namely __dataclass_fields__. This is basically equivalent to what the other answers here do.
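A small sketch to check that equivalence, using two throwaway classes Foo and Bar that are only here for illustration:

```python
from dataclasses import dataclass, is_dataclass


@dataclass
class Foo:
    a: int = 0


class Bar:
    pass


# For classes and instances alike, is_dataclass() agrees with a plain
# hasattr() check for __dataclass_fields__ on the class.
for obj in (Foo, Foo(), Bar, Bar()):
    cls = obj if isinstance(obj, type) else type(obj)
    assert is_dataclass(obj) == hasattr(cls, "__dataclass_fields__")
```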

To "type" a dataclass, I would do something like this:


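from typing import Any, ClassVar, Dict, Protocol

class DataclassProtocol(Protocol):
    # both attributes are set on the class by @dataclass
    __dataclass_fields__: ClassVar[Dict[str, Any]]
    # at runtime this is an internal parameters object, not a dict
    __dataclass_params__: ClassVar[Any]
    # __post_init__ is left out: it only exists if the class defines it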
2

You can indeed use a Protocol, but I suggest decorating that Protocol with both @runtime_checkable and @dataclass:

import dataclasses
from typing import Protocol, runtime_checkable

@runtime_checkable
@dataclasses.dataclass
class DataclassProtocol(Protocol):
    pass

The above results in:

  • type-hinting with DataclassProtocol is possible and makes sense to type checkers (mypy 0.982, PyCharm 2022.2.3 CE)
  • isinstance(obj, DataclassProtocol) is equivalent to dataclasses.is_dataclass(obj)
  • because dataclasses.is_dataclass(DataclassProtocol) is true, type checkers' special handling of dataclasses works
  • DataclassProtocol does not require use of internal dataclass fields

The first is also accomplished by the previously given Protocols. The second results from decorating with @runtime_checkable. The latter two points rely on decorating with @dataclass.

While this answers the question, personally I'd want to subclass DataclassProtocol into a DataclassInstanceProtocol that additionally requires not isinstance(obj, type). So far, I haven't found a way to do that.

Olaf
  • This solution only seems to work with an instance of a dataclass. For the class itself, mypy complains: `Argument 1 to "fun" has incompatible type "Type[DC]"; expected "DataclassProtocol" [arg-type]` (tested by passing a dataclass `DC` to `fun(dc: DataclassProtocol)`). – Noa Nov 09 '22 at 13:39
  • Type stub docs seem to suggest using @dataclass: https://typing.readthedocs.io/en/latest/source/stubs.html#decorators. – wrwrwr Mar 24 '23 at 20:26