
I often see code like this:

@dataclass
class ClassName:
    list_name: list[int] = field(default_factory=list)

but I don't understand why I need to type field(default_factory=list). Isn't list_name: list[int] already enough?

Could you please explain when, why and how to use field() in a dataclass?

cwallenwein

1 Answer


Doing:

list_name: list[int]

is not the same as:

list_name: list[int] = field(default_factory=list)

The latter supplies a default_factory, so the field becomes optional. Without it, the field has no default and is a required argument:

>>> from dataclasses import dataclass, field
>>> @dataclass
... class ClassName:
...     list_name: list[int]
...
>>> ClassName()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __init__() missing 1 required positional argument: 'list_name'

But if you use the default factory:

>>> @dataclass
... class ClassName:
...     list_name: list[int] = field(default_factory=list)
...
>>> ClassName()
ClassName(list_name=[])
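
The factory is called again for every instance that omits the field, so instances do not share a list. A minimal sketch of that behaviour, assuming Python 3.9+ for the list[int] annotation (the variable names a and b are only for illustration):

```python
from dataclasses import dataclass, field


@dataclass
class ClassName:
    # list() is called once for each instance created without list_name
    list_name: list[int] = field(default_factory=list)


a = ClassName()
b = ClassName()
a.list_name.append(1)

print(a)  # ClassName(list_name=[1])
print(b)  # ClassName(list_name=[])
print(a.list_name is b.list_name)  # False
```

Appending to a.list_name leaves b.list_name untouched because each instance received its own list from the factory.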

Note that if you wrote list_name: list[int] = [], the same list object would be reused for every instance, which is almost certainly not what you want. Indeed, the dataclass decorator explicitly rejects this:

>>> @dataclass
... class ClassName:
...     list_name: list[int] = []
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/Users/juanarrivillaga/opt/miniconda3/envs/py39/lib/python3.9/dataclasses.py", line 1021, in dataclass
    return wrap(cls)
  File "/Users/juanarrivillaga/opt/miniconda3/envs/py39/lib/python3.9/dataclasses.py", line 1013, in wrap
    return _process_class(cls, init, repr, eq, order, unsafe_hash, frozen)
  File "/Users/juanarrivillaga/opt/miniconda3/envs/py39/lib/python3.9/dataclasses.py", line 863, in _process_class
    cls_fields = [_get_field(cls, name, type)
  File "/Users/juanarrivillaga/opt/miniconda3/envs/py39/lib/python3.9/dataclasses.py", line 863, in <listcomp>
    cls_fields = [_get_field(cls, name, type)
  File "/Users/juanarrivillaga/opt/miniconda3/envs/py39/lib/python3.9/dataclasses.py", line 747, in _get_field
    raise ValueError(f'mutable default {type(f.default)} for field '
ValueError: mutable default <class 'list'> for field list_name is not allowed: use default_factory
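
To see the sharing problem that this error guards against, it may help to look at an ordinary class, where a default written in the class body is a single class attribute. This is only an illustrative sketch; the class name Plain and its items attribute are made up for the example:

```python
class Plain:
    # A class attribute: one list object shared by every instance
    items = []


x = Plain()
y = Plain()
x.items.append(1)  # mutates the shared list in place

print(y.items)  # [1]
print(x.items is y.items)  # True
```

A change made through x shows up on y because both names refer to the same list.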

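In general, you use dataclasses.field explicitly when you need to configure an individual field, for example to give it a default_factory or to exclude it from the generated __init__, __repr__, or comparison methods.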
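
As a rough illustration of that kind of per-field tinkering, here is a sketch using a few other field() parameters (repr, compare, init). The User class and its field names are hypothetical and not taken from the question:

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    # Left out of the generated __repr__
    password: str = field(repr=False)
    # Fresh list per instance, and ignored by the generated __eq__
    tags: list[str] = field(default_factory=list, compare=False)
    # Not an __init__ parameter; filled in by __post_init__
    name_length: int = field(init=False)

    def __post_init__(self):
        self.name_length = len(self.name)


u1 = User("ann", "secret")
u2 = User("ann", "secret", tags=["admin"])

print(u1)        # User(name='ann', tags=[], name_length=3)
print(u1 == u2)  # True
```

The two instances compare equal even though their tags differ, because tags was declared with compare=False.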

juanpa.arrivillaga
  • for my knowledge, so in simple terms it allows creating of new instances as default values for fields? – sahasrara62 Apr 08 '21 at 20:08
  • @sahasrara62 yes, that is what the `default_factory` does: it takes a callable, which is called every time that field is not provided, and the result of the call is used as the default value – juanpa.arrivillaga Apr 08 '21 at 20:10