I want to create a Python dataclass to hold program settings (e.g., file paths) read from a YAML configuration file.

The issue is that my dataclass (Config) declares a ROOT_PATH field with a type of Path. However, when I print the field type, <class 'str'> is displayed. How can I get my Python dataclass to understand that the setting in my YAML configuration file is a Path object rather than just a plain string?

# config.yml

---
ROOT_PATH: Path("/root/path")

# constants.py

from dataclasses import dataclass
from pathlib import Path

import yaml


@dataclass
class Config:
    ROOT_PATH: Path

if __name__ == "__main__":
    with open("./config.yml") as file:
        yml = yaml.safe_load(file.read())
        config = Config(**yml)
        print(f"Field type: {type(config.ROOT_PATH)}")


Output:
  Field type: <class 'str'>
user2514157
  • _"I get my python dataclass to understand that the setting in my yaml configuration file is a Path object rather than just a plain string?"_ You can't because it's not. It _is_ a string. If you want to create a `Path` object from that string, you'll need to perform that conversion yourself. – Brian61354270 Mar 01 '23 at 00:09
  • It is a string. If you want it to be a `Path`, you must do that conversion yourself. Python type hints do not cause conversions between types. – Carcigenicate Mar 01 '23 at 00:09
  • I realize this is a dumb question, but can you show me how/where to perform this conversion? Right now my dataclass fields populate magically from the unpacked yml dictionary. It is not clear to me whether I should do this conversion when the field is declared, in an `__init__` method, in post-init, or with a decorator, and how I should actually perform the conversion from the config dictionary. – user2514157 Mar 01 '23 at 00:41
  • You can just alter `yml` before passing it to `Config`. Something like `yml['ROOT_PATH'] = Path(yml['ROOT_PATH'])`. Or ya, do it in the `dataclass` `__post_init__`. If you expect to pass strings typically, doing the conversion in the class will be cleaner. – Carcigenicate Mar 01 '23 at 01:20
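
The first suggestion in the comments above (converting the value before it reaches the dataclass) might look roughly like the sketch below. It also shows that the annotation alone changes nothing. The `yml` dict is a hypothetical stand-in for what `yaml.safe_load` would return, assuming the YAML value is written as a plain string such as `/root/path`.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Config:
    ROOT_PATH: Path


# Hypothetical stand-in for the dict returned by yaml.safe_load.
yml = {"ROOT_PATH": "/root/path"}

# Annotations are not enforced at runtime: the string is stored unchanged.
unconverted = Config(**yml)

# Converting the value before unpacking gives the field a real Path.
yml["ROOT_PATH"] = Path(yml["ROOT_PATH"])
converted = Config(**yml)

print(isinstance(unconverted.ROOT_PATH, str))
print(isinstance(converted.ROOT_PATH, Path))
```

This keeps the dataclass untouched, at the cost of repeating the conversion wherever a `Config` is built from raw strings.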

1 Answer

The fundamental issue is the need to convert the YAML data from a string to a Path. Converters appear to be supported in packages such as attrs and pydantic, but not in dataclasses:

"Bottom line: converters are a convenience without a valid workaround, and their absence will be frustrating to users. ... Very many classes in the real world perform some conversion of arguments within their init methods, and unlike validators I don't see a good alternative for those who don't want to perform conversions all over their code instead of in one place." (https://github.com/ericvsmith/dataclasses/issues/60)

However, I was able to use pydantic.validate_arguments to coerce the YAML string into a Path.

# constants.py

from dataclasses import dataclass
from pathlib import Path

import yaml
from pydantic import validate_arguments


@validate_arguments
@dataclass
class Config:
    ROOT_PATH: Path

if __name__ == "__main__":
    with open("./config.yml") as file:
        yml = yaml.safe_load(file.read())
        config = Config(**yml)
        print(f"Field type: {type(config.ROOT_PATH)}")


Output:
  Field type: <class 'pathlib.PosixPath'>

It is also possible to perform the conversion in __post_init__ (see "Force type conversion in python dataclass __init__ method"):

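# Requires `import dataclasses` at the top of the module.
def __post_init__(self):
    # Coerce each field to its annotated type (assumes the annotation is a plain class such as Path).
    for field in dataclasses.fields(self):
        value = getattr(self, field.name)
        if not isinstance(value, field.type):
            # To reject the value instead of converting it, raise here:
            # raise ValueError(f'Expected {field.name} to be {field.type}, '
            #                  f'got {repr(value)}')
            setattr(self, field.name, field.type(value))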
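
For reference, a self-contained sketch of the `__post_init__` approach placed inside the dataclass could look like the following. The `yml` dict is a hypothetical stand-in for the `yaml.safe_load` result (assuming the YAML value is a plain string such as `/root/path`), and the extra `isinstance(field.type, type)` guard is an added assumption: it skips annotations that are not plain classes (for example string annotations or `Optional[...]`), which cannot be called as converters.

```python
import dataclasses
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Config:
    ROOT_PATH: Path

    def __post_init__(self):
        for field in dataclasses.fields(self):
            value = getattr(self, field.name)
            # Only plain classes (Path, int, ...) work with isinstance
            # and can be called to convert the value.
            if isinstance(field.type, type) and not isinstance(value, field.type):
                setattr(self, field.name, field.type(value))


# Hypothetical stand-in for the dict returned by yaml.safe_load.
yml = {"ROOT_PATH": "/root/path"}
config = Config(**yml)
print(isinstance(config.ROOT_PATH, Path))
```

Because the conversion lives in the class, every caller can pass either a string or a `Path` and still end up with a `Path` on the instance.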
user2514157