
Note: since this answer keeps getting upvoted - while there are still use cases for TypedDict, I'd consider using a dataclass instead today.


I want to have a nice (`mypy --strict` and pythonic) way to turn an untyped `dict` (from `json.loads()`) into a `TypedDict`. My current approach looks like this:
from typing import Any, Mapping, TypedDict


class BackupData(TypedDict, total=False):
    archive_name: str
    archive_size: int
    transfer_size: int
    transfer_time: float
    error: str


def to_backup_data(data: Mapping[str, Any]) -> BackupData:
    result = BackupData()
    if 'archive_name' in data:
        result['archive_name'] = str(data['archive_name'])
    if 'archive_size' in data:
        result['archive_size'] = int(data['archive_size'])
    if 'transfer_size' in data:
        result['transfer_size'] = int(data['transfer_size'])
    if 'transfer_time' in data:
        result['transfer_time'] = float(data['transfer_time'])
    if 'error' in data:
        result['error'] = str(data['error'])
    return result

I.e., I have a TypedDict with optional keys and want a TypedDict instance.

The code above is redundant and non-functional (in the functional-programming sense): I have to write each name four times and each type twice, and `result` has to be mutable. Sadly, a TypedDict can't have methods; otherwise I could write something like

backup_data = BackupData.from(json.loads({...}))

Is there something I'm missing regarding TypedDict? Can this be written in a nice, non-redundant way?

frans

1 Answer


When you define a TypedDict, the key names and their types are stored in the class's `__annotations__` attribute.

For your example:

BackupData.__annotations__

returns:

{'archive_name': <class 'str'>, 'archive_size': <class 'int'>, 'transfer_size': <class 'int'>, 'transfer_time': <class 'float'>, 'error': <class 'str'>}
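As a side note, a TypedDict class also records which keys are required and which are optional. Here is a small sketch, assuming Python 3.9+ (where `__required_keys__` and `__optional_keys__` are available). It uses `typing.get_type_hints`, which also resolves string annotations, instead of reading `__annotations__` directly:

```python
from typing import TypedDict, get_type_hints


class BackupData(TypedDict, total=False):
    archive_name: str
    archive_size: int
    transfer_size: int
    transfer_time: float
    error: str


# Maps every key to its declared type, resolving string annotations if any.
hints = get_type_hints(BackupData)

# With total=False every key is optional, so the required set is empty.
required = BackupData.__required_keys__
optional = BackupData.__optional_keys__

print(sorted(optional))
print(hints["transfer_time"])
```

This may be useful if you want to decide per key whether a missing value is an error.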

Now we can use that dictionary to iterate over the data and use the values for type casting:

def to_backup_data(data: Mapping[str, Any]) -> BackupData:
    result = BackupData()
    for key, key_type in BackupData.__annotations__.items():
        if key not in data:
            raise ValueError(f"Key: {key} is not available in data.")
        result[key] = key_type(data[key])
    return result

Note that I raise an error when a key is missing from the data; this can be changed at your discretion.
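Since the `BackupData` in the question is declared with `total=False`, one possible change is to skip missing keys instead of raising. The following is only a sketch under that assumption; `to_backup_data_lenient` is a name made up for illustration. It builds a plain dict first and casts at the end, because as far as I know mypy complains about non-literal keys when assigning into a TypedDict:

```python
from typing import Any, Mapping, TypedDict, cast, get_type_hints


class BackupData(TypedDict, total=False):
    archive_name: str
    archive_size: int
    transfer_size: int
    transfer_time: float
    error: str


def to_backup_data_lenient(data: Mapping[str, Any]) -> BackupData:
    # Hypothetical helper: converts the keys that are present, skips the rest.
    hints = get_type_hints(BackupData)
    # Keys in `data` that BackupData does not declare are silently ignored.
    converted = {key: hints[key](data[key]) for key in hints if key in data}
    # The cast only informs the type checker; nothing is validated at runtime.
    return cast(BackupData, converted)


partial = to_backup_data_lenient({"archive_name": "my archive", "archive_size": "50"})
print(partial)
```

The cast trades some static safety for brevity, so whether this is acceptable depends on how much you trust the input.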

With the following test code:

data = dict(
    archive_name="my archive",
    archive_size="50",
    transfer_size="100",
    transfer_time="2.3",
    error=None,
)

# Convert the untyped input; this is the step that produces `result`.
result = to_backup_data(data)

for key, value in result.items():
    print(f"Key: {key.ljust(15)}, type: {str(type(value)).ljust(15)}, value: {value!r}")

The result will be:

Key: archive_name   , type: <class 'str'>  , value: 'my archive'
Key: archive_size   , type: <class 'int'>  , value: 50
Key: transfer_size  , type: <class 'int'>  , value: 100
Key: transfer_time  , type: <class 'float'>, value: 2.3
Key: error          , type: <class 'str'>  , value: 'None'
ubert
Thymen
  • While I love this approach, it could lead to some unexpected behavior as-is. If, for example, in the source dict, `archive_name` was `None`, the resulting data would appear as `"None"`, as `str(None) -> "None"`. – Joe Sadoski Jul 08 '22 at 14:10
  • Hi Joe, that is true; that is why I provided an example with `error=None`, which indeed results in the string `'None'`. Unfortunately, I didn't give it more attention while typing the answer. If you would like to accept `None` as a possible value, the type should be `Optional[str]`, for which the above solution wouldn't work. – Thymen Jul 11 '22 at 15:26
  • I think your answer is a great starting point, and more behavior should probably be added as appropriate for whatever app it's in. I just wanted to warn future googlers! – Joe Sadoski Jul 12 '22 at 16:13
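For future readers, here is one way the `None` case from the comments could be handled. This is only a sketch, not part of the original answer: `BackupDataOpt`, `convert` and `to_backup_data_opt` are hypothetical names, and it only understands `Optional[X]` spelled via `typing` (not arbitrary unions or nested generics).

```python
from typing import (Any, Mapping, Optional, TypedDict, Union, cast,
                    get_args, get_origin, get_type_hints)


class BackupDataOpt(TypedDict, total=False):
    # Hypothetical variant of BackupData in which `error` may be None.
    archive_name: str
    archive_size: int
    error: Optional[str]


def convert(value: Any, hint: Any) -> Any:
    """Cast `value` to `hint`, passing None through only for Optional[...]."""
    if get_origin(hint) is Union:
        members = get_args(hint)
        if value is None and type(None) in members:
            return None
        non_none = [m for m in members if m is not type(None)]
        if len(non_none) != 1:
            raise TypeError(f"Unsupported union: {hint}")
        hint = non_none[0]
    if value is None:
        # Refuse to turn None into the string 'None' (or crash in int()).
        raise ValueError(f"None is not allowed for {hint}")
    return hint(value)


def to_backup_data_opt(data: Mapping[str, Any]) -> BackupDataOpt:
    hints = get_type_hints(BackupDataOpt)
    converted = {k: convert(data[k], hints[k]) for k in hints if k in data}
    return cast(BackupDataOpt, converted)


print(to_backup_data_opt({"archive_name": "my archive", "error": None}))
```

With this, `error=None` stays `None`, while `None` for a plain `str` key raises instead of silently becoming `'None'`.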