8

I am adding type annotations to a lot of code to make it clear to other devs what my functions and methods do. How would I type annotate a function that takes JSON data in as an argument, and returns JSON data?

(very simplified version)

def func(json_data):
    return json_data

what I want to do but with JSON instead of int:

def add_nums(a: int, b: int) -> int:
    return a+b
Milind Sharma
  • 7
    JSON itself isn't a type; it's a syntax for strings to encode data structures. Specifying a type that accepts only strings that can be parsed as JSON is beyond the capabilities of Python type hints. – chepner Jun 27 '21 at 18:28
  • 1
    Does this answer your question? [Define a jsonable type using mypy / PEP-526](https://stackoverflow.com/questions/51291722/define-a-jsonable-type-using-mypy-pep-526) – enzo Jun 27 '21 at 18:50
  • 2
    `TypedDict`. One motivation for it is representing JSON objects: https://www.python.org/dev/peps/pep-0589/#motivation – GabrielChu Aug 19 '21 at 07:35

4 Answers

6

JSON objects are usually like a bag of items: they can hold numbers, strings, floats, lists, and nested JSON objects. If you want to describe the actual structure of the JSON body, the first and second options below can help; otherwise, you can use the third.

First Option: Use type aliases, as described in the type alias section of the Python 3 typing docs.

If you can clearly define your JSON body, you can build up aliases as in the following example.

from collections.abc import Sequence

ConnectionOptions = dict[str, str]
Address = tuple[str, int]
Server = tuple[Address, ConnectionOptions]

def broadcast_message(message: str, servers: Sequence[Server]) -> None:
    ...

# The static type checker will treat the previous type signature as
# being exactly equivalent to this one.
def broadcast_message(
        message: str,
        servers: Sequence[tuple[tuple[str, int], dict[str, str]]]) -> None:
    ...
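
When the JSON body has a fixed set of keys, `TypedDict` (PEP 589, mentioned in the comments on the question) is a related technique that can describe it more precisely than plain aliases. The following is only a sketch: the `Address` and `ServerConfig` shapes and the sample payload are made up for illustration, and `json.loads` returns `Any`, so nothing here validates the data at runtime.

```python
import json
from typing import TypedDict


# Hypothetical payload shape, invented for this example.
class Address(TypedDict):
    host: str
    port: int


class ServerConfig(TypedDict):
    name: str
    address: Address
    tags: list[str]


def func(json_data: ServerConfig) -> ServerConfig:
    # The annotation documents which keys callers can rely on.
    return json_data


raw = '{"name": "primary", "address": {"host": "localhost", "port": 8080}, "tags": ["a", "b"]}'
# json.loads returns Any, so this assignment type-checks without validation.
config: ServerConfig = json.loads(raw)
result = func(config)
```

A type checker will then flag accesses such as `result["adress"]`, which a bare `dict` annotation would let through.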

Second Option: You can also define your own custom types to work with, instead of creating lots of global aliases as above.

https://docs.python.org/3/library/typing.html#newtype


from typing import NewType

UserId = NewType('UserId', int)
some_id = UserId(524313)

def get_user_name(user_id: UserId) -> str:
    ...
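
Applied to the question, one possible reading of this option is a `NewType` that marks a `str` as holding serialized JSON. This is a sketch only: `JsonStr` and `to_json_str` are hypothetical names, and `NewType` adds no runtime check that the string is valid JSON.

```python
import json
from typing import Any, NewType

# Hypothetical name: a str that is known to hold serialized JSON.
JsonStr = NewType("JsonStr", str)


def to_json_str(data: dict[str, Any]) -> JsonStr:
    # The only place where a plain str is promoted to JsonStr.
    return JsonStr(json.dumps(data))


def func(json_data: JsonStr) -> JsonStr:
    return json_data


encoded = func(to_json_str({"user_id": 524313}))
```

A type checker would reject `func("not json")` because a plain `str` is not a `JsonStr`, even though both are the same class at runtime.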

Third Option: As the other answers suggest, using str is a simple approach. Treat your JSON body as a string and use the json module to convert it to Python objects and back.
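
A minimal sketch of the string-based approach, assuming the body is a JSON object; the `"seen"` key is made up for illustration:

```python
import json


def func(json_data: str) -> str:
    # Decode the JSON text into Python objects (here, a dict).
    payload = json.loads(json_data)
    # Hypothetical modification, just to show work done on the decoded data.
    payload["seen"] = True
    # Encode back to JSON text before returning.
    return json.dumps(payload, sort_keys=True)


result = func('{"id": 1}')
```

The signature says nothing about the structure inside the string, which is the trade-off of this option.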

Fourth Option: Use a library to define your classes, for example marshmallow: https://marshmallow.readthedocs.io/en/stable/

If you are working on Apache Spark then https://github.com/ketgo/marshmallow-pyspark is worth knowing about.

sam
4

You cannot do that. There are no "JSON objects" in Python; JSON is represented as a string. The most correct annotation here would be:

def func(json_data: str) -> str:
    return json_data

In my opinion (I also think it is best practice, but I'm not sure about that), you should only convert your data to JSON when you really need it in that format. Before that, you should always be working with dictionaries and lists.
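
One way to picture this, as a sketch with hypothetical function names: the logic is annotated with `dict`, and serialization happens only at the edge where JSON text is required.

```python
import json
from typing import Any


def rename_user(user: dict[str, Any], new_name: str) -> dict[str, Any]:
    """Return a copy of the user record with an updated name."""
    return {**user, "name": new_name}


def to_response_body(user: dict[str, Any]) -> str:
    # The only place where the data becomes JSON text.
    return json.dumps(user)


user = {"id": 7, "name": "old"}
body = to_response_body(rename_user(user, "new"))
```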

Kerrim
0

Here's a kind of brute-force solution: manually define some JSON types that cover generic JSON objects (as they would be represented in Python). Type checkers did not support recursive type aliases when this was written, but a few levels of nesting is usually enough for most use cases. Beyond that depth, we settle for best effort and allow Any.

You can create a module and define these types:

from collections.abc import (
    Mapping,
    Sequence,
)
from typing import (
    Any,
    Union,
)

PrimitiveJSON = Union[str, int, float, bool, None]

# Not every instance of Mapping or Sequence can be fed to json.dump() but those
# two generic types are the most specific *immutable* super-types of `list`,
# `tuple` and `dict`:

AnyJSON4 = Union[Mapping[str, Any], Sequence[Any], PrimitiveJSON]
AnyJSON3 = Union[Mapping[str, AnyJSON4], Sequence[AnyJSON4], PrimitiveJSON]
AnyJSON2 = Union[Mapping[str, AnyJSON3], Sequence[AnyJSON3], PrimitiveJSON]
AnyJSON1 = Union[Mapping[str, AnyJSON2], Sequence[AnyJSON2], PrimitiveJSON]
AnyJSON = Union[Mapping[str, AnyJSON1], Sequence[AnyJSON1], PrimitiveJSON]
JSON = Mapping[str, AnyJSON]
JSONs = Sequence[JSON]
CompositeJSON = Union[JSON, Sequence[AnyJSON]]

# For mutable JSON we can be more specific and use dict and list:

AnyMutableJSON4 = Union[dict[str, Any], list[Any], PrimitiveJSON]
AnyMutableJSON3 = Union[dict[str, AnyMutableJSON4], list[AnyMutableJSON4], PrimitiveJSON]
AnyMutableJSON2 = Union[dict[str, AnyMutableJSON3], list[AnyMutableJSON3], PrimitiveJSON]
AnyMutableJSON1 = Union[dict[str, AnyMutableJSON2], list[AnyMutableJSON2], PrimitiveJSON]
AnyMutableJSON = Union[dict[str, AnyMutableJSON1], list[AnyMutableJSON1], PrimitiveJSON]
MutableJSON = dict[str, AnyMutableJSON]
MutableJSONs = list[MutableJSON]
MutableCompositeJSON = Union[MutableJSON, list[AnyMutableJSON]]

Then you can just import the types you need from your module. Mostly you'll just use JSON, JSONs, MutableJSON, and MutableJSONs.
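
As a rough usage sketch, here is a function annotated with `JSONs`. To keep the snippet self-contained it repeats a shortened, two-level version of the definitions instead of importing them, and the `names` function and its sample records are invented for illustration.

```python
from collections.abc import Mapping, Sequence
from typing import Any, Union

# Shortened copy of the aliases above (two levels instead of five).
PrimitiveJSON = Union[str, int, float, bool, None]
AnyJSON1 = Union[Mapping[str, Any], Sequence[Any], PrimitiveJSON]
AnyJSON = Union[Mapping[str, AnyJSON1], Sequence[AnyJSON1], PrimitiveJSON]
JSON = Mapping[str, AnyJSON]
JSONs = Sequence[JSON]


def names(records: JSONs) -> list[str]:
    # Hypothetical helper: pull the "name" field out of each JSON object.
    return [str(record["name"]) for record in records]


found = names([{"name": "a", "tags": ["x"]}, {"name": "b", "tags": []}])
```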

Note: this was taken from https://github.com/DataBiosphere/azul/blob/9fa0f78800dbbc7bf4822063ff31811b3bb3f55b/src/azul/types.py which uses the Apache 2.0 license.

It was likely inspired by this thread https://github.com/python/typing/issues/182.

This Stack Overflow answer contains some other useful suggestions depending on what you know about your data.

leafmeal
0

Quoting from https://github.com/python/typing/issues/182#issuecomment-1320974824:

All major type checkers now support recursive type aliases by default, so this should largely work:

JSON: TypeAlias = dict[str, "JSON"] | list["JSON"] | str | int | float | bool | None

Note that because dict is invariant, you might run into some issues e.g. with dict[str, str]. For such use cases you can use cast, and if you don't need mutability, something like the following might work:

JSON_ro: TypeAlias = Mapping[str, "JSON_ro"] | Sequence["JSON_ro"] | str | int | float | bool | None
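
A small sketch of how these recursive aliases might be used. It spells them with `Union` instead of `|` so that the module also runs on Python 3.9, on the assumption that the two spellings mean the same thing to a type checker; `count_leaves` and the sample data are hypothetical.

```python
from collections.abc import Mapping, Sequence
from typing import Union, cast

# Same shapes as the quoted aliases, written with Union for older Pythons.
JSON = Union[dict[str, "JSON"], list["JSON"], str, int, float, bool, None]
JSON_ro = Union[
    Mapping[str, "JSON_ro"], Sequence["JSON_ro"], str, int, float, bool, None
]


def count_leaves(value: JSON_ro) -> int:
    """Count the non-container values in a JSON-like structure."""
    if isinstance(value, Mapping):
        return sum(count_leaves(v) for v in value.values())
    # str is itself a Sequence, so rule it out before iterating.
    if isinstance(value, Sequence) and not isinstance(value, str):
        return sum(count_leaves(v) for v in value)
    return 1


# dict is invariant, so dict[str, str] is not accepted where JSON is
# expected; cast() tells the type checker to treat it as JSON anyway.
headers: dict[str, str] = {"accept": "application/json"}
as_json = cast(JSON, headers)

leaf_count = count_leaves({"a": [1, 2, {"b": None}], "c": "x"})
```

The read-only alias accepts `headers` directly, without a cast, because `Mapping` is covariant in its value type.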
pradyunsg