
I am optimizing my code for performance, and when I profile it with cProfile, a good deal of the runtime turns out to be due to type annotations! Removing the annotations does improve performance. The cProfile output for the annotated and non-annotated versions is shown below.

Annotated code: [image: annotated code performance]

Non-annotated code: [image: performance of the code without annotation]

The annotated version clearly calls the __call__, __new__, inner, __getitem__, __hash__, and other methods in typing.py, and is twice as slow as the non-annotated one!

My test code is simple:

from reil.datatypes import reildata

x = reildata.Categorical(name='cat', categories=('A', 'B', 'C', 'D', 'E'))

for _ in range(10000):
    [x(v) for v in ('A', 'B', 'C', 'D', 'E')]

And these are the relevant parts of the main code (datatypes.reildata.py):

from __future__ import annotations

import dataclasses
import itertools
from dataclasses import field
from typing import Any, Callable, Dict, Generic, Iterable, Iterator, List, Optional, Sequence, Tuple, TypeVar, Union, cast

from typing_extensions import Literal


T = TypeVar('T')

CategoricalType = TypeVar('CategoricalType')

Normal = Union[Literal[0], Literal[1], float]
Normalized = Union[Normal, Tuple[Normal, ...], None]


@dataclasses.dataclass(frozen=True)
class ReilSingleton(Generic[T]):
    name: str
    value: Optional[Union[T, Tuple[T, ...]]] = None
    is_numerical: Optional[bool] = field(
        repr=False, compare=False, default=None)
    normalized: Normalized = field(
        default=None, repr=False, compare=False)


@dataclasses.dataclass(frozen=True)
class Categorical(Generic[CategoricalType]):
    name: str
    categories: Optional[Tuple[CategoricalType, ...]] = None
    normal_form: Optional[Dict[CategoricalType, Tuple[Normal, ...]]] = field(
        default=None, init=False, repr=False, compare=False)

    def __post_init__(self):
        if self.categories is None:
            return

        cat_count = len(self.categories)
        normal_form = {}
        for i, c in enumerate(self.categories):
            temp = [0] * cat_count
            temp[i] = 1
            normal_form[c] = tuple(temp)

        self.__dict__['normal_form'] = normal_form

    def __call__(self,
                 data: Union[CategoricalType, Tuple[CategoricalType, ...]]
                 ) -> ReilSingleton:
        normal_form = self.normal_form
        categories = cast(Tuple[CategoricalType, ...], self.categories)

        if normal_form is None:
            normalized = None
        elif data in categories:
            normalized = normal_form[cast(CategoricalType, data)]
        else:
            try:
                normalized = tuple(
                    itertools.chain(
                        *(normal_form[d]
                          for d in cast(Tuple[CategoricalType, ...], data))))
            except KeyError:
                raise ValueError(
                    f'{data} is not '
                    f'in the categories={categories}.')

        instance = ReilSingleton[CategoricalType](
            name=self.name, is_numerical=False, value=data,
            normalized=normalized)
        instance.__dict__['categories'] = categories
        instance.__dict__['dict_fields'] = ('name', 'value', 'categories')

        return instance

What should I do if I want to keep the annotations? Am I doing something wrong? I am running Python 3.7.8 (CPython) on Windows 10.

  • The problem is having `Tuple[CategoricalType, ...]` as an argument to `cast`. `cast` is a function, so its arguments have to be evaluated before the call. Each time `__call__` runs, it has to resolve `Tuple[CategoricalType, ...]` and pass its value to `cast`. You need to restructure the code so that it either does not need the `cast`, or hands it a type argument that does not need to be resolved. – Axe319 Apr 08 '21 at 12:51
  • I removed the `cast()` calls, but a significant portion of the runtime still goes to type annotations. Part of the issue is with `TypeVar` and `Generic`: every time I create an instance of my object (which has type `Generic[T]`), it calls `__new__` and `__call__` in typing.py. – Sadjad Anzabi Zadeh Apr 09 '21 at 06:12
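
To make the two comments above concrete, here is a minimal sketch of the pattern they describe. It is not the reil code: `Box`, `TupleOfT`, `slow`, and `fast` are hypothetical names used only for illustration. With `from __future__ import annotations`, the annotations in signatures and class bodies are not evaluated at call time, so the likely cost is in ordinary runtime expressions: subscripting `Tuple[...]` inside the method body, and instantiating through a subscripted generic such as `ReilSingleton[CategoricalType](...)`. The sketch builds the alias once at module level and calls the class directly. The size of the difference will depend on the Python version, so it only prints timings and claims no particular figure.

```python
import timeit
from dataclasses import dataclass
from typing import Generic, Optional, Tuple, TypeVar, cast

T = TypeVar("T")

# Built once at import time instead of on every call.
TupleOfT = Tuple[T, ...]


@dataclass(frozen=True)
class Box(Generic[T]):  # hypothetical stand-in for ReilSingleton
    name: str
    value: Optional[Tuple[T, ...]] = None


def slow(data):
    # Subscripts Tuple[...] and Box[...] on every call, and creates the
    # instance through the generic alias, all of which runs code in typing.py.
    items = cast(Tuple[T, ...], data)
    return Box[T](name="cat", value=items)


def fast(data):
    # Reuses the prebuilt alias and calls the plain class directly.
    items = cast(TupleOfT, data)
    return Box(name="cat", value=items)


if __name__ == "__main__":
    for fn in (slow, fast):
        seconds = timeit.timeit(lambda: fn(("A", "B")), number=100_000)
        print(f"{fn.__name__}: {seconds:.3f} s")
```

Both functions return equal `Box` instances, and a static type checker can still infer the element type from the return annotation of the enclosing method, so the annotations themselves can stay.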
