
I'm working on code bases with extensive type hints, checked by mypy. There are several instances where we have a mapping from an enum.Enum or other small finite set of statically known values (typing.Literal) to fixed values, and thus using a dictionary is convenient:

# GOOD
from enum import Enum, auto

class Foo(Enum):
   X = auto()
   Y = auto()

lookup: dict[Foo, str] = {Foo.X: "cool", Foo.Y: "whatever"}

print(lookup[Foo.X])

However, this dictionary doesn't have to be exhaustive (aka total): mypy is perfectly happy with it having missing keys, and indexing with a missing key will fail at runtime. This can easily happen in practice with a large enum (forgetting a member when defining the dict), or when adding a member to an existing enum (especially if the lookup is in a completely different file).

For instance, this passes mypy --strict just fine, but fails at runtime because we 'forgot' to update lookup itself:

# BAD
from enum import Enum, auto

class Foo(Enum):
   X = auto()
   Y = auto()
   Z = auto() # NEW

lookup: dict[Foo, str] = {Foo.X: "cool", Foo.Y: "whatever"}
 
print(lookup[Foo.Z]) # CHANGED

I'd love to be able to mark specific dictionaries/mappings as total/exhaustive, meaning, for instance, mypy will give an error about the definition of lookup in the BAD example above.

  1. Can this be annotated with Python's current type hints as a generic type, for any Enum or Literal[...] key type? (For instance, best-case syntax we'd hope for: lookup: ExhaustiveDict[Foo, str] = {...} or lookup: ExhaustiveDict[Literal[1, 2], str] = {1: "a", 2: "b"}.)
  2. If not, can it be done for a specific pair of key/value types? (For instance, reasonable syntax might be lookup: ExhaustiveDictFooTo[str] = {...} and/or lookup: ExhaustiveDictFooToStr = {...}, as long as the definition of those types is reasonable.)

I'm happy to change the exact syntax with which we build the dictionary, but the closer it is to {Foo.X: "cool", Foo.Y: "whatever"} the better.


Additional notes for background/to be clear about what we understand:

  • We're currently using a workaround of exhaustive if statements, but it's annoying to go from a compact dict to a whole function:
    from typing import NoReturn
    def exhaustive(val: NoReturn) -> NoReturn:
        raise NotImplementedError(val)
    
    def lookup(val: Foo) -> str:
        if val is Foo.X:
            return "cool"
        elif val is Foo.Y:
            return "whatever"
        else:
            exhaustive(val)
    
    If Foo is later changed to include Z, we get an error on the last line like Argument 1 to "exhaustive" has incompatible type "Literal[Foo.Z]"; expected "NoReturn", meaning we haven't handled that case earlier in the if (the message is not immediately obvious, but one ends up just pattern-matching what it means, and it is far better than nothing). (Presumably this could also use match/case, but we're still on Python 3.9, not 3.10.)
  • This applies equally well to using Literal[1, 2, 3] or Literal["foo", "bar", "baz"] as a key type, in addition to enum.Enum.
  • This has some overlap with typing.TypedDict and its total=True default, but AFAICT, that's limited to string keys written literally into the TypedDict definition (so we'd need to convert enums to strings and have additional functionality that verifies the TypedDict definition actually matches the enum).
  • I'm basically asking how to write a Python equivalent to TypeScript's Record type, something like Record[Foo, str] or Record[Literal["foo", "bar"], str] (equivalent to Record<"foo" | "bar", string> in TypeScript).
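For concreteness, here is how the if/else workaround from the first note looks with a Literal key type. This is only a sketch: the alias `Key`, the function `lookup_literal` and the value "meh" are made up for illustration, and it assumes mypy narrows `val` on each `==` comparison so that a forgotten alternative shows up as an error at the `exhaustive(val)` call.

```python
from typing import Literal, NoReturn

# Hypothetical key type, for illustration only.
Key = Literal["foo", "bar", "baz"]


def exhaustive(val: NoReturn) -> NoReturn:
    # Reached at runtime only if a case was missed.
    raise NotImplementedError(val)


def lookup_literal(val: Key) -> str:
    if val == "foo":
        return "cool"
    elif val == "bar":
        return "whatever"
    elif val == "baz":
        return "meh"  # illustrative value
    else:
        # The type checker should complain here if Key gains an
        # alternative that is not handled above.
        exhaustive(val)
```

It has the same drawback as the enum version: a compact dict has turned into a whole function.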
huon
  • If you need each element of an enum to map to something, just add it to the enum directly as an attribute. – Mad Physicist Apr 27 '22 at 01:55
  • why does `typing.Literal` not work for you? – juanpa.arrivillaga Apr 27 '22 at 03:01
  • @MadPhysicist Thanks. If I understand what you're suggesting correctly, that works in limited cases, but doesn't scale well (many different mappings may lead to a huge enum definition), work for external enums (can't change the definition), or lead to good architecture (putting downstream concerns into the enum definition itself). NB. I'm assuming you mean something like `X = (0, "cool", 1.23, ...)`, `Y = (1, "whatever", 4.56, ...)` (plus the appropriate `__new__`/`__init__` overrides to get the `.value` and the custom attributes set correctly). – huon Apr 27 '22 at 03:09
  • @juanpa.arrivillaga, I'm intrigued but not sure how `Literal` acts like an exhaustive `dict`. Could you be a little more specific about how I should change my code examples to use `Literal` and achieve my goal? – huon Apr 27 '22 at 03:11
  • `lookup: dict[Literal[Foo.X, Foo.Y], str] = {Foo.X: "cool", Foo.Y: "whatever"}` – juanpa.arrivillaga Apr 27 '22 at 03:14
  • @juanpa.arrivillaga, thanks, unfortunately `lookup: dict[Literal[Foo.X, Foo.Y, Foo.Z], str] = {Foo.X: "cool", Foo.Y: "whatever"}` passes mypy just fine, plus it's rather verbose for a large enum (I have one with 29 values), so doesn't seem like the full story... however, that's slightly better in some ways because `lookup`'s type will need to be updated if `Foo`'s members are changed and it's indexed by values of type `Foo`, and one might remember to update the dict itself at the same time. If you add it as an answer and discuss the downsides, I'll upvote :) – huon Apr 27 '22 at 03:21
  • @huon. Can you replace `Literal[Foo.X, Foo.Y, Foo.Z]` with `Literal[Foo._values_]` or so? – Mad Physicist Apr 27 '22 at 05:41
  • This might get you started: https://stackoverflow.com/q/54489776/2988730 – Mad Physicist Apr 27 '22 at 05:44
  • @MadPhysicist thanks again. I think writing `Literal` with all of the enum members and the enum `Foo` itself are pretty much equivalent and so that still doesn't address the main concern (making sure the dict keys are exhaustive). Also, that linked answer doesn't seem to address any of my concerns about `TypedDict` in my question? – huon Apr 27 '22 at 06:32
  • In a nutshell **no**, the aim you describe can't be done with the type hints Python currently provides. You'll have to maintain both the exhaustiveness and the totality in full by hand, and you'll run into problems further down the road when narrowing. – bad_coder May 02 '22 at 10:01

2 Answers

10

(...) mark specific dictionaries/mappings as total/exhaustive, meaning, (...) mypy will give an error about the definition of lookup in the BAD example

IOW, what the bold sentence says is: Create a type hint dependency of totality/exhaustiveness from lookup to Foo. Roughly in UML:

[Image: UML diagram of the dependency from lookup to Foo]

Dependency meaning lookup depends on changes to Foo, but with the conflated quadruple requirement that:

A. The dependency be implemented using only type hints (hence a static type checker dependency, not a run-time dependency).

B. The dependency automatically reflect changes in Foo to TypedDict without a need to rewrite type hints upon changing Foo. (This one is completely over the top.)

C. The dependency cause mypy to issue a warning when it's not satisfied.

D. The lookup dictionary keep a Totality relation to the Foo members.

Simple answer is: NO. Python doesn't have one native type hint to establish such a dependency, and it can't be achieved by combining static type hints without requiring rewrites that keep Foo and the TypedDict in sync. So the only choice is resorting to a run-time implementation, or rewriting the TypedDict definition to reflect changes to Foo. (I.e., it's not possible to satisfy requirement A together with B.)

(The hard part is demonstrating "why not", so the following points try to build up an incremental demonstration addressing the several possibilities the question mentions.)

1. Declaration

1.1. Literal and TypedDict have to be written in full at declaration; their syntax rules don't allow writing a dynamic declaration. So writing the dependency between lookup: dict and Foo: Enum into the type hints at declaration can't be done. (It's not possible to satisfy requirements B and A together.)

See the PEP quotes below: It's not possible to declare Literal[*Foo] by unpacking or other run-time means, and the same goes for TypedDict because it doesn't have a constructor (other than explicit class syntax and the alternative syntax) that would allow the declaration to be populated as a function of Enum Foo or dict lookup type hint to capture the dependency without writing it explicitly in full.

PEP 586 - Illegal parameters for Literal at type check time

The following parameters are intentionally disallowed by design:

Arbitrary expressions like Literal[3 + 4] or Literal["foo".replace("o", "b")].

(...)

Any other types: for example, Literal[Path], or Literal[some_object_instance] are illegal. This includes typevars: if T is a typevar, Literal[T] is not allowed. Typevars can vary over only types, never over values.

And specific to the TypedDict:

PEP 589 – TypedDict: Type Hints for Dictionaries with a Fixed Set of Keys

Abstract

This PEP proposes a type constructor typing.TypedDict to support the use case where a dictionary object has a specific set of string keys, each with a value of a specific type.

Class-based Syntax

String literal forward references are valid in the value types
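Since the Literal has to be spelled out by hand, the most a program can do is detect at run time that it has drifted from the Enum. A rough sketch, assuming a hypothetical alias `FooLiteral` and a hypothetical helper `literal_matches_enum` (neither comes from the question or from any library); it relies on `typing.get_args`, which returns the values listed inside a `Literal[...]`:

```python
from enum import Enum, auto
from typing import Any, Literal, Type, get_args


class Foo(Enum):
    X = auto()
    Y = auto()
    Z = auto()


# Hand-written and deliberately stale: Foo.Z is missing.
FooLiteral = Literal[Foo.X, Foo.Y]


def literal_matches_enum(literal_type: Any, enum_cls: Type[Enum]) -> bool:
    """Return True if the Literal lists exactly the members of enum_cls."""
    return set(get_args(literal_type)) == set(enum_cls)


print(literal_matches_enum(FooLiteral, Foo))  # False
```

This is a run-time check, so it does not satisfy requirement A; it only reports that the two declarations have drifted apart.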

1.2. Using forward references wouldn't change the fact that references to the Enum members can only be written into the values not the keys (using class syntax) (requirement A is not met).

class Movie1(TypedDict):
    cool: "Literal[Foo.X]"
    whatever: "Literal[Foo.Y]"


class Movie2(TypedDict):
    cool: Literal[Foo.X]
    whatever: Literal[Foo.Y]

1.3. The keys in the TypedDict have to be strings but the strings can't have dots (it conflicts with dotted syntax) and can't be written as string literals. So the following three examples won't work (requirement A is not met):

class Movie3(TypedDict):  # Illegal  syntax
    "Foo.X": str
    "Foo.Y": str

class Movie4(TypedDict):  # Illegal  syntax
    Foo.X: str
    Foo.Y: str

# using a dotted syntax that has no corresponding variable also doesn't work
class Movie5(TypedDict):  # Illegal syntax
    a.x: str
    b.y: str

1.4. The previous point also means you could use an alias for the Enum members in order to write them into the TypedDict. However, this would again defeat the question's main purpose of not having to write out and maintain a second group of declarations that need to be updated to reflect changes to the Enum. (Requirement B is again not met.) The following code would work:

some_alias1 = Foo.X
some_alias2 = Foo.Y

class Movie6(TypedDict):
    some_alias1 : str
    some_alias2 : str

lookup_forward_ref3: Movie6 = {'some_alias1': "cool", 'some_alias2': "whatever"}

1.5 TypedDict's Alternative Syntax

Using the Alternative Syntax of TypedDict (as opposed to class syntax) makes it possible to circumvent the dotted syntax problem mentioned earlier (in 1.3.); the following passes with mypy 0.931:

class Foo(Enum):
   X = auto()
   Y = auto()
   Z = auto() # NEW

Movie = TypedDict('Movie',
                  {
                     'Foo.X': str,
                     'Foo.Y': str,
                     'Foo.Z': Literal[Foo.X]},   # just as a Literal example
                  total=False)

lookup2: Movie = {'Foo.X': "cool", 'Foo.Y': "whatever", 'Foo.Z': Foo.X}

This is one step closer to the possible alternatives you were asking for:

I'm happy to change the exact syntax with which we build the dictionary, but the closer it is to {Foo.X: "cool", Foo.Y: "whatever"} the better.

However, you'll still have to maintain the TypedDict declaration in sync with changes to the Enum Foo (so it doesn't satisfy requirements B but requirements A, C and D are pretty close). If for example you tried populating the key-values with something more dynamic:

part_declaration = {
                     'Foo.X': str,
                     'Foo.Y': str,
                     'Foo.Z': Literal[Foo.X]}

Movie = TypedDict('Movie',
                  part_declaration,
                  total=False)

Mypy would remind you that:

your_module.py:27: error: TypedDict() expects a dictionary literal as the second argument
your_module.py:31: error: Extra keys ("Foo.X", "Foo.Y", "Foo.Z") for TypedDict "TypedDict"
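A run-time check can at least flag the moment the hand-maintained declaration falls behind the Enum. The sketch below is an illustration only: `missing_keys` is a hypothetical helper, and the 'Foo.<name>' key convention is simply the one used in the example above. It reads the declared keys with `typing.get_type_hints`:

```python
from enum import Enum, auto
from typing import Set, Type, TypedDict, get_type_hints


class Foo(Enum):
    X = auto()
    Y = auto()
    Z = auto()  # NEW


# Stale declaration: the 'Foo.Z' key was never added.
Movie = TypedDict('Movie', {'Foo.X': str, 'Foo.Y': str}, total=False)


def missing_keys(typed_dict: type, enum_cls: Type[Enum]) -> Set[str]:
    """Keys of the form 'Foo.X' that the TypedDict does not declare."""
    expected = {f"{enum_cls.__name__}.{member.name}" for member in enum_cls}
    return expected - set(get_type_hints(typed_dict))


print(missing_keys(Movie, Foo))  # {'Foo.Z'}
```

Like any run-time check, this sits outside the type hints, so it does nothing for requirement A.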

1.6 Use of Final Values and Literal Types

It should be emphasized that using Literals as keys to the TypedDict is only legal for string literals, not Enum literals (notice the bolds in the PEP quote). So the TypedDict has to be declared in full; looking at Enum Literals for a solution won't change that fact. (Requirement B again not met.)

PEP 589 – TypedDict: Type Hints for Dictionaries with a Fixed Set of Keys

Use of Final Values and Literal Types

Type checkers should allow final names (PEP 591) with string values to be used instead of string literals

Type checkers are only expected to support actual string literals, not final names or literal types,

Mypy also considers Enum Literals as final (see Extra Enum checks), but that doesn't supersede the above-mentioned string literal limitation.

2. Relation between Literal[YourEnum.member] and YourEnum

In most cases there's no difference between typing a variable as the_var: Foo or the_var: Literal[Foo.X, Foo.Y, Foo.Z] if the Literal has all the Enum members, because it would accept the exact same values. The question mentions using Literals over just Foo (the Literal types of the Enum members are subtypes of the Enum, so nominal subtyping rules apply). But for the purpose of the question, using Literals won't solve the problem of creating a type hint dependency between lookup and Foo that reflects changes to the latter without requiring rewrites (again requirement B not satisfied). The following two declarations are equivalent:

class Foo(Enum):
   X = auto()
   Y = auto()
   Z = auto()

var1: Foo
var2: Literal[Foo.X, Foo.Y, Foo.Z]

var1 = Foo.X
var1 = Foo.Y
var1 = Foo.Z

var2 = Foo.X
var2 = Foo.Y
var2 = Foo.Z

Let's now look at the 2 properties mentioned in the question:

3. Totality

A property of the TypedDict. As said before, the TypedDict definition has to be written in full at declaration - there's no way for the TypedDict keys to reflect changes to Enum Foo without writing those changes explicitly in the declaration. (Requirement B again not met.)

TypedDict is the definition of a dependency on the type of its values and the string values of its keys. What totality aims to capture is a dependency between instances of TypedDict and the type itself. So trying to express a relationship of totality dependence to another type can only be done by explicitly coding that dependency. (Without satisfying requirement B it's possible to satisfy requirements A and C, but you'll have to manually keep those dependencies up to date.)

4. Exhaustiveness

This property of Enums (see PEP 586 - Interactions with enums and exhaustiveness checks and mypy - Exhaustiveness checking) is mentioned in the question but it's orthogonal to requirements A and B.

Exhaustiveness is logic (the if/else branch) related to data (the Enum). It allows a run-time implementation that the static type checker verifies; it is not a type hint! (So it's not even in the league of requirement A, because it's not a type hint; it again doesn't satisfy requirement B; but it can satisfy requirement C as you've implemented it; it circumvents requirement D by implementing an explicit run-time mapping to the string constants instead of using a type hint TypedDict to maintain the dependency between lookup strings and Enum members.)

5. Conclusion

As you'll notice, requirement B is never satisfied using static type hint checks (you have to write the type hints and maintain them). Most developers would go straight for a unittest or run-time check (or just let the KeyError be thrown because it's easier to...):

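from enum import Enum, auto
from typing import Any

class Foo(Enum):
    X = auto()
    Y = auto()
    Z = auto() # NEW

    @classmethod
    def totality(cls, lookup: dict[str, Any]) -> None:
        # Keys are expected in the 'Foo.X' string form used in 1.5.
        for member in cls:
            if '{}.{}'.format(cls.__qualname__, member.name) not in lookup:
                raise KeyError(member.name)  # lookup isn't total to Enum.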
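For the question's original lookup, which is keyed by the Enum members themselves rather than by strings, the same idea can be sketched as a standalone helper. `check_total` is a hypothetical name, not an existing API; the intent is to call it right after the dictionary is defined, or from a unit test:

```python
from enum import Enum, auto
from typing import Mapping, Type, TypeVar

E = TypeVar("E", bound=Enum)


class Foo(Enum):
    X = auto()
    Y = auto()
    Z = auto()  # NEW


def check_total(lookup: Mapping[E, object], enum_cls: Type[E]) -> None:
    """Raise KeyError naming every member of enum_cls that lookup lacks."""
    missing = [member.name for member in enum_cls if member not in lookup]
    if missing:
        raise KeyError(f"{enum_cls.__name__} members without an entry: {missing}")


lookup = {Foo.X: "cool", Foo.Y: "whatever"}  # Foo.Z forgotten

try:
    check_total(lookup, Foo)
except KeyError as err:
    print(err)  # reports that Z has no entry
```

The failure surfaces at import or test time rather than at the first lookup of Foo.Z, but it is still a run-time check and not a type hint.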

The main use of type hints is hinting to the developer what types are acceptable. Your use departs from that by trying to establish a mapping between two sets of permissible values and turning those values together with the mapping into a type. The point of such use isn't giving you a warning if you forget to maintain something in your code, but to remind you what types (in this case mapping between values) are acceptable.

6. Addressing the questions:

I'd love to be able to mark specific dictionaries/mappings as total/exhaustive

Can be done with TypedDict. Declare the type and maintain it current to Enum Foo.

meaning, for instance, mypy will give an error about the definition of lookup in the BAD example above.

Orthogonal to the previous statement! That has nothing to do with totality; the TypedDict keeps totality in relation to its type definition. Keep its totality in relation to Foo's definition up to date and problem solved.

  1. Can this be annotated with Python's current type hints as a generic type, for any Enum or Literal input? (For instance, lookup: ExhaustiveDict[Foo, str] = {...}.)

This question doesn't make sense. The type hint you give as example works for the Enum members as shown in (2.) and you don't specify what any Literal means...? I don't see how Generic would help here.

  2. If not, can it be done for a specific pair of keys/values? (For instance, lookup: ExhaustiveDictFooTo[str] = {...} and/or lookup: ExhaustiveDictFooToStr = {...}.)

Depends on the possible lookup dictionaries, you only give a 1:1 mapping between Enum members and string Literals so nothing could be simpler, it would look like this:

combo = tuple[tuple[Literal[Foo.X], Literal['cool']], tuple[Literal[Foo.Y], Literal['whatever']]]

Problem being it's not possible to express a 1:1 key-value relationship in a dictionary using type hints. So this is what turning values into types looks like in the extreme...

but it's annoying to go from a compact dict to a whole function

The straightforward solution is writing a TypedDict mapping to the Enum members' names as keys (as mentioned in 1.5) together with a lookup instance. The type hint itself could be written as

class Foo(Enum):
   X = auto()
   Y = auto()
   Z = auto() # NEW

Movie = TypedDict('Movie',
                  {
                     'Foo.X': str,
                     'Foo.Y': Literal['whatever'],  # just as a Literal example
                     'Foo.Z': Literal[Foo.X]},  # just as a Literal example
                  total=False)

lookup2: Movie = {'Foo.X': "cool", 'Foo.Y': 'whatever', 'Foo.Z': Foo.X}

If you want mypy to give you a warning you can also use the Exhaustiveness check (in 4.), but that warning is meant to remind you of oversights in writing your logic, not in the data, and not that type hints are out of date.

bad_coder
  • Thanks for taking the time to try to answer this. However, there are quite a few parts that I don't understand: 1. You don't seem to explain why my desire here is a "maintainability nightmare". Could you expand? 2. The question doesn't propose using `Literal` to list enum members (it's redundant with `Foo`, as you point out), but rather making sure a solution _also_ works with something like `Literal["foo", "bar", ..]` (note `str` members), not just enums. I'd be able to understand your answer more easily if it were cut down by keeping that focus in mind :) – huon May 03 '22 at 01:23
  • 3. You imply that one option would be "maintaining a Literal ... by hand", but this doesn't seem to give any sort of totality checking, so you'll need to expand on how that option could work? 4. Relatedly, using `TypeGuard` doesn't seem right: the 'narrowed' type is the same as input type (https://mypy-play.net/?mypy=latest&python=3.10&gist=f957e7b6d389ca19625d3b1a561fa4e5) (as you point out elsewhere), and there's no totality checking anyway. If using runtime checks, a plain `assert` or test is better: it fails, rather than take an unexpected code path. (Thanks again for taking the time.) – huon May 03 '22 at 01:26
  • @huon your initial post was quite a handful with several issues needing to be addressed so I posted *"as is"* since it was also likely you wouldn't reply at all. I'm not satisfied with how I handle some aspects in my answer so I was intending on revising it. I'll try to take some time tomorrow. But yes, each of the reservations you're raising now will require one or more paragraphs likely with a revised code snippet for each. – bad_coder May 03 '22 at 01:46
  • My apologies for being misleading, the 'notes' in my post are background to indicate things that I understand (to try to short-circuit potential suggestions/ideas that don't work, and provide more context), rather than things needing to be addressed/answered. :) – huon May 03 '22 at 02:49
  • @huon Sorry for not getting this earlier, writing a post like this just takes time and there's no way around it. The answer pretty much tackles every aspect you've mentioned. The hard part was untangling the conflated requirements, but in so far as it's possible I addressed them separately. I am expecting you to award the bounty. – bad_coder May 05 '22 at 02:02
  • Wow, thank you for writing so much. I feel that you haven't asked for enough clarification on the questions, or given concrete enough advice for how to handle this in future. The question you're saying doesn't make sense is a little misleading (sorry!): "any literal" is my desire to have a `dict[Literal[1, 2], str]` that is total (must have values for both `1` and `2` as keys), similar to how the overall question is asking for `dict[SomeEnum, str]` that must specify values for all enum members. – huon May 05 '22 at 05:01
  • @huon it's not my job to ask for clarification as much as it's your job to not change requirements invalidating any answers (that's actually a hard SO rule). The one thing that emerged from the question is a lot of misunderstandings on your part about basic Python type hinting rules. Now, I'm always happy to work with OP's towards solving problems, you don't have to thank me for that. But in keeping with site etiquette and basic justice the problem's been solved as formulated so I will request you award the bounty and upvote after which I'll be more than happy to expand on the answer. – bad_coder May 05 '22 at 05:06
  • As you might infer from my profile, I'm familiar with how SO works. I apologise that I had wording that wasn't clear to you, however the requirements haven't changed (I've now explicitly tweaked the question wording, but I'm not willing to fight your unnecessary attacks like 'over the top'/'a lot of misunderstanding'/... any more). `Record` in TypeScript proves that this is perfectly reasonable to do with static types. I suspect there's nothing about Python's 'basic hinting rules' that stops it other than a PEP and implementation (e.g. probably could be a MyPy plugin with reasonable syntax). – huon May 05 '22 at 05:39
2

The current type hint system doesn't support this well. To get something similar done, I have had to reach for meta-programming.

In your case, I would construct the lookup dictionary based on the enum. Doing it this way, you can move the maintenance entirely into one location in code and it forces the mapping to become exhaustive (since it's constructed from the original set of values).

from enum import Enum, auto

class Foo(Enum):
   X = auto()
   Y = auto()

lookup: dict[Foo, str] = {val:key for key,val in Foo._member_map_.items()}

print(lookup[Foo.X])

However, if you are dynamically constructing the mappings or editing them dynamically all bets are off.


Another option is to shove the lookups into the Enum by giving it a reverse_lookup implementation.

from enum import Enum

class ApplicationSpecificError(Exception): ...

class Foo(Enum):
  x = 'cool'
  y = 'whatever'

  @classmethod
  def reverse_lookup(cls, key):
    try:
      return cls._value2member_map_[key] 
    except KeyError:
      raise ApplicationSpecificError(f'Foo.{key} is not tracked by Foo') from None

print(Foo.reverse_lookup('nope')) # ApplicationSpecificError

I see that you think that match-case would give you exhaustive typing, but it doesn't. You can forget about a case and it'll be happy to ignore it entirely. Mypy and other tools might implement an option to test for match exhaustiveness/totality, but matching by itself doesn't give you either.

plunker
  • Thanks for taking the time to look at this. Both options here seem to be doing something very different to my question: the first `lookup` seems to be the same as the `name` attribute (`Foo.X.name`), and the second seems to be the same as `Foo("nope")` (plus some exception wrapping). That is, neither is mapping enum members to some arbitrary other value. For the match-case, it's just a slightly neater alternative to the `if`/`else`: `case _: exhaustive(val)` works just as well for exhaustiveness checking as `else: exhaustive(val)`. – huon May 05 '22 at 00:54