
I have a dict whose key and value types are fixed. I want to define the types in a TypedDict as follows:

class MyTable(TypedDict):
    caption: List[str]
    header: List[str]
    table: pd.DataFrame
    epilogue: List[str]

I have a function that returns a MyTable. I want to first define an empty (Typed)dict and then fill in the keys and values.

def returnsMyTable():
    result = {}
    result['caption'] = ['caption line 1','caption line 2']
    result['header'] = ['header line 1','header line 2']
    result['table'] = pd.DataFrame()
    result['epilogue'] = ['epilogue line 1','epilogue line 2']
    return result

Here MyPy complains that a type annotation for result is needed. I tried this:

result: MyTable = {}

but then MyPy complains that the keys are missing. Similarly, if I define the keys but set the values to None, it complains about incorrect types of the values.

Is it at all possible to initialize a TypedDict as an empty Dict first and fill in the keys and values later? The docs seem to suggest it is.

I guess I could first define the values as variables and assemble the MyTable later but I'm dealing with legacy code that I'm adding type hinting to. So I'd like to minimize the work.

Jonathan Hall
MaartenB
  • Is there any reason why ``returnsMyTable`` does not just include one ``dict`` literal, assigning keys and values immediately? – MisterMiyagi Jan 13 '21 at 14:54
  • Since the derivation of the values is rather complex, creating the `MyTable` in 1 line is infeasible. Like I wrote, I could first define the values as separate values but I'm dealing with legacy code that's been set up in this way. This works `result = MyTable(header=[''],caption=[''],table=pd.DataFrame(),epilogue=[''])` but seems rather unpythonic – MaartenB Jan 15 '21 at 07:40
  • I don't think we can really help you unless you clarify what the issue with "legacy code" is. The only general solution is to `# type: ignore` the issue if you know the result is correct. – MisterMiyagi Jan 15 '21 at 08:04
  • With "legacy code" I simply mean a pre-existing codebase to which I'm adding type hinting. There is no issue with it, but other than adding the type-hints I want to make minimal changes. – MaartenB Jan 24 '21 at 13:03

1 Answer


What you might want here is to set totality, but I'd think twice about using it.

Quoting PEP 589:

By default, all keys must be present in a TypedDict. It is possible to override this by specifying totality. Here is how to do this using the class-based syntax:

class MyTable(TypedDict, total=False):
    caption: List[str]
    header: List[str]
    table: pd.DataFrame
    epilogue: List[str]

result: MyTable = {}
result2: MyTable = {"caption": ["One", "Two", "Three"]}

As I said, think twice about that. A total TypedDict gives you a very nice guarantee that all of the items will exist. That is, because MyPy won't allow result to exist without "caption", you can safely call cap = result["caption"].

If you set total=False, that guarantee goes away. On the assumption that you're using your MyTable much more commonly than you're making it, getting the additional safety assurances when you use it is probably a good trade.
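To illustrate the lost guarantee, here is a minimal sketch. The `PartialTable` type and the `first_caption_line` helper are hypothetical names, and the pandas field is left out to keep the example self-contained. With total=False, every consumer has to allow for a missing key:

```python
from typing import List, TypedDict


class PartialTable(TypedDict, total=False):
    # Hypothetical, trimmed-down variant of MyTable with optional keys.
    caption: List[str]
    header: List[str]


def first_caption_line(table: PartialTable) -> str:
    # "caption" may be absent, so table["caption"] could raise KeyError.
    # Guard with .get() and a default instead of indexing directly.
    caption = table.get("caption", [])
    return caption[0] if caption else ""
```

With a total TypedDict the guard would be unnecessary, since the type checker would already have rejected a dict without "caption".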

Personally, I'd reserve total=False for cases where the creation code sometimes genuinely does leave things out and any code that uses it has to handle that. If it's just a case of taking a few lines to initialise, I'd do it like this:

def returnsMyTable() -> MyTable:
    # Compute each value first (the real derivations may be complex).
    result_caption = ['caption line 1','caption line 2']
    result_header = ['header line 1','header line 2']
    result_table = pd.DataFrame()
    result_epilogue = ['epilogue line 1','epilogue line 2']
    # Assemble the complete TypedDict in one literal so MyPy sees every key.
    result: MyTable = {
        "caption": result_caption,
        "header": result_header,
        "table": result_table,
        "epilogue": result_epilogue
    }
    return result
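If the legacy fill-in-place style has to stay, one possible alternative (not part of the original answer, and an assumption about what is acceptable in your codebase) is to build a loosely typed dict and `cast` it on return. The sketch below uses a hypothetical `MyTableSketch` without the pandas field so that it runs on its own. Note that `cast` switches off checking for that dict, much like the `# type: ignore` suggested in the comments, so a forgotten key would go unnoticed:

```python
from typing import Any, Dict, List, TypedDict, cast


class MyTableSketch(TypedDict):
    # Hypothetical stand-in for MyTable; the pandas column is omitted.
    caption: List[str]
    header: List[str]
    epilogue: List[str]


def returns_my_table_sketch() -> MyTableSketch:
    # Fill a loosely typed dict step by step, as the legacy code does.
    result: Dict[str, Any] = {}
    result["caption"] = ["caption line 1", "caption line 2"]
    result["header"] = ["header line 1", "header line 2"]
    result["epilogue"] = ["epilogue line 1", "epilogue line 2"]
    # cast() does nothing at runtime; it only tells the type checker
    # to treat the dict as a MyTableSketch without verifying the keys.
    return cast(MyTableSketch, result)
```

Callers still get the total TypedDict in their signatures; only the construction inside the function is unchecked.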
Josiah
  • Thanks. I agree that removing the totality requirement is undesirable (defeats the purpose of type-hints). Your last suggestion is what I meant with "defining the values as variables and assemble the `MyTable` later". As I said, the [docs](https://mypy.readthedocs.io/en/stable/type_inference_and_annotations.html#explicit-types-for-collections) show that it's possible for regular `dict`s (`d: Dict[str, int] = {}`). But apparently not for `TypedDict`s. – MaartenB Jan 24 '21 at 11:52
  • That's correct. All that the regular dict notation is saying is that everything in `keys()` is a string and everything in `values()` is an int. MyPy isn't checking that your `d` will satisfy that once you put some stuff in it a few lines down. It's checking that it already does. There's nothing in keys or values which isn't a string or int (because there's nothing in keys or values!) and so it's legal. – Josiah Jan 24 '21 at 12:12
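The contrast described in this comment can be sketched as follows. `Pair` is a hypothetical TypedDict used only for illustration; the rejected assignment is left as a comment so the snippet still runs:

```python
from typing import Dict, List, TypedDict


class Pair(TypedDict):
    # Hypothetical total TypedDict: both keys are required.
    left: List[str]
    right: List[str]


# Accepted: an empty dict contains no key or value of the wrong type.
d: Dict[str, int] = {}
d["answer"] = 42

# Rejected by MyPy, because the required keys are missing at this point:
# p: Pair = {}

# Accepted: every required key is present with the declared value type.
p: Pair = {"left": ["a"], "right": ["b"]}
```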