8

I am currently working on adding type hints to a project and can't figure out how to get this right. I have a list of lists, with the nested list containing two elements of type int and float. The first element of the nested list is always an int and the second is always a float.

my_list = [[1000, 5.5], [1432, 2.2], [1234, 0.3]]

I would like to type annotate it so that unpacking the inner list in for loops or loop comprehensions keeps the type information. I could change the inner lists to tuples and would get what I'm looking for:

def some_function(list_arg: list[tuple[int, float]]): pass

However, I need the inner lists to be mutable. Is there a nice way to do this for lists? I know that abstract classes like Sequence and Collection do not support multiple types.

kman
  • 95
  • 2
  • 1
    No, i don't think there is any way to do this with lists. lists in the python type system are homogenous – juanpa.arrivillaga May 11 '22 at 10:34
  • 3
    while tuples aren't mutable, your enclosing list is, so you can replace the entire tuple – joel May 11 '22 at 10:44
  • I'm not flagging it as a duplicate, but does this answer your question? [Existence of mutable named tuple in Python?](https://stackoverflow.com/questions/29290359/existence-of-mutable-named-tuple-in-python). The accepted answer suggests using module `recordclass` from pypi. – Stef May 11 '22 at 10:47
  • @Stef The post suggests using a third party class which I would rather avoid doing. I was looking for a solution more focused on type hinting, with clever sub-classing or something similar. – kman May 12 '22 at 09:42
  • @joel That's what I've ended up going with, thank you for the suggestion. – kman May 12 '22 at 09:42
  • @kman I see you found a way to drop your requirement of mutability, as joel suggested. Sadly I can't do that... I've started a bounty here. – Pedro A Sep 19 '22 at 03:31
  • 1
    It's not possible. See [Specify length of Sequence or List with Python typing module](https://stackoverflow.com/questions/44833822/specify-length-of-sequence-or-list-with-python-typing-module). – aaron Sep 19 '22 at 04:26

3 Answers3

4

I think the question highlights a fundamental difference between statically typed Python and dynamically typed Python. For someone who is used to dynamically typed Python (or Perl or JavaScript or any number of other scripting languages), it's perfectly normal to have diverse data types in a list. It's convenient, flexible, and doesn't require you to define custom data types. However, when you introduce static typing, you step into a tighter box that requires more rigorous design.

As several others have already pointed out, type annotations for lists require all elements of the list to be the same type, and don't allow you to specify a length. Rather than viewing this as a shortcoming of the type system, you should consider that the flaw is in your own design. What you are really looking for is a class with two data members. The first data member is named 0, and has type int, and the second is named 1, and has type float. As your friend, I would recommend that you define a proper class, with meaningful names for these data members. As I'm not sure what your data type represents, I'll make up names, for illustration.

class Sample:
    def __init__(self, atomCount: int, atomicMass: float):
        self.atomCount = atomCount
        self.atomicMass = atomicMass

This not only solves the typing problem, but also gives a major boost to readability. Your code would now look more like this:

my_list = [Sample(1000, 5.5), Sample(1432, 2.2), Sample(1234, 0.3)]

def some_function(list_arg: list[Sample]): pass

I do think it's worth highlighting Stef's comment, which points to this question. The answers given highlight two useful features related to this.

First, as of Python 3.7, you can mark a class as a data class, which will automatically generate methods like __init__(). The Sample class would look like this, using the @dataclass decorator:

from dataclasses import dataclass

@dataclass
class Sample:
    atomCount: int
    atomicMass: float

Another answer to that question mentions a PyPi package called recordclass, which it says is basically a mutable namedtuple. The typed version is called RecordClass

from recordclass import RecordClass

class Sample(RecordClass):
    atomCount: int
    atomicMass: float
TallChuck
  • 1,725
  • 11
  • 28
2

The mutability of the datastructure is not compatible with a static length and an invariant order of the types contained. It is not possible to statically analyze the sequence unpacking if you can sort, append, prepend or insert records into it.

Imagine the following snippet

def some_function(list_arg: list[int, float]): # Invalid Syntax :)
    myint, myfloat = list_arg # ok?

    list_arg.sort()
    myint, myfloat = list_arg # ??????

    if random.random() < .5:
        list_arg.insert(1, 'yet another type!')
    myint, myfloat = list_arg  # 50% chance of an actual runtime error
                               # fat chance for any static analysis!

If your sequence mutability is an imperative, write a union or other richer type hints for the potential types of the contained objects

def some_function(list_arg: list[list[A|B]]): pass

or use a supertype. Because int is duck type compatible with float https://mypy.readthedocs.io/en/latest/duck_type_compatibility.html#duck-type-compatibility :

def some_function(list_arg: list[list[float]]): pass

If your datastructure will not be actually mutated, then choosing a list instead of a tuple was the first mistake.

N1ngu
  • 2,862
  • 17
  • 35
0

One small addition to the TallChuck's response:

from recordclass import dataobject

class Sample(dataobject):
    atomCount: int
    atomicMass: float

The class Sample doesn't support namedtuple-like API. But it support some kind of dataclasses-like API.

intellimath
  • 2,396
  • 1
  • 12
  • 13