Overriding the length method on namedtuple objects in Python is a fair bit more tedious than you might expect. The naive approach,

from collections import namedtuple

class Rule(namedtuple('Rule', ['lhs', 'rhs'])):
    def __len__(self):
        return len(self.rhs)

r = Rule('S', ['NP', 'Infl', 'VP'])
new_r = r._replace(lhs='CP')  # raises TypeError: Expected 2 arguments, got 3

doesn't work. If you inspect the generated source code of the class (available as the _source attribute in Python 3.6 and earlier; the attribute was removed in 3.7), you can see that _make, which _replace calls and which raises the error, is implemented like this:

@classmethod
def _make(cls, iterable, new=tuple.__new__, len=len):
    'Make a new Rule object from a sequence or iterable'
    result = new(cls, iterable)
    if len(result) != 2:
        raise TypeError('Expected 2 arguments, got %d' % len(result))
    return result

Interestingly, it checks that the length of the newly built tuple is 2. Because len(result) dispatches to the overridden __len__, which here reports len(self.rhs) (3 in this example) rather than the number of fields, _make raises a TypeError even though the tuple has exactly two fields.
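The failure can be reproduced in isolation. The sketch below reuses the Rule class from the naive approach and captures the exception message instead of letting it propagate; error_message is a name introduced only for this illustration.

```python
from collections import namedtuple

class Rule(namedtuple('Rule', ['lhs', 'rhs'])):
    def __len__(self):
        # Reports the size of the right-hand side, not the number of fields.
        return len(self.rhs)

r = Rule('S', ['NP', 'Infl', 'VP'])

# Constructing the object works; only the _make path checks the length.
try:
    r._replace(lhs='CP')
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

Note that iterating or unpacking r is unaffected, since tuple iteration does not consult __len__.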

It's possible to prevent this behavior by passing a "len" function that always returns 2 to _make:

from collections import namedtuple

class Rule(namedtuple('Rule', ['lhs', 'rhs'])):
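    # Depends on the len keyword of the generated _make shown above
    # (Python 3.6 and earlier); _make in 3.7+ takes only the iterable.
    @classmethod
    def _make(cls, *args, len=lambda _: 2, **kwargs):
        return super()._make(*args, len=len, **kwargs)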

    def __len__(self):
        return len(self.rhs)

r = Rule('S', ['NP', 'Infl', 'VP'])
new_r = r._replace(lhs='CP') # fine

My question is, why is this length check necessary in the first place, and is it safe to override _make so that it skips the check?

iafisher
    "Overriding the length method on namedtuple objects in Python is a fair bit more tedious than you might expect" - why would you do that? It seems like a terrible idea that would cause all kinds of bugs, even besides this bug. – user2357112 May 02 '17 at 20:27

1 Answer

_make checks the return value's length because namedtuples are fixed-length, and _make has to enforce that. If it didn't, you could do

Point = namedtuple('Point', ['x', 'y'])
p1 = Point._make([1, 2, 3])
p2 = Point._make([1])

and get a Point with an extra entry dangling off the end (p1) and a Point with no y (p2).

_make can't check the argument's length, because the argument could be an arbitrary iterable that doesn't support len, so the return value's length is the most convenient thing to check.
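As a rough illustration of that point, the sketch below feeds _make iterables that have no len of their own. The helper try_make and the variable names are hypothetical, introduced only for this example.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])

# A generator cannot be measured up front, yet _make accepts it.
coords = (n * 10 for n in range(1, 3))
has_len = hasattr(coords, '__len__')
p = Point._make(coords)

def try_make(values):
    """Return a Point, or None if _make rejects the input."""
    try:
        return Point._make(values)
    except TypeError:
        return None

# Wrong-sized inputs are only detected after the tuple has been built.
too_long = try_make(iter([1, 2, 3]))
too_short = try_make(iter([1]))
```

Here p is a valid Point, while too_long and too_short are both None because the length check on the result rejects them.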

Do not override _make to bypass this check. Your object is far enough from the concept of a namedtuple - heck, far enough from the concept of a tuple - that you shouldn't be using namedtuple or any tuple subclass at all. Just write a regular class.
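One possible shape for such a regular class is sketched below. It is an assumption about what the asker needs, not a drop-in replacement: the replace method is a hypothetical stand-in for _replace, and tuple behaviors such as indexing and unpacking are deliberately left out.

```python
class Rule:
    """A grammar rule whose length is the length of its right-hand side."""

    __slots__ = ('lhs', 'rhs')

    def __init__(self, lhs, rhs):
        self.lhs = lhs
        self.rhs = rhs

    def __len__(self):
        return len(self.rhs)

    def __eq__(self, other):
        if not isinstance(other, Rule):
            return NotImplemented
        return (self.lhs, self.rhs) == (other.lhs, other.rhs)

    def __repr__(self):
        return f'Rule(lhs={self.lhs!r}, rhs={self.rhs!r})'

    def replace(self, **changes):
        # Hypothetical analogue of namedtuple's _replace: returns a copy
        # with the given fields changed and leaves the original untouched.
        fields = {'lhs': self.lhs, 'rhs': self.rhs}
        unknown = set(changes) - set(fields)
        if unknown:
            raise TypeError(f'Unexpected field names: {sorted(unknown)}')
        fields.update(changes)
        return type(self)(**fields)

r = Rule('S', ['NP', 'Infl', 'VP'])
new_r = r.replace(lhs='CP')
```

Because nothing here inherits from tuple, __len__ is free to mean whatever the class needs without tripping a field-count check.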

user2357112