Is there a built-in way to use CPython built-ins to make an arbitrary callable behave as an unbound class method?

Question

In Python 2, it was possible to convert arbitrary callables to methods of a class. Importantly, if the callable was a CPython built-in implemented in C, you could use this to make methods of user-defined classes that were C layer themselves, invoking no byte code when called.

This is occasionally useful if you're relying on the GIL to provide "lock-free" synchronization; since the GIL can only be swapped out between op codes, if all the steps in a particular part of your code can be pushed to C, you can make it behave atomically.
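As a rough illustration of why even a trivial Python-level method is not a single step, the sketch below disassembles a plain `__len__`. The `Foo` class and its `length` attribute are stand-ins for the class described in this question, and the exact opcode names vary between CPython versions.

```python
import dis


class Foo:
    def __init__(self, length):
        self.length = length

    def __len__(self):
        return self.length


# Collect the opcode names CPython generated for the method body.
opnames = [instr.opname for instr in dis.get_instructions(Foo.__len__)]

# Even "return self.length" compiles to several separate instructions
# (load self, load the attribute, return it).
print(opnames)
```

Each of those instructions is a separate op code, which is why pushing the whole call to C is attractive for this use case.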

In Python 2, you could do something like this:

import types
from operator import attrgetter
class Foo(object):
    # ... This class maintains a member named length storing the length...

    def __len__(self):
        return self.length  # We don't want this, because we're trying to push all work to C

# Instead, we explicitly make an unbound method that uses attrgetter to achieve
# the same result as the __len__ above, but without any byte code invoked to satisfy it
Foo.__len__ = types.MethodType(attrgetter('length'), None, Foo)

In Python 3, there is no longer an unbound method type, and types.MethodType only takes two arguments and creates only bound methods (which is not useful for Python special methods like __len__, __hash__, etc., since special methods are often looked up directly on the type, not the instance).
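A minimal sketch of that limitation, assuming a simplified `Foo` whose `__init__` just stores `length` (the constructor is illustrative, not part of the original class):

```python
import types
from operator import attrgetter


class Foo:
    def __init__(self, length):
        self.length = length


foo = Foo(3)

# In Python 3, MethodType takes (callable, instance) and can only create a
# bound method, which then has to be stored on the instance.
foo.__len__ = types.MethodType(attrgetter('length'), foo)

# Calling the bound method explicitly works...
explicit_result = foo.__len__()

# ...but len() looks __len__ up on type(foo), not on foo itself, so the
# per-instance bound method is never consulted.
try:
    len(foo)
except TypeError:
    len_failed = True
else:
    len_failed = False
```

So a bound method costs an extra object per instance and still doesn't satisfy `len()`.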

Is there some way of accomplishing this in Py3 that I'm missing?

Things I've looked at:

functools.partialmethod (appears to not have a C implementation, so it fails the requirements, and between the Python implementation and being much more general purpose than I need, it's slow, taking about 5 us in my tests, vs. ~200-300 ns for direct Python definitions or attrgetter in Py2, a roughly 20x increase in overhead)
Trying to make attrgetter or the like follow the non-data descriptor protocol (not possible AFAICT, can't monkey-patch in a __get__ or the like)
Trying to find a way to subclass attrgetter to give it a __get__, but of course, the __get__ needs to be delegated to C layer somehow, and now we're back where we started
(Specific to the attrgetter use case) Using __slots__ to make the member a descriptor in the first place, then trying to somehow convert the resulting data descriptor into a callable that defers retrieving the real value until it is called, instead of binding and fetching the value immediately
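To make the descriptor problem from the list above concrete, here is a small sketch of what happens when an `attrgetter` is assigned directly as a class attribute. The simplified `Foo` and its constructor are assumptions for illustration only.

```python
from operator import attrgetter

getter = attrgetter('length')

# attrgetter objects are callable, but they are not descriptors:
# they have no __get__, so attribute lookup never binds them.
has_get = hasattr(getter, '__get__')


class Foo:
    def __init__(self, length):
        self.length = length

    # Stored as a plain class attribute; nothing passes self to it.
    __len__ = getter


foo = Foo(3)

# Passing the instance by hand works...
manual_result = Foo.__len__(foo)

# ...but len(foo) ends up calling the attrgetter with no arguments.
try:
    len(foo)
except TypeError:
    len_failed = True
else:
    len_failed = False
```

A plain Python function works in the same position only because functions implement `__get__`; that binding step is the missing piece for C-level callables.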

I can't swear I didn't miss something for any of those options though. Anyone have any solutions? Total hackery is allowed; I recognize I'm doing pathological things here. Ideally it would be flexible (to let you make something that behaves like an unbound method out of a class, a Python built-in function like hex, len, etc., or any other callable object not defined at the Python layer). Importantly, it needs to attach to the class, not each instance (both to reduce per-instance overhead, and to work correctly for dunder special methods, which bypass instance lookup in most cases).

For reference, [dynamically adding callable to class as instance “method”](http://stackoverflow.com/q/14526652/364696) covers at least some options as of three years ago, but the solution doesn't push work to C layer for Python 3. — ShadowRanger, Oct 19 '16 at 01:33
Why don't you just write a C extension that does it? You might be able to find some way to do this, but in general whether something is or isn't done at the C level isn't part of Python's public API, so if you want to be sure something is done in C you need to do it in C yourself. — BrenBarn, Oct 19 '16 at 02:09
@BrenBarn: Always possible, but a much larger step than just repurposing built-ins. The incremental increase in work from the Python 2 approach would ideally not be going from an import and a one-liner to a full C extension module that must be built and installed on every system to enable this use case. I'd like to find a midway point that still preserves _most_ of the simplicity of writing in Python. — ShadowRanger, Oct 19 '16 at 02:19
This is a good question, because the UnboundMethod is gone for good. I never found it useful so having the function type with `__get__` is actually much better for the language. See if `types.DynamicClassAttribute` is useful here. It should be used to create properties, so maybe it can help — JBernardo, Oct 19 '16 at 13:54
The thing is that a lot of the simplicity of writing in Python is traded off against not having any guarantees about things like "is this implemented in C or not". Your code in Python 2 is essentially relying on implementation details that you're not supposed to rely on. — BrenBarn, Oct 20 '16 at 05:20

score 1 · Accepted Answer · answered Nov 14 '19 at 20:53

Found a (probably CPython only) solution to this recently. It's a little ugly, being a ctypes hack to directly invoke CPython APIs, but it works, and gets the desired performance:

import ctypes
from operator import attrgetter

make_instance_method = ctypes.pythonapi.PyInstanceMethod_New
make_instance_method.argtypes = (ctypes.py_object,)
make_instance_method.restype = ctypes.py_object

class Foo:
    # ... This class maintains a member named length storing the length...

    # Defines a __len__ method that, at the C level, fetches self.length
    __len__ = make_instance_method(attrgetter('length'))

It's an improvement over the Python 2 version in one way, since, as it doesn't need the class to be defined to make an unbound method for it, you can define it in the class body by simple assignment (where the Python 2 version must explicitly reference Foo twice in Foo.__len__ = types.MethodType(attrgetter('length'), None, Foo), and only after class Foo has finished being defined).
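A quick sanity check of the approach, as a CPython-only sketch. The `__init__` on `Foo` and the `HexInt` class are hypothetical additions for illustration; `HexInt` just shows the same wrapper applied to another built-in, `hex`.

```python
import ctypes
from operator import attrgetter

# CPython-only: PyInstanceMethod_New wraps a callable in an object whose
# __get__ binds it to the instance, the way a plain function would be bound.
make_instance_method = ctypes.pythonapi.PyInstanceMethod_New
make_instance_method.argtypes = (ctypes.py_object,)
make_instance_method.restype = ctypes.py_object


class Foo:
    def __init__(self, length):
        self.length = length

    # Plain assignment in the class body; no reference to Foo is needed.
    __len__ = make_instance_method(attrgetter('length'))


class HexInt(int):  # hypothetical example class
    # The same trick with a built-in function instead of attrgetter.
    as_hex = make_instance_method(hex)


length_via_len = len(Foo(3))
hex_text = HexInt(255).as_hex()
```

Because the wrapper lives on the class, it adds no per-instance storage and is found by the type-level lookup that `len()` performs.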

On the other hand, it doesn't actually provide a performance benefit on CPython 3.7 AFAICT, at least not for the simple case here where it's replacing def __len__(self): return self.length; in fact, ipython %%timeit microbenchmarks show len(instance) on an instance of Foo is ~10% slower when __len__ is defined via __len__ = make_instance_method(attrgetter('length')). This is likely an artifact of attrgetter itself having slightly higher overhead: CPython hasn't moved it to the "FastCall" protocol (called "Vectorcall" in 3.8, when it was made semi-public for provisional third-party use), while user-defined functions already benefit from it in 3.7. On top of that, attrgetter has to dynamically choose, on every call, whether to perform dotted or undotted attribute lookup and single or multiple attribute lookup (which Vectorcall might be able to avoid by choosing a __call__ implementation appropriate to the gets being performed at construction time), and that adds more overhead that the plain method avoids. It should win for more complicated cases (say, if the attribute to be retrieved is a nested attribute like self.contained.length), since attrgetter's overhead is largely fixed, while nested attribute lookup in Python means more byte code, but right now, it's not useful very often.
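For anyone who wants to repeat the comparison on their own interpreter, here is a rough timing sketch for the nested-attribute case. `Contained`, `PyLen`, `CLen`, and `bench` are hypothetical names made up for this example, and the numbers will differ by CPython version and machine, so no expected timings are shown.

```python
import ctypes
import timeit
from operator import attrgetter

# CPython-only ctypes hack, same as in the answer's snippet.
make_instance_method = ctypes.pythonapi.PyInstanceMethod_New
make_instance_method.argtypes = (ctypes.py_object,)
make_instance_method.restype = ctypes.py_object


class Contained:  # hypothetical helper holding the real length
    def __init__(self, length):
        self.length = length


class PyLen:
    """__len__ written as an ordinary Python method."""

    def __init__(self, length):
        self.contained = Contained(length)

    def __len__(self):
        return self.contained.length


class CLen:
    """__len__ built from attrgetter with a dotted (nested) lookup."""

    def __init__(self, length):
        self.contained = Contained(length)

    __len__ = make_instance_method(attrgetter('contained.length'))


def bench(obj, number=200_000):
    # Best of three runs, in seconds for `number` calls of len(obj).
    return min(timeit.repeat(lambda: len(obj), number=number, repeat=3))


py_obj, c_obj = PyLen(5), CLen(5)
py_time = bench(py_obj)
c_time = bench(c_obj)
print(f"def __len__: {py_time:.4f}s  attrgetter: {c_time:.4f}s")
```

The lambda adds the same call overhead to both measurements, so only the relative difference is meaningful.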

If they ever get around to optimizing operator.attrgetter for Vectorcall, I'll rebenchmark and update this answer.

It looks like [they're migrating `attrgetter` to Vectorcall in 3.11](https://github.com/python/cpython/issues/89116) (not yet released at this time), and the performance improvements claimed may make up the difference seen in my 3.7 tests. Of course, they've also made improvements to attribute lookups and the like over time, so I won't count performance chickens before 3.11 releases. — ShadowRanger, Aug 23 '22 at 19:02
