61

In Python, both list.sort method and sorted built-in function accepts an optional parameter named key, which is a function that, given an element from the list returns its sorting key.

Older Python versions used a different approach using the cmp parameter instead, which is a function that, given two elements from the list returns a negative number if the first is less than the second, zero if there are equals and a positive number if the first is greater. At some point, this parameter was deprecated and wasn't included in Python 3.

The other day I wanted to sort a list of elements in a way that a cmp function was much more easier to write than a key one. I didn't wanted to use a deprecated feature so I read the documentation and I found that there is a funtion named cmp_to_key in the functools module which, as his name states, receives a cmp function and returns a key one... or that's what I thought until I read the source code (or at least an equivalent version) of this high level function included in the docs

def cmp_to_key(mycmp):
    'Convert a cmp= function into a key= function'
    class K(object):
        def __init__(self, obj, *args):
            self.obj = obj
        def __lt__(self, other):
            return mycmp(self.obj, other.obj) < 0
        def __gt__(self, other):
            return mycmp(self.obj, other.obj) > 0
        def __eq__(self, other):
            return mycmp(self.obj, other.obj) == 0
        def __le__(self, other):
            return mycmp(self.obj, other.obj) <= 0
        def __ge__(self, other):
            return mycmp(self.obj, other.obj) >= 0
        def __ne__(self, other):
            return mycmp(self.obj, other.obj) != 0
    return K

Despite the fact that cmp_to_key works as expected, I get surprised by the fact that this function doesn't return a function but a K class instead. Why? How does it work? My guess it that the sorted function internally checks whether cmp is a function or a K class or something similar, but I'm not sure.

P.S.: Despite this weirdness, I found that K class is very useful. Check this code:

from functools import cmp_to_key

def my_cmp(a, b):
    # some sorting comparison which is hard to express using a key function

class MyClass(cmp_to_key(my_cmp)):
    ...

This way, any list of instances of MyClass can be, by default, sorted by the criteria defined in my_cmp

Marco Bonelli
  • 63,369
  • 21
  • 118
  • 128
matiascelasco
  • 1,145
  • 2
  • 13
  • 18
  • Here's the source code for cmp_to_key: https://github.com/python/cpython/blob/main/Lib/functools.py#L206 – Justin Harris Nov 30 '22 at 04:15
  • Why would sort functions need to check if something is a function or the K class? They could just call the key function or each element and compare them: `key(a) < key(b)`. So as long as `key` works like a callable, then it's fine. See https://en.wikipedia.org/wiki/Duck_typing – Justin Harris Nov 30 '22 at 04:17
  • 1
    I get `TypeError: cannot create 'functools.KeyWrapper' instances` when trying to define `class MyClass(cmp_to_key(my_cmp)):`. So I guess it is not a reliable usage to inherit like this - seems the `_functools` (C implementation version) does not support doing that. – wim Dec 13 '22 at 06:18

3 Answers3

53

No, sorted function (or list.sort) internally does not need to check if the object it received is a function or a class . All it cares about is that the object it received in key argument should be callable and should return a value that can be compared to other values when called.

Classes are also callable , when you call a class , you receive the instance of that class back.

To answer your question, first we need to understand (atleast at a basic level) how key argument works -

  1. The key callable is called for each element and it receives back the object with which it should sort.

  2. After receiving the new object, it compares this to other objects (again received by calling the key callable with the othe element).

Now the important thing to note here is that the new object received is compared against other same objects.

Now onto your equivalent code, when you create an instance of that class, it can be compared to other instances of the same class using your mycmp function. And sort when sorting the values compares these objects (in-effect) calling your mycmp() function to determine whether the value is less than or greater than the other object.

Example with print statements -

>>> def cmp_to_key(mycmp):
...     'Convert a cmp= function into a key= function'
...     class K(object):
...         def __init__(self, obj, *args):
...             print('obj created with ',obj)
...             self.obj = obj
...         def __lt__(self, other):
...             print('comparing less than ',self.obj)
...             return mycmp(self.obj, other.obj) < 0
...         def __gt__(self, other):
...             print('comparing greter than ',self.obj)
...             return mycmp(self.obj, other.obj) > 0
...         def __eq__(self, other):
...             print('comparing equal to ',self.obj)
...             return mycmp(self.obj, other.obj) == 0
...         def __le__(self, other):
...             print('comparing less than equal ',self.obj)
...             return mycmp(self.obj, other.obj) <= 0
...         def __ge__(self, other):
...             print('comparing greater than equal',self.obj)
...             return mycmp(self.obj, other.obj) >= 0
...         def __ne__(self, other):
...             print('comparing not equal ',self.obj)
...             return mycmp(self.obj, other.obj) != 0
...     return K
...
>>> def mycmp(a, b):
...     print("In Mycmp for", a, ' ', b)
...     if a < b:
...         return -1
...     elif a > b:
...         return 1
...     return 0
...
>>> print(sorted([3,4,2,5],key=cmp_to_key(mycmp)))
obj created with  3
obj created with  4
obj created with  2
obj created with  5
comparing less than  4
In Mycmp for 4   3
comparing less than  2
In Mycmp for 2   4
comparing less than  2
In Mycmp for 2   4
comparing less than  2
In Mycmp for 2   3
comparing less than  5
In Mycmp for 5   3
comparing less than  5
In Mycmp for 5   4
[2, 3, 4, 5]
Anand S Kumar
  • 88,551
  • 18
  • 188
  • 176
  • 3
    `Why 2 is being compared to 4 twice`, and `why 5 is not compared to 2?`. Let me know. I don't have any idea, this is too overwhelming. What happens when `cmp_to_key(-1)` is returned to the `key` value? – Alok Oct 04 '19 at 15:53
  • @Alok I haven’t gotten my head around why 2 is compared with 4 twice but I can explain the 5 not being compared with 2. If you imagine the list having each value moved as you go, and the position of each item is determined from left to right - by the time you’re finding a home for 5 you have 2, 3, 4 on the left of it. We know those are already in the right order. So you compare 5 with the middle of those values, which is 3. 5 is not less than 3 so we already know it’s not less than anything on the left of 3 either, so we don’t need to check 2. – SDJMcHattie Feb 09 '20 at 21:12
  • This is really, really bad and so memory inefficient... – Ievgen Jun 02 '21 at 20:03
  • Read here https://docs.python.org/3/howto/sorting.html#the-old-way-using-the-cmp-parameter – pankaj Aug 19 '21 at 10:40
6

I just realized that, despite not being a function, the K class is a callable, because it's a class! and classes are callables that, when called, creates a new instance, initializes it by calling the corresponding __init__ and then returns that instance.

This way it behaves as a key function because K receives the object when called, and wraps this object in a K instance, which is able to be compared against other K instances.

Correct me if I'm wrong. I feel I'm getting into the, unfamiliar to me, meta-classes territory.

matiascelasco
  • 1,145
  • 2
  • 13
  • 18
  • Just for the record, Python uses "metaclasses" to refer to something even more meta and unfamiliar than this territory. But anyway, yes you have it right. Classes and functions are mostly interchangeable in Python - anything that takes a function to call and doesn't go out of its way to reject classes can also be given a class (this can be surprising because in many other languages classes are special and construction is only visually the same as a function call - but in Python if it looks like a function call, anything callable can go there). – mtraceur Dec 17 '22 at 19:53
1

I didn't look into the source, but i believe the result of the key function can also be anything, and therefore also a comparable object. And cmp_to_key just masks creation of those K objects, which are than compared to each other while sort does its work.

If i try to create a sort on departments and reverse room numbers like this:

departments_and_rooms = [('a', 1), ('a', 3),('b', 2)]
departments_and_rooms.sort(key=lambda vs: vs[0])
departments_and_rooms.sort(key=lambda vs: vs[1], reverse=True)
departments_and_rooms # is now [('a', 3), ('b', 2), ('a', 1)]

That's not what i want, and i think sort is only stable on each call, the documentation is misleading imo:

The sort() method is guaranteed to be stable. A sort is stable if it guarantees not to change the relative order of elements that compare equal — this is helpful for sorting in multiple passes (for example, sort by department, then by salary grade).

The old style approach works because each result calling the K class returns a K instance and compares to results of mycmp:

def mycmp(a, b):                             
    return cmp((a[0], -a[1]), (b[0], -b[1]))

departments_and_rooms = [('a', 1), ('a', 3),('b', 2)]
departments_and_rooms.sort(key=cmp_to_key(mycmp))
departments_and_rooms # is now [('a', 3), ('a', 1), ('b', 2)]

It's an important difference, that one can't do multiple passes just out of the box. The values/results of the key function have to be sortable relative in order, not the elements to be sorted. Therefore is the cmp_to_key mask: create those comparable objects one needs to order them.

Hope that helps. and thanks for the insight in the cmp_to_key code, helped me alot also :)

seishin
  • 41
  • 4
  • I didn't get the same result after running your first piece of core. I got [('a', 3), ('b', 2), ('a', 1)] instead. – matiascelasco Dec 18 '15 at 01:26
  • 1
    You are right, was a copy paste error on my side. As of meta-classes, this K class usage is just normal object instantiation. – seishin Dec 20 '15 at 08:10
  • 1
    Can't understand how the whole stable sort thing is related to the topic. Can you please explain better? – matiascelasco Dec 29 '15 at 03:30