43

There is a more general question here: In what situations should the built-in operator module be used in Python?

The top answer claims that operator.itemgetter(x) is "neater" than, presumably, lambda a: a[x]. I feel the opposite is true.

Are there any other benefits, like performance?

Community
thebossman

7 Answers

29

You shouldn't worry about performance unless your code is in a tight inner loop, and is actually a performance problem. Instead, use code that best expresses your intent. Some people like lambdas, some like itemgetter. Sometimes it's just a matter of taste.

itemgetter is more powerful when, say, you need to get a number of elements at once. For example:

operator.itemgetter(1,3,5)

is the same as:

lambda s: (s[1], s[3], s[5])
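
A minimal sketch of that equivalence, assuming a hypothetical sample string `s` (not part of the original answer). It also shows one detail worth knowing: with a single index, itemgetter returns the bare item rather than a 1-tuple.

```python
from operator import itemgetter

s = "abcdefg"  # hypothetical sample sequence, chosen only for illustration

get_items = itemgetter(1, 3, 5)
get_items_lambda = lambda seq: (seq[1], seq[3], seq[5])

# Both callables return a tuple of the requested items.
assert get_items(s) == get_items_lambda(s) == ("b", "d", "f")

# With a single index, itemgetter returns the item itself, not a 1-tuple.
assert itemgetter(1)(s) == "b"
```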
Ned Batchelder
  • 14
    "itemgetter is more powerful"? That seems backwards; I can do lots of things with a lambda that I can't do with itemgetter. It can be more compact, I guess. – DSM Jul 02 '12 at 02:37
  • 1
    @DSM: I think he meant powerful in terms of C performance as opposed to flexibility – jdi Jul 02 '12 at 02:41
  • 14
    I'm pretty sure Ned meant it's more powerful than the [] operator, which was what the questioner asked. Obviously it makes no sense that it's more powerful than arbitrary Python code in a lambda. – Nicholas Riley Jul 02 '12 at 02:42
  • 1
    I meant that itemgetter is more expressive in a compact way. Lambda is certainly more general, in that you can use any expression you like. – Ned Batchelder Jul 02 '12 at 02:42
  • 3
    And generality != power, if it were we'd all be writing assembly. – Gordon Wrigley Mar 24 '14 at 01:47
  • 2
    I think the right term is specific. Specificity is good *because* it is restrictive. – Mateen Ulhaq Oct 24 '18 at 22:19
17

There are benefits in some situations, here is a good example.

>>> data = [('a',3),('b',2),('c',1)]
>>> from operator import itemgetter
>>> sorted(data, key=itemgetter(1))
[('c', 1), ('b', 2), ('a', 3)]

This use of itemgetter is great because it makes everything clear while also being faster as all operations are kept on the C side.

>>> sorted(data, key=lambda x:x[1])
[('c', 1), ('b', 2), ('a', 3)]

Using a lambda is not as clear and is also slower; it is preferred not to use lambda unless you have to. E.g., list comprehensions are preferred over using map with a lambda.
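
As a rough illustration of the map comparison, here is a sketch that reuses the `data` list from above; the variable names are hypothetical. All three spellings produce the same list, so the choice comes down to readability and, to a lesser degree, speed.

```python
from operator import itemgetter

data = [('a', 3), ('b', 2), ('c', 1)]

# map with a lambda: works, but is usually considered the least readable form.
second_via_map_lambda = list(map(lambda x: x[1], data))

# The list comprehension generally preferred over map + lambda.
second_via_comprehension = [x[1] for x in data]

# map with itemgetter: no lambda needed if you do want to use map.
second_via_map_itemgetter = list(map(itemgetter(1), data))

assert (second_via_map_lambda
        == second_via_comprehension
        == second_via_map_itemgetter
        == [3, 2, 1])
```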

jamylak
  • 22
    Personally I find lambdas clearer in these cases. What's "clear" is not an objective claim. `itemgetter` and friends are nothing more than particular named lambdas (conceptually). I suspect that people who are *already comfortable* with lambdas in general (perhaps because they do a lot of functional programming) find `lambda` clearer (since they already know `lambda` and they already know `thing[index]`, so the lambda "just says what it means", whereas `itemgetter` requires memorising an additional name), while those who aren't as used to thinking with lambdas find `itemgetter` easier. – Ben Jul 02 '12 at 02:50
  • 1
    @Ben "named lambdas" is an oxymoron… lambdas are anonymous by definition – Hendrikto Jun 04 '21 at 11:24
15

Performance. It can make a big difference. In the right circumstances, you can get a bunch of stuff done at the C level by using itemgetter.

I think the claim of what is clearer really depends on which you use most often, and would be very subjective.

John La Rooy
15

When choosing a function for the key parameter of sorted() or min(), given the choice between, say, operator.itemgetter(1) and lambda x: x[1], the former is typically significantly faster in both cases:


Using sorted()


The compared functions are defined as follows:

import operator


def sort_key_itemgetter(items, key=1):
    return sorted(items, key=operator.itemgetter(key))


def sort_key_lambda(items, key=1):
    return sorted(items, key=lambda x: x[key])

Result: sort_key_itemgetter() is faster by ~10% to ~15%.

(Full analysis here)


Using min()


The compared functions are defined as follows:

import operator


def min_key_itemgetter(items, key=1):
    return min(items, key=operator.itemgetter(key))


def min_key_lambda(items, key=1):
    return min(items, key=lambda x: x[key])

Result: min_key_itemgetter() is faster by ~20% to ~60%.

(Full analysis here)
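
To try a comparison like this yourself, here is a minimal sketch using timeit. It is not the benchmark behind the results above: the input size, seed, and repetition count are arbitrary assumptions, and the timings will vary by machine and Python version.

```python
import operator
import random
import timeit

random.seed(0)  # arbitrary seed so the input is reproducible
items = [(i, random.random()) for i in range(1_000)]  # hypothetical test data

key_itemgetter = operator.itemgetter(1)
key_lambda = lambda x: x[1]

# Sanity check: both key functions select the same minimum element.
assert min(items, key=key_itemgetter) == min(items, key=key_lambda)

# Time repeated calls to min() with each key function.
t_itemgetter = timeit.timeit(lambda: min(items, key=key_itemgetter), number=200)
t_lambda = timeit.timeit(lambda: min(items, key=key_lambda), number=200)

print(f"itemgetter: {t_itemgetter:.4f}s  lambda: {t_lambda:.4f}s")
```

Building the key functions once, outside the timed call, keeps the measurement focused on the per-element call cost rather than on constructing the key.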

norok2
  • since the gain is proportional, you could use a log-plot to convey that clearer. otherwise, thanks for posting the benchmark! – LudvigH Dec 15 '20 at 12:40
8

As performance was mentioned, I've compared both methods, operator.itemgetter and lambda, and for a small list it turns out that operator.itemgetter outperforms lambda by about 10%. I personally like the itemgetter method, as I mostly use it when sorting and it has become like a keyword for me.

import operator
import timeit

x = [[12, 'tall', 'blue', 1],
     [2, 'short', 'red', 9],
     [4, 'tall', 'blue', 13]]


def sortOperator():
    # Sort in place by columns 1 and 2 using itemgetter.
    x.sort(key=operator.itemgetter(1, 2))


def sortLambda():
    # Same sort key, built with a lambda instead.
    x.sort(key=lambda row: (row[1], row[2]))


if __name__ == "__main__":
    print(timeit.timeit(stmt="sortOperator()", setup="from __main__ import sortOperator", number=10**7))
    print(timeit.timeit(stmt="sortLambda()", setup="from __main__ import sortLambda", number=10**7))

>>Tuple: 9.79s, Single: 8.835s
>>Tuple: 11.12s, Single: 9.26s

Run on Python 3.6

user1767754
7

Leaving aside performance and code style, itemgetter is picklable, while lambda is not. This is important if the function needs to be saved, or passed between processes (typically as part of a larger object). In the following example, replacing itemgetter with lambda will result in a PicklingError.

from operator import itemgetter

def sort_by_key(sequence, key):
    return sorted(sequence, key=key)

if __name__ == "__main__":
    from multiprocessing import Pool

    items = [([(1,2),(4,1)], itemgetter(1)),
             ([(5,3),(2,7)], itemgetter(0))]

    with Pool(5) as p:
        result = p.starmap(sort_by_key, items)
    print(result)
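
The difference can also be seen without multiprocessing. The following is a small sketch that calls pickle directly, assuming a Python 3 version in which itemgetter objects support pickling; the variable names are hypothetical. The exact exception type raised for the lambda can differ between versions and contexts, so both likely candidates are caught.

```python
import pickle
from operator import itemgetter

# An itemgetter survives a pickle round trip and still works afterwards.
getter = itemgetter(1)
restored = pickle.loads(pickle.dumps(getter))
assert restored(("a", "b")) == "b"

# A lambda cannot be pickled by the standard pickle module.
try:
    pickle.dumps(lambda x: x[1])
    lambda_pickled = True
except (pickle.PicklingError, AttributeError) as exc:
    lambda_pickled = False
    print(f"lambda could not be pickled: {exc!r}")

assert not lambda_pickled
```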
eaglebrain
5

Some programmers understand and use lambdas, but there is a population of programmers who perhaps didn't take computer science and aren't clear on the concept. For those programmers itemgetter() can make your intention clearer. (I don't write lambdas and any time I see one in code it takes me a little extra time to process what's going on and understand the code).

If you're coding for other computer science professionals, go ahead and use lambdas if they are more comfortable. However, if you're coding for a wider audience, I suggest using itemgetter().

martineau
monkut