-2

Why does setdefault not increment the count by 1 for every occurrence in a when it is used inside a dictionary comprehension, although it does in a loop? What's going on here?

Alternative solutions are great. I'm mostly interested in understanding why this doesn't work.

A loop with setdefault works

a = [1,1,2,2,2,3,3]

b = {}

for x in a:
    b[x] = b.setdefault(x, 0) + 1

b

Out[4]: {1: 2, 2: 3, 3: 2}

A dictionary comprehension with setdefault doesn't work

b = {k: b.setdefault(k, 0) + 1 for k in a}

b

Out[7]: {1: 1, 2: 1, 3: 1}

Update

Thanks for the answers. I wanted to try timing the solutions.

import timeit
from collections import Counter


def using_get(a):
    b = {}
    for x in a:
        b[x] = b.get(x, 0) + 1
    return b


def using_setdefault(a):
    b = {}
    for x in a:
        b[x] = b.setdefault(x, 0) + 1
    return b


timeit.timeit(lambda: Counter(a), number=1000000)
Out[3]: 15.19974103783569

timeit.timeit(lambda: using_get(a), number=1000000)
Out[4]: 3.1597984457950474

timeit.timeit(lambda: using_setdefault(a), number=1000000)
Out[5]: 3.231248461129759
  • 2
    Because `b` is not defined at the time the comprehension is executed. BTW, have you tried `b = collections.Counter(a)`? – tobias_k Sep 28 '15 at 10:12
  • 1
    `dict.setdefault` is an odd way to do this anyway - you'd usually use it with a mutable value, not an immutable one (e.g. `d.setdefault(x, []).append(...)`). I would have written that as `b[x] = b.get(x, 0) + 1`. – jonrsharpe Sep 28 '15 at 10:13
  • *doesn't work*. Actually this works perfectly. If `b = {}` initially, you will get a new dictionary (freshly assigned to `b`) with all the keys from `a` and every value `= 0 + 1`. The original `b` is, however, gone. That's probably not what you expected, though. If `b` wasn't defined before the comprehension, you get a `NameError`. – dhke Sep 28 '15 at 10:17

3 Answers

4

There is no dictionary yet inside the dict comprehension. You are building a completely new dictionary, replacing whatever b was bound to before.

In other words, the b.setdefault() call in your dictionary comprehension operates on a totally different dictionary; it has nothing to do with the object being built by the comprehension.
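
To make this concrete, here is a minimal sketch that keeps a second reference to the original dictionary so it can be inspected afterwards. The names `rebinding_demo` and `old_b` are introduced only for illustration and do not appear in the question.

```python
def rebinding_demo(a):
    old_b = {}   # hypothetical extra name, so the original dict stays reachable
    b = old_b

    # setdefault() is called on the dict that b refers to right now (old_b),
    # not on the dict the comprehension is building.
    b = {k: b.setdefault(k, 0) + 1 for k in a}
    return old_b, b


old_b, new_b = rebinding_demo([1, 1, 2, 2, 2, 3, 3])
print(old_b)           # the original dict only received the default 0 per key
print(new_b)           # the new dict only ever saw 0 + 1
print(new_b is old_b)  # False: b was rebound to a brand-new dict
```

The defaults land in the old dictionary, which is never incremented, so every lookup keeps returning 0.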

In fact, your dictionary comprehension only works if b was bound to an object with a .setdefault() method before you run the expression. If b is not yet defined, or not bound to an object with such a method, it simply fails with an exception:

>>> a = [1,1,2,2,2,3,3]
>>> b = {k: b.setdefault(k, 0) + 1 for k in a}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <dictcomp>
NameError: global name 'b' is not defined
>>> b = 42
>>> b = {k: b.setdefault(k, 0) + 1 for k in a}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <dictcomp>
AttributeError: 'int' object has no attribute 'setdefault'

You cannot do what you want with a dictionary comprehension, unless you group your numbers, which requires sorting and itertools.groupby(); this is not an efficient approach (requiring O(NlogN) steps rather than O(N)):

>>> from itertools import groupby
>>> {k: sum(1 for _ in group) for k, group in groupby(sorted(a))}
{1: 2, 2: 3, 3: 2}
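
As a rough illustration of why the sort is needed, the sketch below runs the same comprehension with and without sorting. The helper `count_with_groupby` and its `presort` flag are made up for this example; groupby() only groups consecutive equal items, so on unsorted input a later run of a key overwrites the count of an earlier run.

```python
from itertools import groupby


def count_with_groupby(values, presort=True):
    # groupby() starts a new group every time the key changes,
    # so equal items must be adjacent to be counted together.
    items = sorted(values) if presort else values
    return {k: sum(1 for _ in group) for k, group in groupby(items)}


unsorted_values = [1, 2, 1, 2, 2]
without_sort = count_with_groupby(unsorted_values, presort=False)
with_sort = count_with_groupby(unsorted_values)
print(without_sort)  # counts of the last run of each key only
print(with_sort)     # full counts
```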

Note that the standard library already comes with a tool to do counting; see the collections.Counter() object:

>>> from collections import Counter
>>> Counter(a)
Counter({2: 3, 1: 2, 3: 2})
Martijn Pieters
2

Actually, your second snippet raises a NameError if you try it in a clean namespace (one where there's no prior definition of b):

bruno@bigb:~/Work/playground$ python
Python 2.7.3 (default, Jun 22 2015, 19:33:41) 
>>> a = [1,1,2,2,2,3,3]
>>> b = {k: b.setdefault(k, 0) + 1 for k in a}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <dictcomp>
NameError: global name 'b' is not defined

Which should give you a hint at what went wrong.

The statement:

b = {k: b.setdefault(k, 0) + 1 for k in a}

first evaluates (well, actually tries to...) the right-hand side expression {k: b.setdefault(k, 0) + 1 for k in a}, and then binds the result to name b.

If b is not defined when the expression is evaluated, you get the above exception (of course). If it is defined and bound to a dict (or to anything else that has a setdefault(x, y) method, FWIW), you get the result of calling setdefault() on whatever b is bound to at that point.
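
One way to picture this is to spell the statement out as a loop. This is only an approximate equivalent, and the names `comprehension_as_loop` and `result` are hypothetical; the point is that the new dictionary is a separate object from the one b refers to while the right-hand side runs.

```python
def comprehension_as_loop(a, b):
    result = {}  # the new dict that the comprehension builds
    for k in a:
        # b is still the *old* object here; result is never consulted
        result[k] = b.setdefault(k, 0) + 1
    return result


a = [1, 1, 2, 2, 2, 3, 3]
b = {}
b = comprehension_as_loop(a, b)  # only now is the name b rebound
print(b)
```

Since the old dictionary is only ever given the default and never updated with the incremented value, each key ends up with 0 + 1.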

bruno desthuilliers
2

This does not work because b is not defined until the dictionary comprehension has completed. Normally you should get a NameError here; if you don't, it is because you had already defined b earlier, in which case setdefault() is called on that earlier dictionary, not on the one being built.

Having said that: It seems that you can just use collections.Counter for this.

>>> import collections
>>> a = [1,1,2,2,2,3,3]
>>> collections.Counter(a)
Counter({2: 3, 1: 2, 3: 2})
johnson
tobias_k