Unwanted behaviour from dict.fromkeys

Question

I'd like to initialise a dictionary of sets (in Python 2.6) using dict.fromkeys, but the resulting structure behaves strangely. More specifically:

>>> x = {}.fromkeys(range(10), set([]))
>>> x
{0: set([]), 1: set([]), 2: set([]), 3: set([]), 4: set([]), 5: set([]), 6: set([]), 7: set([]), 8: set([]), 9: set([])}
>>> x[5].add(3)
>>> x
{0: set([3]), 1: set([3]), 2: set([3]), 3: set([3]), 4: set([3]), 5: set([3]), 6: set([3]), 7: set([3]), 8: set([3]), 9: set([3])}

I obviously don't want to add 3 to all sets, only to the set that corresponds to x[5]. Of course, I can avoid the problem by initialising x without fromkeys, but I'd like to understand what I'm missing here.

They're all the same set. Sets, lists, dictionaries and any other object are reference types, and when you assign them to another variable, only the reference is copied, not the actual object. `fromkeys` must use assignment to associate the set with each key, but as you can see, this does not copy the set. I'm not sure how to get around this, aside from creating the dictionary in a different way. — Samir Talwar, Jun 08 '10 at 19:19

score 19 · Accepted Answer · answered Jun 08 '10 at 19:23

The second argument to dict.fromkeys is just a value. You've created a dictionary that has the same set as the value for every key. Presumably you understand the way this works:

>>> a = set()
>>> b = a
>>> b.add(1)
>>> b
set([1])
>>> a
set([1])

You're seeing the same behavior here: in your case, x[0], x[1], x[2] (etc.) are all different ways to access the exact same set object.

This is a bit easier to see with objects whose string representation includes their memory address, where you can see that they're identical:

>>> dict.fromkeys(range(2), object())
{0: <object object at 0x1001da080>,
 1: <object object at 0x1001da080>}
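A small sketch of what this sharing means in practice. The variable names are illustrative and not taken from the answer. It also shows why fromkeys is harmless with an immutable value such as 0: `+=` on an int rebinds the key to a new object instead of mutating the shared one.

```python
# fromkeys stores the very same object under every key.
shared = dict.fromkeys(range(3), set())
ids = set(id(v) for v in shared.values())
print(len(ids))  # 1: three keys, one set object

# With an immutable value the sharing does no harm:
# += rebinds counts[1] to a new int rather than changing the shared one.
counts = dict.fromkeys(range(3), 0)
counts[1] += 1
print(counts)  # {0: 0, 1: 1, 2: 0}
```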

score 19 · Answer 2 · answered Jun 08 '10 at 20:30


You can do this with a generator expression:

x = dict( (i,set()) for i in range(10) )

In Python 2.7 and later, including Python 3, you can use a dictionary comprehension:

x = { i : set() for i in range(10) }

In both cases, the expression set() is evaluated once per element, so every key gets its own set, instead of being evaluated a single time with the resulting object shared by every key.
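A quick check that the generator-expression form really does give independent sets. The `defaultdict` variant at the end is not part of this answer; it is one possible alternative, assuming it is acceptable for sets to be created lazily on first access rather than up front.

```python
from collections import defaultdict

# One fresh set per key: set() runs once for each i.
x = dict((i, set()) for i in range(10))
x[5].add(3)
print(3 in x[5])     # True
print(3 in x[4])     # False
print(x[4] is x[5])  # False

# Alternative (an assumption, not from the answer): build sets on demand.
y = defaultdict(set)
y[5].add(3)
print(sorted(y))     # [5]: only the key that was touched exists
```

Note that with `defaultdict` merely reading `y[7]` creates an entry for 7, which may or may not be what you want.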

Derek Ledbetter

+1 for providing the solution even after the accepted answer explained it well. – Randy Dec 23 '13 at 22:51
If instead of sets I want to initialize lists, x = { i : [] for i in range(10) } causes SyntaxError while the dict( (i,[]) for i in range(10) ) does not. – Eduardo Feb 20 '14 at 09:57
It's worth pointing out that dictionary comprehensions work in Python 2.7 and up. – ABM Aug 12 '14 at 20:39

score 3 · Answer 3 · answered Jun 08 '10 at 19:31

This happens because of the following loop in dictobject.c:

while (_PyDict_Next(seq, &pos, &key, &oldvalue, &hash))
{
    Py_INCREF(key);
    Py_INCREF(value);
    if (insertdict(mp, key, hash, value))
        return NULL;
}

The value is your "set([])". It is evaluated only once; the resulting object then has its reference count incremented and is inserted into the dictionary for each key. It is not re-evaluated every time an entry is added to the dict.
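In Python terms, the loop above behaves roughly like the sketch below. `fromkeys_sketch` and `make_set` are hypothetical helpers written for illustration; they are not CPython's implementation, only a model of the behaviour described.

```python
def fromkeys_sketch(keys, value=None):
    """Hypothetical pure-Python stand-in for dict.fromkeys (illustration only)."""
    result = {}
    for key in keys:
        # The same `value` reference is stored each time; nothing is copied
        # and no expression is re-evaluated.
        result[key] = value
    return result


calls = []


def make_set():
    """Record each call so we can count how often the argument is evaluated."""
    calls.append(1)
    return set()


d = fromkeys_sketch(range(10), make_set())
print(len(calls))    # 1: make_set() ran once, before the function was entered
print(d[0] is d[9])  # True
```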

Francis Davey · Answer 4 · 2010-06-09T06:57:56.140

The reason it works this way is that set([]) creates an object (a set object). fromkeys then uses that specific object for all of its dictionary entries. Consider:

>>> x
{0: set([]), 1: set([]), 2: set([]), 3: set([]), 4: set([]), 5: set([]), 
6: set([]), 7: set([]), 8: set([]), 9: set([])}
>>> x[0] is x[1]
True

All the sets are the same!

edited Jun 09 '10 at 06:57

answered Jun 08 '10 at 19:26

You should really be comparing identities: `x[0] is x[1]`. – Gary Kerr Jun 08 '10 at 20:19

score 0 · Answer 5 · answered Jun 08 '10 at 20:03

# To do what you want:

import copy

s = set([])
x = {}
for n in range(5):
    # Give each key its own independent copy of s
    x[n] = copy.deepcopy(s)
x[2].add(3)
print(x)

# Output (Python 2):
# {0: set([]), 1: set([]), 2: set([3]), 3: set([]), 4: set([])}

kruiser

No need for `deepcopy`. `x[n] = set()` creates a new set for each value. – Gary Kerr Jun 08 '10 at 20:18
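To illustrate the point in this comment: if the starting set is not empty, a shallow copy via `set(template)` (or `template.copy()`) should be enough in most cases, since set elements must be hashable and are usually immutable. The name `template` is made up for this sketch.

```python
template = set([1, 2])

x = {}
for n in range(5):
    # set(template) builds a new set with the same elements (a shallow copy).
    x[n] = set(template)

x[2].add(3)
print(sorted(x[2]))      # [1, 2, 3]
print(sorted(x[0]))      # [1, 2]
print(sorted(template))  # [1, 2]: the original is untouched
```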
