from collections import defaultdict
phn_dictionary = {"actual": [], "predicted": []}
phn_dict = defaultdict(lambda: phn_dictionary)
phn_dict["qwe"]["actual"].extend([123,456])

phn_dict
>>>defaultdict(<function __main__.<lambda>>,
        {'qwe': {'actual': [123, 456], 'predicted': []}})

phn_dict["asd"]["actual"].extend([123,456])
phn_dict
>>>defaultdict(<function __main__.<lambda>>,
        {'asd': {'actual': [123, 456, 123, 456], 'predicted': []},
         'qwe': {'actual': [123, 456, 123, 456], 'predicted': []}})

I am running Python 3.6.4 (64-bit). I need a defaultdict that produces phn_dictionary as its default, as shown in the code above. I don't know in advance which keys, such as "asd" and "qwe", I will be accessing. As the output shows, when I extend the "actual" list under "asd", the "actual" list of both "asd" AND "qwe" is extended. Is this a bug, or am I doing something wrong?

Usama Arif
  • The default here isn't "a dict that looks like `phn_dictionary`". The default is **`phn_dictionary`**. – user2357112 Mar 07 '18 at 19:32
  • Thanks, I will update the wording of the question – Usama Arif Mar 07 '18 at 19:35
  • @UsamaArif it wasn't the wording he was talking about – FHTMitchell Mar 07 '18 at 19:36
  • This is not quite the same problem as mutable shared default values, but the visible confusion is the same, so [the FAQ about those](https://docs.python.org/3/faq/programming.html#why-are-default-values-shared-between-objects) may be helpful for understanding this case. – abarnert Mar 07 '18 at 19:37
  • @UsamaArif no, you don't understand, the default **is the same dictionary** as `phn_dictionary` – juanpa.arrivillaga Mar 07 '18 at 20:01

3 Answers

The problem is that lambda: phn_dictionary is a function that returns phn_dictionary—the exact same dictionary object—every time you call it. So, you end up with the same dictionary as the value for a bunch of keys. Every time you append through one key, that's visible on all other keys.
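
A quick way to see this is to compare object identities. This is only a minimal sketch that reuses the names from the question; the keys are arbitrary.

```python
from collections import defaultdict

phn_dictionary = {"actual": [], "predicted": []}
phn_dict = defaultdict(lambda: phn_dictionary)

# Accessing two missing keys calls the factory twice,
# but both calls return the very same object.
first = phn_dict["qwe"]
second = phn_dict["asd"]

same_object = first is second and first is phn_dictionary
print(same_object)  # True

# A change made through one key shows up under the other.
first["actual"].append(123)
print(second["actual"])  # [123]
```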

What you want is not this dictionary, but a new dictionary that starts off as a copy of that one. As Brendan Abel points out in a comment, you probably want a deep copy here—not just a new dict, but a new dict with new lists in it:

import copy
phn_dict = defaultdict(lambda: copy.deepcopy(phn_dictionary))
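
To illustrate why a shallow copy is not enough here, the sketch below compares `dict.copy()` with `copy.deepcopy()`. The name `template` is a hypothetical stand-in for `phn_dictionary`.

```python
import copy

# Hypothetical stand-in for phn_dictionary.
template = {"actual": [], "predicted": []}

shallow = template.copy()        # new dict, same inner lists
deep = copy.deepcopy(template)   # new dict, new inner lists

shallow_shares_list = shallow["actual"] is template["actual"]
deep_shares_list = deep["actual"] is template["actual"]
print(shallow_shares_list)  # True
print(deep_shares_list)     # False

# Appending through the shallow copy leaks into the template;
# the deep copy is unaffected.
shallow["actual"].append(1)
print(template["actual"])  # [1]
print(deep["actual"])      # []
```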

Or, maybe this is clearer (relying on the fact that the original lists should always be empty):

phn_dict = defaultdict(lambda: {key: [] for key in phn_dictionary})

Or, if you don't need phn_dictionary anywhere except here, just use Brendan's answer and create the dict from scratch in the function:

phn_dict = defaultdict(lambda: {"actual": [], "predicted": []})

If this is a stripped-down sample, and the real dict is much larger, or a variable, etc., obviously the last version won't work, but if this is the real code, it's the simplest.

There are other ways to solve this, some of which may be clearer, but this is the one that fits best into an inline lambda, which seems to match the way you're thinking.
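
One such alternative, sketched here under the assumption that the record layout is fixed, is a named factory function in place of the lambda. The helper name `new_phn_entry` is hypothetical and does not appear in the question.

```python
from collections import defaultdict


def new_phn_entry():
    """Return a fresh record with its own empty lists (hypothetical helper)."""
    return {"actual": [], "predicted": []}


# defaultdict calls new_phn_entry() once per missing key,
# so every key gets an independent dict and independent lists.
phn_dict = defaultdict(new_phn_entry)
phn_dict["qwe"]["actual"].extend([123, 456])
phn_dict["asd"]["actual"].extend([789])

print(dict(phn_dict))
```

A named function also gives the default a docstring and a readable repr, which the lambda does not.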

abarnert
  • Note that unless you're doing a deep copy, the lists in the values of the copied dictionary will still be shared. – Brendan Abel Mar 07 '18 at 19:39
  • @BrendanAbel, no, since `phn_dictionary` is empty when cloned. – Jean-François Fabre Mar 07 '18 at 19:40
  • @Jean-FrançoisFabre `phn_dictionary` isn't empty; it is `{"actual": [], "predicted": []}` – Brendan Abel Mar 07 '18 at 19:42
  • yes. I mean: the lists are empty. but I think your answer is clearer. No need to clone a variable that no one else uses. – Jean-François Fabre Mar 07 '18 at 19:42
  • @Jean-FrançoisFabre Yes, but the copied dictionaries are both referencing the *same* empty list. The lists aren't copied when doing a `dict.copy`. If you modify the list in one dictionary, it will affect the same list in all the other copied dictionaries. – Brendan Abel Mar 07 '18 at 19:44
  • Hi, thanks for the answers. `phn_dict = defaultdict(lambda: phn_dictionary.copy())` doesn't work, but `phn_dict = defaultdict(lambda: copy.deepcopy(phn_dictionary))` does work – Usama Arif Mar 07 '18 at 19:44
  • @Jean-FrançoisFabre Copying is fine. You can just use `copy.deepcopy()` instead to ensure the dictionary values get copied as well. – Brendan Abel Mar 07 '18 at 19:46
  • Thanks @BrendanAbel. The lists are meant to be extended later on. – Usama Arif Mar 07 '18 at 19:47
  • @Jean-FrançoisFabre, I get the same results as the question says using `copy`, as it's a shallow copy and all the "actual" keys are extended – Usama Arif Mar 07 '18 at 19:48
  • Ohh brainfart, sorry, of course you _need_ deepcopy! Brendan's answer is simpler & no risk of sharing the data. – Jean-François Fabre Mar 07 '18 at 19:49
  • @Jean-FrançoisFabre: `copy` gives you a new dict, but with the same lists in it. That probably isn't what he wants here. I've edited the answer to explain that. (I'm not sure why I got a downvote _after_ fixing and explaining the problem…) – abarnert Mar 07 '18 at 19:49
  • you should remove `phn_dict = defaultdict(lambda: phn_dictionary.copy())` since it's wrong (note: I didn't downvote, I upvoted when it was wrong, silly me :)) – Jean-François Fabre Mar 07 '18 at 19:50
  • someone retracted the downvote. no one downvoted. @UsamaArif we know: you cannot upvote or downvote with 1 rep. – Jean-François Fabre Mar 07 '18 at 19:57

It's because both keys refer to the same dictionary. Defining the factory to return a dictionary literal fixes the issue:

phn_dict = defaultdict(lambda: {"actual": [], "predicted": []})

This is because each time the default factory lambda is called, it returns a new dictionary instead of just returning the same dictionary over and over.

Alternatively, you could use copy.deepcopy:

import copy
phn_dict = defaultdict(lambda: copy.deepcopy(phn_dictionary))

This will copy the defined dictionary and all the internal values as well.

Brendan Abel

Other answers point out the reuse of the references of the inner lists.

Unless you really want to raise a KeyError if the object is used with a wrong key, you could go with a defaultdict of a defaultdict of lists:

from collections import defaultdict
phn_dict = defaultdict(lambda: defaultdict(list))
phn_dict["qwe"]["actual"].extend([123,456])
phn_dict["qwe"]["predicted"].extend([768,333])
print(dict(phn_dict)) # clearer repr

result:

{'qwe': defaultdict(<class 'list'>, {'actual': [123, 456], 'predicted': [768, 333]})}
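
The trade-off mentioned above can be sketched as follows. The names `strict` and `loose` and the misspelled key `"actaul"` are made up for illustration.

```python
from collections import defaultdict

# Plain-dict factory: inner keys are fixed.
strict = defaultdict(lambda: {"actual": [], "predicted": []})
# Nested defaultdict: any inner key is created on first access.
loose = defaultdict(lambda: defaultdict(list))

# A misspelled inner key fails loudly with the plain-dict factory...
try:
    strict["qwe"]["actaul"].append(1)
    strict_raised = False
except KeyError:
    strict_raised = True

# ...but silently creates a new list with the nested defaultdict.
loose["qwe"]["actaul"].append(1)

print(strict_raised)         # True
print(sorted(loose["qwe"]))  # ['actaul']
```
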
Jean-François Fabre