From a dictionary with repeated values, how to create a new one excluding the repeats and incrementing a counter inside the dictionary?

Question

Turn this:

a = {'1': {'name': 'Blue', 'qty': '1'},
     '2': {'name': 'Green', 'qty': '1'},
     '3': {'name': 'Blue', 'qty': '1'},
     '4': {'name': 'Blue', 'qty': '1'}}

into this:

b = {'1': {'name': 'Blue', 'qty': '3'},
     '2': {'name': 'Green', 'qty': '1'}}

I was able to exclude the repeated values but couldn't increment the 'qty' field.

b = {}

for k,v in a.iteritems():
    if v not in b.values():
        b[k] = v

possible duplicate of [how to count the repetition of the elements in a list python, django](http://stackoverflow.com/questions/18548710/how-to-count-the-repetition-of-the-elements-in-a-list-python-django) — ivan_pozdeev, Jan 20 '15 at 11:59
That's a funny looking data structure you've got there. Indexes starting from 1, as strings, as keys. :) You probably have your reasons, but still! — André Laszlo, Jan 20 '15 at 12:00
Don't mind much about the indexes. The result could also be a list. — Saulo Mendes, Jan 20 '15 at 12:10
If you can alter the format of `a`, I'd certainly recommend removing the redundant index. If not - @AndréLaszlo's answer (http://stackoverflow.com/a/28045057/838992) is really nice. It gets `b` into your format, but `result` is what you want without the redundant index. — J Richard Snape, Jan 20 '15 at 12:14
You might be able to use `collections.Counter` from the standard library. https://docs.python.org/2/library/collections.html#collections.Counter However, your data structure is not suited for the standard tools. If you had a list `["Blue", "Blue", "Green", "Green", "Blue"]` or a list of tuples `[("Blue", 1), ("Green", 1), ("Blue", 2), ("Green", 1)]`, it would be very easy to use Counter. — Håken Lid, Jan 20 '15 at 12:16

André Laszlo · Accepted Answer · 2015-01-20T12:42:36.447

4

This seems to work:

from collections import defaultdict

result = defaultdict(int)

# Summarize quantities for each name
for item in a.values():
    result[item['name']] += int(item['qty'])

# Convert to your funny format
b = {str(i+1): {'name': key, 'qty': str(val)}
     for i, (key, val) in enumerate(result.items())}

# b contains:
# {'1': {'name': 'Blue', 'qty': '3'}, '2': {'name': 'Green', 'qty': '1'}}
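For anyone who wants this as a reusable function, here is a minimal sketch of the same two steps. It assumes Python 3.7+, where dicts keep insertion order, so the numbering follows the order in which each name first appears. The function name `summarize_by_name` is made up for illustration and is not part of the answer above.

```python
from collections import defaultdict

def summarize_by_name(items):
    """Sum 'qty' per 'name', then renumber the entries '1', '2', ..."""
    totals = defaultdict(int)
    for item in items.values():
        totals[item['name']] += int(item['qty'])
    # Rebuild the string-indexed format, with 'qty' as a string again
    return {str(i): {'name': name, 'qty': str(qty)}
            for i, (name, qty) in enumerate(totals.items(), start=1)}

a = {'1': {'name': 'Blue', 'qty': '1'},
     '2': {'name': 'Green', 'qty': '1'},
     '3': {'name': 'Blue', 'qty': '1'},
     '4': {'name': 'Blue', 'qty': '1'}}

b = summarize_by_name(a)
```

Because the quantities are summed rather than counted, this should also behave sensibly when a 'qty' is something other than '1'.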

If I could choose data structures, it might look like this:

from functools import reduce  # built in on Python 2; this import is required on Python 3
from operator import add
from collections import Counter

a = [('Blue', 1), ('Green', 1), ('Blue', 1), ('Blue', 1)]
b = reduce(add, [Counter({name: qty}) for name, qty in a])
# b contains:
# Counter({'Blue': 3, 'Green': 1})
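A possibly simpler variation on the same idea is to skip `reduce` and add each quantity to a single `Counter` in a plain loop. This is only a sketch; the variable names `pairs` and `totals` are chosen for illustration.

```python
from collections import Counter

pairs = [('Blue', 1), ('Green', 1), ('Blue', 1), ('Blue', 1)]

totals = Counter()
for name, qty in pairs:
    totals[name] += qty  # names not seen yet start at 0
```

This avoids building one temporary `Counter` per item, which may matter for long lists.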

edited Jan 20 '15 at 12:42

answered Jan 20 '15 at 12:09


This answers the question! Even with the weird indexes. Thank you André! – Saulo Mendes Jan 20 '15 at 12:16
I hope you can still help. Each key can have values other than 'name' and 'qty', and an entry should count as a duplicate only if all of its values are equal. Turn this:

a = {'1': {'name': 'Blue', 'qty': '1', 'sub': ['sky', 'ethernet cable']},
     '2': {'name': 'Blue', 'qty': '1', 'sub': ['sky', 'ethernet cable']},
     '3': {'name': 'Green', 'qty': '1', 'sub': []},
     '4': {'name': 'Blue', 'qty': '1', 'sub': ['sea']}}

into this:

b = {'1': {'name': 'Blue', 'qty': '2', 'sub': ['sky', 'ethernet cable']},
     '2': {'name': 'Green', 'qty': '1', 'sub': []},
     '3': {'name': 'Blue', 'qty': '1', 'sub': ['sea']}}

– Saulo Mendes Jan 20 '15 at 13:16
I'll get back to you on that :-) – André Laszlo Jan 20 '15 at 13:17
It gets a little bit more complicated since you want to do a deep comparison. Do you mind posting a new question that expands on your exact needs? – André Laszlo Jan 20 '15 at 15:04
New question posted: http://stackoverflow.com/questions/28049316/exclude-repeated-values-from-a-dictionary-and-increment-the-qty-field-accordin – Saulo Mendes Jan 20 '15 at 15:38

FuzzyDuck · Answer 2 · 2015-01-20T14:56:47.317

2

A cumbersome two-liner (it counts the entries per name, so it assumes every 'qty' is '1'):

data = [v['name'] for v in a.values()]

b = {str(i+1): {'name': j, 'qty': str(data.count(j))} for i, j in enumerate(set(data))}

Following comments from André and the original poster, here is an even more complicated solution.

First, join each entry's 'name' and 'sub' values into a single comma-delimited string. Strings are hashable, so we can then use set() to drop the duplicates:

data = [','.join([v['name']]+v['sub']) for v in a.values()]

This returns (element order may vary):

['Blue,sky,ethernet cable', 'Green', 'Blue,sky,ethernet cable', 'Blue,sea']

Then use the nested dict and list comprehensions as below:

b = {str(i+1): {'name': j.split(',')[0],
                'qty': sum([int(qty['qty']) for qty in a.values()
                            if (qty['name']==j.split(',')[0]) and (qty['sub']==j.split(',')[1:])]),
                'sub': j.split(',')[1:]}
     for i, j in enumerate(set(data))}
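An alternative sketch that avoids the comma-joined string (which would break if a name or sub-item itself contained a comma) is to group on a tuple key instead. It assumes Python 3.7+ for insertion-ordered dicts, treats a missing 'sub' as an empty list, and uses a made-up function name, `merge_duplicates`.

```python
def merge_duplicates(items):
    """Merge entries whose 'name' and 'sub' both match, summing their 'qty'."""
    totals = {}  # (name, tuple of sub-items) -> summed quantity
    for item in items.values():
        # Lists are not hashable, so the 'sub' list is turned into a tuple
        key = (item['name'], tuple(item.get('sub', [])))
        totals[key] = totals.get(key, 0) + int(item['qty'])
    return {str(i): {'name': name, 'qty': str(qty), 'sub': list(sub)}
            for i, ((name, sub), qty) in enumerate(totals.items(), start=1)}

a = {'1': {'name': 'Blue', 'qty': '1', 'sub': ['sky', 'ethernet cable']},
     '2': {'name': 'Blue', 'qty': '1', 'sub': ['sky', 'ethernet cable']},
     '3': {'name': 'Green', 'qty': '1', 'sub': []},
     '4': {'name': 'Blue', 'qty': '1', 'sub': ['sea']}}

b = merge_duplicates(a)
```

Note that two 'sub' lists with the same items in a different order are treated as different here; sorting the list before building the tuple would be one way to change that.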

Hope this helps.

edited Jan 20 '15 at 14:56

answered Jan 20 '15 at 12:16

Nice! But what if `'qty'` is not `'1'`? – André Laszlo Jan 20 '15 at 12:28
Good point! Even more cumbersome, but this seems to work: b = {str(i+1): {'name': j, 'qty': sum([int(qty['qty']) for qty in a.values() if qty['name']==j])} for i, j in enumerate(set(data))} – FuzzyDuck Jan 20 '15 at 12:38
Ridiculous number of edits - trying to work out to get the mini-Markdown code formatting to work! :-) – FuzzyDuck Jan 20 '15 at 12:41
Funny, I've been coding Python for years but I have managed to miss dict comprehensions somehow. Thanks! Updated my answer too. – André Laszlo Jan 20 '15 at 12:44
Thank you FuzzyDuck, it works perfectly. Now I actually forgot to say that everything under a key should be equal, not just the 'name'. The actual dictionary can have a 'sub' key which contains a list of items. These items should also be considered for defining if a key has duplicates. If you will, please see the comment on André's answer. Thanks again :) – Saulo Mendes Jan 20 '15 at 13:57
You're welcome! I would add to Andre's point that the data structure is not ideal, hence the fact that the dict comprehensions are getting ever more unwieldy. I've modified my answer above with something that seems to work. – FuzzyDuck Jan 20 '15 at 14:50
