Python - Group a list of dicts by key value, count separate key value as dict?

Question

I'm trying to group a list of dicts by the value of the name key, so that each name becomes a key in the result. The value for each name should be another dict that maps each source value to the number of times it occurs for that name.

data = [
{'name':'Gill', 'source':'foo'},
{'name':'Gill', 'source':'foo'},
{'name':'Gill', 'source':'foo'},
{'name':'Gill', 'source':'bar'},
{'name':'Gill', 'source':'bar'},
{'name':'Gill', 'source':'bar'},
{'name':'Gill', 'source':'bar'},
{'name':'Gill', 'source':'bar'},
{'name':'Dave', 'source':'foo'},
{'name':'Dave', 'source':'foo'},
{'name':'Dave', 'source':'foo'},
{'name':'Dave', 'source':'foo'},
{'name':'Dave', 'source':'egg'},
{'name':'Dave', 'source':'egg'},
{'name':'Dave', 'source':'egg'},
{'name':'Dave', 'source':'egg'},
{'name':'Dave', 'source':'egg'},
{'name':'Dave', 'source':'egg'},
{'name':'Dave', 'source':'egg'}
]

How do I achieve the below output?

{'Gill': {'foo':3, 'bar':5}, 'Dave': {'foo':4, 'egg':7}}

I think it may be possible with a one-liner...

Have you tried searching this site first? – fukanchik Sep 21 '17 at 15:55

score 12 · Accepted Answer · edited Jan 17 '19 at 02:26

Use itertools.groupby to group by names, then collections.Counter to count the source categories belonging to each name:

from collections import Counter
from itertools import groupby

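# Key function: group records by their 'name' value.
# groupby only merges consecutive items, so data must already be ordered by name.
f = lambda x: x['name']
dct = {k: Counter(d['source'] for d in g) for k, g in groupby(data, f)}
print(dct)
# {'Gill': Counter({'bar': 5, 'foo': 3}), 'Dave': Counter({'egg': 7, 'foo': 4})}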
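
If the records might not already be ordered by name, or a plain dict is wanted to match the output in the question exactly, one possible variation is to sort first and convert each Counter with dict(). This is a sketch rather than part of the original answer; the shortened data list and the name by_name are made up for illustration, and the printed order assumes Python 3.7+.

```python
from collections import Counter
from itertools import groupby
from operator import itemgetter

# Shortened, deliberately unordered sample (illustrative, not the question's full list).
data = [
    {'name': 'Gill', 'source': 'foo'},
    {'name': 'Dave', 'source': 'egg'},
    {'name': 'Gill', 'source': 'bar'},
    {'name': 'Dave', 'source': 'egg'},
    {'name': 'Gill', 'source': 'bar'},
]

by_name = itemgetter('name')

# Sort so that equal names are adjacent, which groupby requires,
# then turn each Counter into a plain dict.
result = {
    name: dict(Counter(d['source'] for d in group))
    for name, group in groupby(sorted(data, key=by_name), key=by_name)
}
print(result)
# {'Dave': {'egg': 2}, 'Gill': {'foo': 1, 'bar': 2}}
```

Sorting costs O(n log n); for large inputs a single-pass approach without groupby may be preferable.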

edited Jan 17 '19 at 02:26 by Selcuk
answered Sep 21 '17 at 15:53 by Moses Koledoye

Of course, this assumes that the data is sorted by the `'name'` key. – vaultah Sep 21 '17 at 16:04
Thank you, this is great. However, my actual dataset has many more keys than 'name' and 'source', which I didn't mention here because I thought it would be fine; I may need to strip it down to just those two. groupby(data, f) seems to have problems with it. Is there a way to make this work if a third key were introduced, while disregarding that key? (I am being picky.) – Slopax Sep 21 '17 at 16:10
@Slopax I don't see how a third key would create a problem if you don't actually need it. – Moses Koledoye Sep 21 '17 at 16:26
@MosesKoledoye I am mistaken, having just 2 keys as shown here in my example seems to produce different results, very strange. That's a headache for tomorrow! – Slopax Sep 21 '17 at 16:28
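
To illustrate the sorting caveat raised in the comments: when records with the same name are not adjacent, groupby starts a new group each time the name changes, and the dict comprehension keeps only the last group per name. Whether this is what caused the differing results mentioned above is not known; it is only one possible explanation. The sketch below uses a made-up interleaved sample to show the effect, followed by an order-independent alternative built on defaultdict(Counter), which is a swapped-in technique and not the answer's method.

```python
from collections import Counter, defaultdict
from itertools import groupby

# Made-up sample in which the names are interleaved rather than sorted.
data = [
    {'name': 'Gill', 'source': 'foo'},
    {'name': 'Dave', 'source': 'foo'},
    {'name': 'Gill', 'source': 'bar'},
    {'name': 'Gill', 'source': 'bar'},
]

# groupby yields Gill, Dave, Gill as three separate groups; the second
# 'Gill' group overwrites the first in the dict comprehension.
grouped = {k: Counter(d['source'] for d in g)
           for k, g in groupby(data, lambda x: x['name'])}
print(grouped)
# {'Gill': Counter({'bar': 2}), 'Dave': Counter({'foo': 1})}

# Order-independent alternative: one pass, one Counter per name.
counts = defaultdict(Counter)
for record in data:
    counts[record['name']][record['source']] += 1

plain = {name: dict(c) for name, c in counts.items()}
print(plain)
# {'Gill': {'foo': 1, 'bar': 2}, 'Dave': {'foo': 1}}
```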

score 1 · Answer 2 · answered Sep 21 '17 at 17:35

This is obviously not a one-liner, but it is simple and straightforward. It counts the values of every key other than name, so it works for any number of additional keys.

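results = {}
key = 'name'  # key to group by
for line in data:
    tracked_key = line[key]
    # Create the inner dict the first time this name is seen.
    results.setdefault(tracked_key, {})
    for k, v in line.items():  # items() on Python 3 (iteritems() was Python 2 only)
        if k == key:
            continue  # skip the grouping key itself
        # Count each value of the remaining keys.
        results[tracked_key].setdefault(v, 0)
        results[tracked_key][v] += 1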
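
Because the loop above counts the values of every key other than name, records with extra fields would have those values counted as well. If only one field should be counted, a possible variation restricts the count to that key. The function name count_by, its parameters, and the 'age' field in the sample are hypothetical, introduced only for illustration.

```python
def count_by(records, group_key, count_key):
    """Map each group_key value to a dict of count_key value -> occurrences."""
    results = {}
    for record in records:
        # Fetch (or create) the inner dict for this group.
        inner = results.setdefault(record[group_key], {})
        value = record[count_key]
        inner[value] = inner.get(value, 0) + 1
    return results


# Made-up records with an extra 'age' field that should be ignored.
sample = [
    {'name': 'Gill', 'source': 'foo', 'age': 30},
    {'name': 'Gill', 'source': 'bar', 'age': 30},
    {'name': 'Gill', 'source': 'bar', 'age': 30},
    {'name': 'Dave', 'source': 'egg', 'age': 41},
]

print(count_by(sample, 'name', 'source'))
# {'Gill': {'foo': 1, 'bar': 2}, 'Dave': {'egg': 1}}
```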
