Regroup or reorganize keys in a dict?

Question

I have a list of dicts that currently looks like this:

[ {'name': 'Joe', 
   'score': 98,
   'subject': 'Math'},
{'name': 'Bob', 
   'score': 90,
   'subject': 'Math'},
{'name': 'Bill', 
   'score': 88,
   'subject': 'English'},
{'name': 'Jane', 
   'score': 95,
   'subject': 'English'}]

I would like to regroup or reorganize it as follows:

[ {'subject': 'Math',
  'Results': [{'name': 'Joe','score':98}, {'name':'Bob', 'score':90}]},
  {'subject': 'English',
  'Results': [{'name': 'Jane','score':95}, {'name':'Bill', 'score':88}]}
]

I tried using itertools.groupby and dict.setdefault() as suggested here, but cannot quite get what I want. How can I do this?

"cannot quite get what i want" is not very useful in a question. Provide a [MCVE] for what you tried, with example input and expected output (which can be what you provided here) and observed (unexpected) output, and we can provide assistance fixing it. In general "write my code for me" questions are frowned upon. — ShadowRanger, Jan 24 '18 at 02:42

Stephen Rauch · Answer 1 · 2018-01-24T03:01:30.147

With a small loop and dict.setdefault you can do the grouping like this:

Code:

grouped = {}
for score in scores:
    grouped.setdefault(score['subject'], []).append(
        {k: v for k, v in score.items() if k != 'subject'})

To get the other output format after grouping:

grouped = [{'subject': k, 'Results': v} for k, v in grouped.items()]

Test Code:

scores = [
    {'name': 'Joe',
       'score': 98,
       'subject': 'Math'},
    {'name': 'Bob',
       'score': 90,
       'subject': 'Math'},
    {'name': 'Bill',
       'score': 88,
       'subject': 'English'},
    {'name': 'Jane',
       'score': 95,
       'subject': 'English'}]

grouped = {}
for score in scores:
    grouped.setdefault(score['subject'], []).append({
        k: v for k, v in score.items() if k != 'subject'})

print([{'subject': k, 'Results': v} for k, v in grouped.items()])

Results:

[
    {'subject': 'Math', 
     'Results': [{'name': 'Joe', 'score': 98}, {'name': 'Bob', 'score': 90}]}, 
    {'subject': 'English', 
     'Results': [{'name': 'Bill', 'score': 88}, {'name': 'Jane', 'score': 95}]}
]

Note: If performance matters, `setdefault` with a non-constant-literal default can get kinda wasteful on large inputs (in this case it has to create the default new empty `list`s on every call, whether they're needed or not). Making `grouped = collections.defaultdict(list)`, then doing `grouped[score['subject']].append(...)` would be faster/cleaner (`defaultdict` lazily creates the default value only when the key requested doesn't exist); if you wanted to remove the defaulting behavior after, just do `grouped = dict(grouped)` at the end to convert back. — ShadowRanger, Jan 24 '18 at 02:52
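To make the comment above concrete, here is a small sketch that counts how often the default list actually gets built. The helper make_list and the calls counter are hypothetical names introduced only for this demonstration; they are not part of the answer.

```python
from collections import defaultdict

# Hypothetical counter: how many times each approach builds a default list.
calls = {"setdefault": 0, "defaultdict": 0}

def make_list(tag):
    """Build a new empty list and record that it was built."""
    calls[tag] += 1
    return []

subjects = ["Math", "Math", "English", "English"]

# setdefault: the default argument is evaluated on every call,
# even when the key already exists.
grouped = {}
for s in subjects:
    grouped.setdefault(s, make_list("setdefault")).append(s)

# defaultdict: the factory runs only when a key is missing.
lazy = defaultdict(lambda: make_list("defaultdict"))
for s in subjects:
    lazy[s].append(s)

print(calls)       # {'setdefault': 4, 'defaultdict': 2}
print(dict(lazy))  # convert back to a plain dict to drop the defaulting behavior
```

Both approaches produce the same grouping; only the number of throwaway lists differs.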

score 2 · Answer 2 · answered Jan 24 '18 at 03:02

Take a look at itertools.groupby; the following code may help. It assumes import itertools and from operator import itemgetter, and that a is the input list:

[{'subject': k, 'Results': list(g)} for k, g in itertools.groupby(a, key=itemgetter('subject'))]

Sample Output:

[{'Results': [{'score': 98, 'name': 'Joe', 'subject': 'Math'}, {'score': 90, 'name': 'Bob', 'subject': 'Math'}], 'subject': 'Math'}, {'Results': [{'score': 88, 'name': 'Bill', 'subject': 'English'}, {'score': 95, 'name': 'Jane', 'subject': 'English'}], 'subject': 'English'}]

This is close to the desired result. Extra key items still appear in the resulting dicts. — pylang, Jan 24 '18 at 05:10
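One way the groupby line could be adapted to drop the extra key is sketched below. It assumes the input list is named a, as in the answer, and it sorts by subject first because itertools.groupby only groups consecutive items. The names by_subject and result are chosen here for illustration.

```python
from itertools import groupby
from operator import itemgetter

a = [
    {'name': 'Joe', 'score': 98, 'subject': 'Math'},
    {'name': 'Bob', 'score': 90, 'subject': 'Math'},
    {'name': 'Bill', 'score': 88, 'subject': 'English'},
    {'name': 'Jane', 'score': 95, 'subject': 'English'},
]

by_subject = itemgetter('subject')

result = [
    {
        'subject': k,
        # Rebuild each dict without the grouping key.
        'Results': [{f: v for f, v in d.items() if f != 'subject'} for d in g],
    }
    # sorted() is stable, so rows keep their original order within a subject.
    for k, g in groupby(sorted(a, key=by_subject), key=by_subject)
]

print(result)
```

Because of the sort, the groups come out in alphabetical order of subject (English before Math) rather than in order of first appearance.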

score 0 · Answer 3 · answered Jan 24 '18 at 02:57

You will need to iterate through the old list and reformat each element into the new one:

# First, create newList in the general format that you want.
newList = [{'subject': 'Math', 'Results': []}, {'subject': 'English', 'Results': []}]

# Then iterate through the old list and put each element into the new list with the new formatting.
for i in oldList:
    # Index 0 is Math and index 1 is English, because that is the order used in your post.
    element = 0 if i['subject'] == 'Math' else 1
    # Append the reformatted element to the matching Results list.
    newList[element]['Results'].append({'name': i['name'], 'score': i['score']})

score 0 · Answer 4 · answered Jan 24 '18 at 02:58

I like this kind of syntax when dealing with custom objects derived from some dictionary data:

o = [ {'name': 'Joe', 
   'score': 98,
   'subject': 'Math'},
{'name': 'Bob', 
   'score': 90,
   'subject': 'Math'},
{'name': 'Bill', 
   'score': 88,
   'subject': 'English'},
{'name': 'Jane', 
   'score': 95,
   'subject': 'English'}]

r = []
for a in set([b['subject'] for b in o]):
  r.append({
      'subject': a, 
      'Results': [{'name':c['name'], 'score':c['score']} for c in o if c['subject']==a ],
  })

print(r)

Working code: repl.it
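Note that a set does not keep insertion order, so the subjects in r may come out in any order. If the order of first appearance matters, a possible variation (a sketch, assuming Python 3.7+ where dicts preserve insertion order) is to deduplicate with dict.fromkeys instead. The name subjects is introduced here for illustration.

```python
o = [
    {'name': 'Joe', 'score': 98, 'subject': 'Math'},
    {'name': 'Bob', 'score': 90, 'subject': 'Math'},
    {'name': 'Bill', 'score': 88, 'subject': 'English'},
    {'name': 'Jane', 'score': 95, 'subject': 'English'},
]

# dict.fromkeys drops duplicates while keeping first-seen order.
subjects = list(dict.fromkeys(d['subject'] for d in o))

r = [
    {
        'subject': s,
        'Results': [{'name': d['name'], 'score': d['score']}
                    for d in o if d['subject'] == s],
    }
    for s in subjects
]

print(r)
```

Like the original loop, this rescans the whole list once per subject, which is fine for small inputs but slower than a single-pass grouping when there are many subjects.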

score 0 · Answer 5 · answered Jan 24 '18 at 03:32

If you want to use a collections.defaultdict(), you can do this:

from collections import defaultdict
from pprint import pprint

scores = [{'name': 'Joe', 
           'score': 98,
           'subject': 'Math'},
          {'name': 'Bob', 
           'score': 90,
           'subject': 'Math'},
          {'name': 'Bill', 
           'score': 88,
           'subject': 'English'},
          {'name': 'Jane', 
           'score': 95,
           'subject': 'English'}]

result = defaultdict(list)
for score in scores:
    temp = {k: v for k, v in score.items() if k != 'subject'}
    result[score['subject']].append(temp)

pprint([{'subject' : k, 'Results': v} for k, v in result.items()])

Which gives:

[{'Results': [{'name': 'Joe', 'score': 98}, {'name': 'Bob', 'score': 90}],
  'subject': 'Math'},
 {'Results': [{'name': 'Bill', 'score': 88}, {'name': 'Jane', 'score': 95}],
  'subject': 'English'}]

pylang · Answer 6 · 2018-01-24T05:40:14.103

Option 1

Here is a standard itertools.groupby approach:

import itertools as it

key = "subject"
[{key: k, "Result": [{k_: v for k_, v in d.items() if k_ != key} for d in g]} for k, g in it.groupby(lst, lambda x: x[key])]

This starts from the general form [(k, list(g)) for k, g in itertools.groupby(iterable, key)] and replaces list(g) with a list of filtered dictionary comprehensions, one per dict in the group. lst is the input list of dicts, and it must already be ordered by key, because groupby only groups consecutive items.

Option 2

more_itertools.groupby_transform is a third-party recipe that extends itertools.groupby to allow changes to the resulting groups:

import copy

import more_itertools as mit


def get_scores(iterable, key):
    """Return resulting dictionaries grouped by key."""
    iterable = copy.deepcopy(iterable)                            # optional
    kfunc = lambda x: x[key]
    def vfunc(x):
        del x[key]
        return x
    return [{key: k, "Result": list(g)} for k, g in mit.groupby_transform(iterable, keyfunc=kfunc, valuefunc=vfunc)]


get_scores(lst, "subject")

Here the grouping key is deleted from each dict in the resulting groups. Deleting items mutates the nested dictionaries of the input. To leave the original dicts untouched, work on deep copies, e.g. see the optional line.
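The difference between deleting in place and building a filtered copy can be seen in a small standard-library sketch. The two helper functions below are hypothetical names for this illustration and are not part of more_itertools.

```python
def strip_key_in_place(d, key):
    """Remove key from d itself and return the same dict object."""
    del d[key]
    return d

def strip_key_copy(d, key):
    """Return a new dict without key; d is left unchanged."""
    return {k: v for k, v in d.items() if k != key}

original = {'name': 'Joe', 'score': 98, 'subject': 'Math'}

copied = strip_key_copy(original, 'subject')
print(copied)                 # {'name': 'Joe', 'score': 98}
print('subject' in original)  # True: the input is untouched

mutated = strip_key_in_place(original, 'subject')
print(mutated is original)    # True: same object
print('subject' in original)  # False: the input was modified
```

For flat dicts like these, a filtered copy per row avoids mutation without copying the whole input up front; a deep copy matters mainly when the rows contain nested mutable values that would also be modified.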

score 0 · Answer 7 · answered Jan 25 '18 at 05:03

In one line you can do something like this:

data=[ {'name': 'Joe',
   'score': 98,
   'subject': 'Math'},
{'name': 'Bob',
   'score': 90,
   'subject': 'Math'},
{'name': 'Bill',
   'score': 88,
   'subject': 'English'},
{'name': 'Jane',
   'score': 95,
   'subject': 'English'}]

import itertools

print({i:list(j) for i,j in itertools.groupby(data,key=lambda x:x['subject'])})

output:

{'English': [{'subject': 'English', 'score': 88, 'name': 'Bill'}, {'subject': 'English', 'score': 95, 'name': 'Jane'}], 'Math': [{'subject': 'Math', 'score': 98, 'name': 'Joe'}, {'subject': 'Math', 'score': 90, 'name': 'Bob'}]}
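This one-liner works here because the rows for each subject are already adjacent. itertools.groupby starts a new group whenever the key changes, so with interleaved input a dict comprehension silently keeps only the last run for each key. The sketch below uses a reordered sample (an assumption for illustration, not the data from the question) to show the effect and the usual fix of sorting first.

```python
import itertools
from operator import itemgetter

# Hypothetical interleaved input: the two Math rows are not adjacent.
data = [
    {'name': 'Joe', 'score': 98, 'subject': 'Math'},
    {'name': 'Bill', 'score': 88, 'subject': 'English'},
    {'name': 'Bob', 'score': 90, 'subject': 'Math'},
]

by_subject = itemgetter('subject')

# Without sorting, groupby yields Math, English, Math; the second Math
# run overwrites the first in the dict.
unsorted_groups = {k: list(g) for k, g in itertools.groupby(data, key=by_subject)}

# Sorting by the same key makes equal subjects adjacent.
sorted_groups = {
    k: list(g)
    for k, g in itertools.groupby(sorted(data, key=by_subject), key=by_subject)
}

print(len(unsorted_groups['Math']))  # 1 (Joe was lost)
print(len(sorted_groups['Math']))    # 2
```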
