Create a dictionary in a list of dictionaries

How do I group this list of dicts by dept (and, within each dept, by id)? I tried to implement the answer from this link but had no luck. I would appreciate help.

Here's the list of dictionaries I have:

[
{'date':'2020-02-02','id' : '1','dept': '20020','CNT' : '1','rep_level' : 'form1'},
{'date':'2020-02-02','id' : '1','dept': '20020','CNT' : '0','rep_level' : 'form2'},
{'date':'2020-02-02','id' : '1','dept': '20020','CNT' : '4','rep_level' : 'form3'},
{'date':'2020-02-02','id' : '2','dept': '20020','CNT' : '9','rep_level' : 'all'},
{'date':'2020-02-02','id' : '3','dept': '20021','CNT' : '14','rep_level' : 'all'},
{'date':'2020-02-02','id' : '1','dept': '20022','CNT' : '5','rep_level' : 'form1'},
{'date':'2020-02-02','id' : '1','dept': '20022','CNT' : '2','rep_level' : 'form2'},
{'date':'2020-02-02','id' : '1','dept': '20022','CNT' : '3','rep_level' : 'form3'}
]

Desired answer format:

[
{"dept":"20020", "date":"2020-02-02", "answers":[{"id":"1", "answerValue":[1,0,4]},{"id":"2", "answerValue":9}]},
{"dept":"20021", "date":"2020-02-02", "answers":[{"id":"3", "answerValue":14}]},
{"dept":"20022", "date":"2020-02-02", "answers":[{"id":"1", "answerValue":[5,2,3]}]}
]

Thanks,

Raj

1 Answer

The solution provided in the answer you linked is correct, but you have to put it all together in a specific way to get the result you're after:

from itertools import groupby

data = [
    {'date': '2020-02-02', 'id': '1', 'dept': '20020', 'CNT': '1', 'rep_level': 'form1'},
    {'date': '2020-02-02', 'id': '1', 'dept': '20020', 'CNT': '0', 'rep_level': 'form2'},
    {'date': '2020-02-02', 'id': '1', 'dept': '20020', 'CNT': '4', 'rep_level': 'form3'},
    {'date': '2020-02-02', 'id': '2', 'dept': '20020', 'CNT': '9', 'rep_level': 'all'},
    {'date': '2020-02-02', 'id': '3', 'dept': '20021', 'CNT': '14', 'rep_level': 'all'},
    {'date': '2020-02-02', 'id': '1', 'dept': '20022', 'CNT': '5', 'rep_level': 'form1'},
    {'date': '2020-02-02', 'id': '1', 'dept': '20022', 'CNT': '2', 'rep_level': 'form2'},
    {'date': '2020-02-02', 'id': '1', 'dept': '20022', 'CNT': '3', 'rep_level': 'form3'}
]

result = [{
    'dept': dept,
    'answers': [{
        'id': identifier,
        'answerValue': [int(a['CNT']) for a in answers]
    } for identifier, answers in groupby(results, key=lambda x: x['id'])]
} for dept, results in groupby(data, key=lambda x: x['dept'])]

On the inside, there's:

        'answerValue': [int(a['CNT']) for a in answers]

This list comprehension converts the string value of 'CNT' for each entry in answers to an integer and collects the results in a list.

The name answers comes from the expression around it:

    'answers': [{
        'id': identifier,
        'answerValue': [int(a['CNT']) for a in answers]
    } for identifier, answers in groupby(results, key=lambda x: x['id'])]

This is another list comprehension. It calls groupby() to group results on the 'id' field and creates one dictionary for each identifier and the answers that belong to it.

The name results, in turn, comes from the outer comprehension:

result = [{
    'dept': dept,
    'answers': [{
        'id': identifier,
        'answerValue': [int(a['CNT']) for a in answers]
    } for identifier, answers in groupby(results, key=lambda x: x['id'])]
} for dept, results in groupby(data, key=lambda x: x['dept'])]

This works the same way as the previous one: it groups the original data by 'dept' and creates one dictionary for each department, containing the results grouped for it.

If you print(result):

[{'dept': '20020', 'answers': [{'id': '1', 'answerValue': [1, 0, 4]}, {'id': '2', 'answerValue': [9]}]}, {'dept': '20021', 'answers': [{'id': '3', 'answerValue': [14]}]}, {'dept': '20022', 'answers': [{'id': '1', 'answerValue': [5, 2, 3]}]}]

This is the result you were after, except that a single value is still wrapped in a list. You could of course add the date if you wanted to, but you indicated this is always the same anyway.
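If you do want the date in the output, one possible sketch is to group on the pair (dept, date) instead of on dept alone. The name result_with_date and the use of operator.itemgetter are my own choices for illustration, and the data below is only a subset of the rows from the question, assumed to be ordered by dept as in the original:

```python
from itertools import groupby
from operator import itemgetter

# A subset of the question's rows, already ordered by dept and id.
data = [
    {'date': '2020-02-02', 'id': '1', 'dept': '20020', 'CNT': '1', 'rep_level': 'form1'},
    {'date': '2020-02-02', 'id': '1', 'dept': '20020', 'CNT': '0', 'rep_level': 'form2'},
    {'date': '2020-02-02', 'id': '2', 'dept': '20020', 'CNT': '9', 'rep_level': 'all'},
    {'date': '2020-02-02', 'id': '3', 'dept': '20021', 'CNT': '14', 'rep_level': 'all'},
]

# itemgetter with two keys returns a (dept, date) tuple for each row.
dept_and_date = itemgetter('dept', 'date')

result_with_date = [{
    'dept': dept,
    'date': date,
    'answers': [{
        'id': identifier,
        'answerValue': [int(a['CNT']) for a in answers]
    } for identifier, answers in groupby(results, key=itemgetter('id'))]
} for (dept, date), results in groupby(data, key=dept_and_date)]

print(result_with_date[0])
# {'dept': '20020', 'date': '2020-02-02', 'answers': [{'id': '1', 'answerValue': [1, 0]}, {'id': '2', 'answerValue': [9]}]}
```

If a department could appear with more than one date, this would produce one dictionary per (dept, date) combination rather than one per department.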

Note: personally, I think this is a more useful way of doing something similar:

result = {
    dept: {
        identifier: [int(a['CNT']) for a in answers]
        for identifier, answers in groupby(results, key=lambda x: x['id'])
    }
    for dept, results in groupby(data, key=lambda x: x['dept'])
}

This gets you (when printed):

{'20020': {'1': [1, 0, 4], '2': [9]}, '20021': {'3': [14]}, '20022': {'1': [5, 2, 3]}}

And you could access that like this:

print(result['20020']['2'])  # prints "[9]"
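One caveat that applies to both versions: groupby() only groups rows that are next to each other, so it relies on the data already being ordered by 'dept' and 'id', as it is in your example. If that is not guaranteed, sorting first avoids split groups. A minimal sketch, where the rows and the helper names group and by_dept_and_id are made up for illustration:

```python
from itertools import groupby

# Hypothetical rows, deliberately out of order by 'dept'.
rows = [
    {'dept': '20020', 'id': '1', 'CNT': '1'},
    {'dept': '20021', 'id': '3', 'CNT': '14'},
    {'dept': '20020', 'id': '1', 'CNT': '0'},
]

def by_dept_and_id(row):
    # Sort key matching the two levels of grouping.
    return row['dept'], row['id']

def group(rows):
    # Same nested comprehension as the dict version in the answer.
    return {
        dept: {
            identifier: [int(a['CNT']) for a in answers]
            for identifier, answers in groupby(results, key=lambda x: x['id'])
        }
        for dept, results in groupby(rows, key=lambda x: x['dept'])
    }

unsorted_result = group(rows)
sorted_result = group(sorted(rows, key=by_dept_and_id))

print(unsorted_result)  # {'20020': {'1': [0]}, '20021': {'3': [14]}}
print(sorted_result)    # {'20020': {'1': [1, 0]}, '20021': {'3': [14]}}
```

In the unsorted case the second run of '20020' silently overwrites the first, which is why the value 1 is lost.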
Grismar
  • Thanks for the answer. For a single value, I don't want the integer to be in an array. E.g., I want {'20020': {'1': [1, 0, 4], '2': 9}} instead of {'20020': {'1': [1, 0, 4], '2': [9]}}. How do I do that? – Raj Feb 08 '21 at 05:43
  • I'd recommend against doing that, since you'll end up writing extra code that has to check whether it found a list or a single integer as a value. As a programmer, don't look at data like this as a human; look at it like a computer. In this case it'd also be tricky, because the object returned by `groupby` doesn't make it easy to see how many items are in it, since it is an iterator on the inside. The easiest approach would be to replace the single elements after the fact, but that's a lot of extra code to make the result arguably worse. – Grismar Feb 08 '21 at 06:04
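For completeness, replacing the single elements after the fact could look roughly like the sketch below. It starts from the printed dict result shown in the answer; the name unwrapped is only illustrative, and as the comment above says, every consumer of this data then has to handle both lists and bare integers:

```python
# The dict-of-dicts result as printed in the answer.
result = {'20020': {'1': [1, 0, 4], '2': [9]}, '20021': {'3': [14]}, '20022': {'1': [5, 2, 3]}}

# Replace each one-element list by its only element; leave longer lists alone.
unwrapped = {
    dept: {
        identifier: values[0] if len(values) == 1 else values
        for identifier, values in answers.items()
    }
    for dept, answers in result.items()
}

print(unwrapped)
# {'20020': {'1': [1, 0, 4], '2': 9}, '20021': {'3': 14}, '20022': {'1': [5, 2, 3]}}
```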