
I have a nested dictionary that looks like this:

{
1: {'Name': {'John'}, 'Answer': {'yes'}, 'Country': {'USA'}},
2: {'Name': {'Julia'}, 'Answer': {'no'}, 'Country': {'Hong Kong'}},
3: {'Name': {'Adam'}, 'Answer': {'yes'}, 'Country': {'Hong Kong'}}
}

I now need to count how often each country occurs and how many people from each country answered yes or no. Currently, I only collect the number of occurrences of each country:

nationalities = ['USA', 'Hong Kong', 'France' ...]
result = []
for countries in nationalities:
    # flatten every value set in every entry, then count matches for this country
    cnt = [item for l in [v2 for v1 in dictionary1.values() for v2 in v1.values()] for item in l].count(countries)
    result.append(countries + ': ' + str(cnt))

So, using my dataset, I get something like:

['Hong Kong: 2', 'France: 2', 'Italy: 3']

However, I would also like to get the number of people who answered yes and the number who answered no, so that I get a list in the form of ['Hong Kong: 2 1 1'], where the first number is the total and the second and third are the yes and no counts respectively.

Thanks for any help

Sam333
  • `result = Counter(chain.from_iterable(map(itemgetter('Country'), dictionary1.values())))` or `result = Counter(i for v in dictionary1.values() for i in v['Country'])` Imports: [`Counter`](https://docs.python.org/3/library/collections.html#collections.Counter), [`itemgetter()`](https://docs.python.org/3/library/operator.html#operator.itemgetter), [`chain.from_iterable()`](https://docs.python.org/3/library/itertools.html#itertools.chain.from_iterable). – Olvin Roght Jan 03 '21 at 23:25

3 Answers


Here's a possible solution using a defaultdict to generate a dictionary of results by summing how many answers equal either yes or no for each country:

from collections import defaultdict

dictionary1 = {
1: {'Name': {'John'}, 'Answer': {'yes'}, 'Country': {'USA'}}, 
2: {'Name': {'Julia'}, 'Answer': {'no'}, 'Country': {'Hong Kong'}},
3: {'Name': {'Adam'}, 'Answer': {'yes'}, 'Country': {'Hong Kong'}}
}

nationalities = ['USA', 'Hong Kong', 'France']
result = defaultdict(list)
for countries in nationalities:
    [yes, no] = [sum(list(d['Answer'])[0] == answer and list(d['Country'])[0] == countries for d in dictionary1.values()) for answer in ['yes', 'no']]
    result[countries] = [ yes+no, yes, no ]
    
print(dict(result))

For your sample data, this gives

{
 'USA': [1, 1, 0],
 'Hong Kong': [2, 1, 1],
 'France': [0, 0, 0]
}

You can then convert that into a list of strings by

result = [ f"{key}: {' '.join(map(str, counts))}" for key, counts in result.items()]

which gives:

['USA: 1 1 0', 'Hong Kong: 2 1 1', 'France: 0 0 0']
Nick
  • If it's not too late to change, I would consider changing the values for each key to simple strings rather than the sets you currently have. It will simplify and speed up your code. – Nick Jan 03 '21 at 23:50
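To illustrate the suggestion in the comment above, here is a minimal sketch of how the counting could look if the values were plain strings instead of one-element sets. The names `people`, `pair_counts` and `summary` are made up for this example, and the use of `Counter` with (country, answer) pairs is just one possible approach, not the code from the answer.

```python
from collections import Counter

# Sample data from the question, with each one-element set
# replaced by a plain string (the change suggested above).
people = {
    1: {'Name': 'John', 'Answer': 'yes', 'Country': 'USA'},
    2: {'Name': 'Julia', 'Answer': 'no', 'Country': 'Hong Kong'},
    3: {'Name': 'Adam', 'Answer': 'yes', 'Country': 'Hong Kong'},
}
nationalities = ['USA', 'Hong Kong', 'France']

# One pass over the data: count each (country, answer) pair.
pair_counts = Counter((p['Country'], p['Answer']) for p in people.values())

summary = []
for country in nationalities:
    yes = pair_counts[(country, 'yes')]  # a Counter returns 0 for missing keys
    no = pair_counts[(country, 'no')]
    summary.append(f'{country}: {yes + no} {yes} {no}')

print(summary)
# ['USA: 1 1 0', 'Hong Kong: 2 1 1', 'France: 0 0 0']
```

With strings, the comparisons no longer need `list(...)[0]`, and the data is scanned once rather than once per country and answer.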
a = {
1: {'Name': {'John'}, 'Answer': {'yes'}, 'Country': {'USA'}},
2: {'Name': {'Julia'}, 'Answer': {'no'}, 'Country': {'Hong Kong'}},
3: {'Name': {'Adam'}, 'Answer': {'yes'}, 'Country': {'Hong Kong'}}
}
results = []
nationalities = ['USA', 'Hong Kong', 'France']
for country in nationalities:
    countryyes = 0
    countryno = 0
    for row in a.values():
        # str({'USA'}) is "{'USA'}", so [2:-2] strips the braces and quotes
        if str(row['Country'])[2:-2] == country:
            if str(row['Answer'])[2:-2] == 'yes':
                countryyes += 1
            if str(row['Answer'])[2:-2] == 'no':
                countryno += 1
    # format: "<country>: <total> <yes> <no>"
    results.append(country + ': ' + str(countryyes + countryno) + ' ' + str(countryyes) + ' ' + str(countryno))

I want to make a couple of notes. First, I changed countries to country (it's unusual to use a plural name for a loop variable like that). Second, in your code above the name, answer, and country are each stored in a set, and I think you would be better off storing each of them as a plain string.

AndrewGraham
  • Thank you so much! Exactly what I was looking for! I just now realized that I used the set for the values, (first time using dictionaries), but now all my code is adapted to that, so I might just stick with it. Thanks again! – Sam333 Jan 03 '21 at 23:46
  • It's madness to parse value from string representation of set, -1 – Olvin Roght Jan 03 '21 at 23:58
  • Olvin I'm unfamiliar with sets and couldn't quickly figure out how to convert {'USA'} to USA. I would appreciate if you would tell me how to do it without slicing. – AndrewGraham Jan 04 '21 at 00:29
  • @AndrewGraham, there're plenty of methods, shortest will be `country in row['Country']`. – Olvin Roght Jan 04 '21 at 00:38
  • Thanks anyway, Olvin, but I think we had a miscommunication. That doesn't remove the need for the slicing to get from {'USA'} to USA, unless I don't understand your solution. – AndrewGraham Jan 04 '21 at 00:47
  • @AndrewGraham, it'll be good to read official [python tutorial](https://docs.python.org/3/tutorial/) and [language reference](https://docs.python.org/3/reference/) before answering questions. You can find an explanation of expression I've provided in [6.10.2. Membership test operations](https://docs.python.org/3/reference/expressions.html#membership-test-operations) and [5.4. Sets](https://docs.python.org/3/tutorial/datastructures.html#sets). – Olvin Roght Jan 04 '21 at 08:37
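For readers following the comment thread, here is a rough sketch of how the membership test `country in row['Country']` mentioned in the comments could replace the string slicing. It assumes the set-valued data from the question; the names `rows`, `yes` and `no` are chosen only for this illustration.

```python
rows = {
    1: {'Name': {'John'}, 'Answer': {'yes'}, 'Country': {'USA'}},
    2: {'Name': {'Julia'}, 'Answer': {'no'}, 'Country': {'Hong Kong'}},
    3: {'Name': {'Adam'}, 'Answer': {'yes'}, 'Country': {'Hong Kong'}},
}
nationalities = ['USA', 'Hong Kong', 'France']

results = []
for country in nationalities:
    yes = no = 0
    for row in rows.values():
        # "x in some_set" checks membership directly, so the set never
        # has to be converted to a string and sliced.
        if country in row['Country']:
            if 'yes' in row['Answer']:
                yes += 1
            elif 'no' in row['Answer']:
                no += 1
    results.append(f'{country}: {yes + no} {yes} {no}')

print(results)
# ['USA: 1 1 0', 'Hong Kong: 2 1 1', 'France: 0 0 0']
```

The membership test never extracts the string from the set; it only asks whether the set contains it, which is all the comparison needs.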

I would use Counter to count answers and groupby() to group entries by country:

from collections import Counter
from itertools import groupby

dictionary1 = {...}  # input data

def group_func(entry):
    # Each 'Country' value is a one-element set. Group on the string inside it,
    # because sets compare by subset and therefore do not sort reliably.
    return next(iter(entry['Country']))

result = []
for country, items in groupby(sorted(dictionary1.values(), key=group_func), group_func):
    # count the answers of all entries that share this country
    answers = Counter(answer.lower() for i in items for answer in i['Answer'])
    result.append(f'{country}: {sum(answers.values())} {answers.get("yes", 0)} {answers.get("no", 0)}')
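One detail worth spelling out: groupby() only merges adjacent items with equal keys, which is why the input is sorted before grouping. A small sketch on made-up data (the list `countries` and both result names are hypothetical):

```python
from itertools import groupby

countries = ['Hong Kong', 'USA', 'Hong Kong']  # hypothetical, unsorted input

# Without sorting, the two 'Hong Kong' entries are not adjacent,
# so groupby() reports them as two separate groups.
unsorted_groups = [(key, len(list(group))) for key, group in groupby(countries)]

# After sorting, equal keys sit next to each other and are merged.
sorted_groups = [(key, len(list(group))) for key, group in groupby(sorted(countries))]

print(unsorted_groups)  # [('Hong Kong', 1), ('USA', 1), ('Hong Kong', 1)]
print(sorted_groups)    # [('Hong Kong', 2), ('USA', 1)]
```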
Olvin Roght