Add values of repeated items from list to a dictionary

Question

I have the list

list1 = [['P', '2.1'], ['C', '4.1'], ['S', '4.2'], ['M', '5.3'], ['A', '4.1'], ['C', '3.8'], ['C', '3.9'], ['M', '5.2'], ['A', '4.3'], ['P', '2.3'], ['C', '4.2'], ['A', '4.4']]

and I have to create a dictionary adding the values of repeated items.

So far I have this:

for line in carsList:
     model = line[0]
     carsDict.update({model:[]})
     if model in carsDict:
         carsDict[model].append(float(line[1]))

But when I print it, what I get is: {'P': [2.3], 'C': [4.2], 'S': [4.2], 'M': [5.2], 'A': [4.4]}

Thanks.

What is your expected output? – Nick Apr 08 '20 at 07:41

Nemo Zhang · Answer 1 · 2020-04-08T06:30:09.467

You reset the value to an empty list on every iteration, so you only get the last element for each key.

for line in carsList:
     model = line[0]
     carsDict.update({model:[]}) # It clears out your list every time!
     if model in carsDict:
         carsDict[model].append(float(line[1]))

This is what you should do. You create a new list when it does not exist, and append to it if there is already a list.

carsDict = {}
for line in carsList:
    model = line[0]
    if model not in carsDict:
        carsDict[model] = []  # first time this model is seen
    carsDict[model].append(float(line[1]))
# {'P': [2.1, 2.3], 'C': [4.1, 3.8, 3.9, 4.2], 'S': [4.2], 'M': [5.3, 5.2], 'A': [4.1, 4.3, 4.4]}

If you want to calculate a sum of all the floats:

carsDict = {}
for line in carsList:
    model = line[0]
    if model not in carsDict:
        carsDict[model] = 0  # start the running total for a new model
    carsDict[model] += float(line[1])
# {'P': 4.4, 'C': 16.0, 'S': 4.2, 'M': 10.5, 'A': 12.799999999999999}
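
For comparison, here is a minimal self-contained variant of the summing loop that uses dict.get instead of the membership test. It assumes the list from the question is named list1; the name totals is made up for this sketch.

```python
list1 = [['P', '2.1'], ['C', '4.1'], ['S', '4.2'], ['M', '5.3'], ['A', '4.1'], ['C', '3.8'],
         ['C', '3.9'], ['M', '5.2'], ['A', '4.3'], ['P', '2.3'], ['C', '4.2'], ['A', '4.4']]

totals = {}
for model, value in list1:
    # dict.get returns the running total, or 0.0 when the model has not been seen yet
    totals[model] = totals.get(model, 0.0) + float(value)

print(totals)
```

The keys come out in first-seen order (P, C, S, M, A) on Python 3.7 and later, where dicts preserve insertion order.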

score 1 · Answer 2 · answered Apr 08 '20 at 06:28

Use dict.setdefault

Ex:

list1 = [['P', '2.1'], ['C', '4.1'], ['S', '4.2'], ['M', '5.3'], ['A', '4.1'], ['C', '3.8'], ['C', '3.9'], ['M', '5.2'], ['A', '4.3'], ['P', '2.3'], ['C', '4.2'], ['A', '4.4']]
result = {}

for k, v in list1:
    result.setdefault(k, []).append(v)

print(result)

or collections.defaultdict

from collections import defaultdict
result = defaultdict(list)
for k, v in list1:
    result[k].append(v)

Output:

{'A': ['4.1', '4.3', '4.4'],
 'C': ['4.1', '3.8', '3.9', '4.2'],
 'M': ['5.3', '5.2'],
 'P': ['2.1', '2.3'],
 'S': ['4.2']}
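
Note that the output above holds strings, because v is appended unchanged. If numbers are wanted, a small sketch of the same setdefault loop with a float() conversion (assuming every second item parses as a float) could look like this:

```python
list1 = [['P', '2.1'], ['C', '4.1'], ['S', '4.2'], ['M', '5.3'], ['A', '4.1'], ['C', '3.8'],
         ['C', '3.9'], ['M', '5.2'], ['A', '4.3'], ['P', '2.3'], ['C', '4.2'], ['A', '4.4']]

result = {}
for k, v in list1:
    # setdefault returns the existing list for k, or stores and returns a new empty one
    result.setdefault(k, []).append(float(v))

print(result)
```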

score 1 · Answer 3 · answered Apr 08 '20 at 06:29

Every time you run through the for loop, you reset the value stored for model in carsDict to an empty list.

What you can do is:

for line in carsList:
    model = line[0]
    if model in carsDict:
        carsDict[model].append(float(line[1]))
    else:
        carsDict[model] = [float(line[1])]
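
The reset described above can be reproduced in isolation. This is only an illustrative sketch; the dictionary name cars and the single key 'C' are made up for the demonstration.

```python
cars = {}

cars.update({'C': []})   # first time: creates the key with an empty list
cars['C'].append(4.1)
print(cars)              # {'C': [4.1]}

cars.update({'C': []})   # same key again: the old list is replaced by a new empty one
print(cars)              # {'C': []}
```

Because update() overwrites the value for an existing key, anything appended earlier is lost, which is why only the last value per key survived in the question's output.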

RoadRunner · Answer 4 · 2020-04-08T07:49:55.617

You can use a collections.defaultdict to group by the first item of each sublist:

from collections import defaultdict

list1 = [['P', '2.1'], ['C', '4.1'], ['S', '4.2'], ['M', '5.3'], ['A', '4.1'], ['C', '3.8'], ['C', '3.9'], ['M', '5.2'], ['A', '4.3'], ['P', '2.3'], ['C', '4.2'], ['A', '4.4']]

d = defaultdict(list)
for x, y in list1:
    d[x].append(float(y))

print(dict(d))
# {'P': [2.1, 2.3], 'C': [4.1, 3.8, 3.9, 4.2], 'S': [4.2], 'M': [5.3, 5.2], 'A': [4.1, 4.3, 4.4]}

print({k: sum(map(float, v)) for k, v in d.items()})
# {'P': 4.4, 'C': 16.0, 'S': 4.2, 'M': 10.5, 'A': 12.799999999999999}

If we are only interested in summing the floats, using a defaultdict(float) is probably faster than collecting all the floats in a list and then applying sum():

d = defaultdict(float)
for x, y in list1:
    d[x] += float(y)

print(dict(d))
# {'P': 4.4, 'C': 16.0, 'S': 4.2, 'M': 10.5, 'A': 12.799999999999999}

Also note that defaultdict is a subclass of dict, so the dict() cast is not needed.
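
A short sketch of what that means in practice. The keys 'Z' and 'Q' are made up for illustration. One thing to watch for is that indexing a missing key on a defaultdict inserts it, while .get() does not:

```python
from collections import defaultdict

d = defaultdict(float)
d['P'] += 2.1

print(isinstance(d, dict))  # True: a defaultdict can be used wherever a dict is expected
print(len(d))               # 1

missing = d['Z']            # indexing a missing key calls float() and stores 0.0
print(len(d))               # 2

print(d.get('Q'))           # None: .get() does not insert anything
print(len(d))               # 2
```

Printing a defaultdict directly shows its defaultdict(...) wrapper in the repr, so dict(d) is mainly useful for tidier display.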

You could also sort by the first item then apply itertools.groupby:

from itertools import groupby

list1 = [['P', '2.1'], ['C', '4.1'], ['S', '4.2'], ['M', '5.3'], ['A', '4.1'], ['C', '3.8'], ['C', '3.9'], ['M', '5.2'], ['A', '4.3'], ['P', '2.3'], ['C', '4.2'], ['A', '4.4']]

print({k: [float(x[1]) for x in g] for k, g in groupby(sorted(list1), key=lambda x: x[0])})
# {'A': [4.1, 4.3, 4.4], 'C': [3.8, 3.9, 4.1, 4.2], 'M': [5.2, 5.3], 'P': [2.1, 2.3], 'S': [4.2]}

print({k: sum(float(x[1]) for x in g) for k, g in groupby(sorted(list1), key=lambda x: x[0])})
# {'A': 12.799999999999999, 'C': 16.0, 'M': 10.5, 'P': 4.4, 'S': 4.2}

Both solutions above show how to group the floats and how to sum them. The groupby solution is slower for grouping because it first sorts in O(N log N) time, whereas the defaultdict groups in O(N) time. We could also replace key=lambda x: x[0] with key=operator.itemgetter(0), which is slightly faster. More information about the speed difference between the two in this answer.
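
A sketch of the itemgetter variant mentioned above. Unlike the sorted(list1) calls earlier, this version sorts by the key only, which is a choice made for this sketch; since sorted() is stable, the values in each group then keep their original input order.

```python
from itertools import groupby
from operator import itemgetter

list1 = [['P', '2.1'], ['C', '4.1'], ['S', '4.2'], ['M', '5.3'], ['A', '4.1'], ['C', '3.8'],
         ['C', '3.9'], ['M', '5.2'], ['A', '4.3'], ['P', '2.3'], ['C', '4.2'], ['A', '4.4']]

first = itemgetter(0)  # same as lambda x: x[0]

# groupby only merges adjacent items, so equal keys must be next to each other first
rows = sorted(list1, key=first)

grouped = {k: [float(v) for _, v in g] for k, g in groupby(rows, key=first)}
print(grouped)
```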

Nick · Accepted Answer · 2020-04-08T06:45:38.947

I've interpreted your question as meaning you want to sum the values for each key. There are a couple of ways you can achieve this. Using a defaultdict is probably the simplest:

from collections import defaultdict

carsDict = defaultdict(float)

for l in list1:
    carsDict[l[0]] += float(l[1])

print({ k : v for k, v in carsDict.items() })

but you can also implement it as a dictionary comprehension:

carsDict = { k : sum(float(l[1]) for l in list1 if l[0] == k) for k in set(l[0] for l in list1) }
print(carsDict)

In both cases the output is (the key order may differ):

{'C': 16.0, 'M': 10.5, 'S': 4.2, 'P': 4.4, 'A': 12.799999999999999}
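
The 12.799999999999999 for 'A' is ordinary binary floating-point rounding, not a bug in either approach. If exact decimal totals matter, one option is to parse the strings with decimal.Decimal instead of float. This is a sketch of an alternative, not part of the approaches above, and the name totals is made up:

```python
from collections import defaultdict
from decimal import Decimal

list1 = [['P', '2.1'], ['C', '4.1'], ['S', '4.2'], ['M', '5.3'], ['A', '4.1'], ['C', '3.8'],
         ['C', '3.9'], ['M', '5.2'], ['A', '4.3'], ['P', '2.3'], ['C', '4.2'], ['A', '4.4']]

totals = defaultdict(Decimal)  # Decimal() with no arguments is Decimal('0')

for model, value in list1:
    # Decimal parses the string exactly, so no binary rounding error accumulates
    totals[model] += Decimal(value)

print({k: str(v) for k, v in totals.items()})
```

If floats are fine and only the display is a concern, rounding when printing, for example round(v, 1), is a lighter alternative.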

edited Apr 08 '20 at 06:45

answered Apr 08 '20 at 06:33

Nick

Yeah I also recommended `defaultdict(float)`. +1 – RoadRunner Apr 08 '20 at 07:44
I have asked OP to clarify the expected result as this entire answer could be wrong. Your answer is the best here though (it got my vote anyway) as it covers both bases. – Nick Apr 08 '20 at 07:45
Yeah I was unsure either. Thought it wouldn't hurt to include both cases. Hopefully the OP can clarify :) – RoadRunner Apr 08 '20 at 07:48
