You can use a collections.defaultdict to group by the first item of each sublist:
from collections import defaultdict
list1 = [['P', '2.1'], ['C', '4.1'], ['S', '4.2'], ['M', '5.3'], ['A', '4.1'], ['C', '3.8'], ['C', '3.9'], ['M', '5.2'], ['A', '4.3'], ['P', '2.3'], ['C', '4.2'], ['A', '4.4']]
d = defaultdict(list)
for x, y in list1:
    d[x].append(float(y))
print(dict(d))
# {'P': [2.1, 2.3], 'C': [4.1, 3.8, 3.9, 4.2], 'S': [4.2], 'M': [5.3, 5.2], 'A': [4.1, 4.3, 4.4]}
print({k: sum(v) for k, v in d.items()})  # values are already floats
# {'P': 4.4, 'C': 16.0, 'S': 4.2, 'M': 10.5, 'A': 12.799999999999999}
If we are only interested in summing the floats, using a defaultdict(float) is probably faster than collecting all the floats in a list and then applying sum():
d = defaultdict(float)
for x, y in list1:
    d[x] += float(y)
print(dict(d))
# {'P': 4.4, 'C': 16.0, 'S': 4.2, 'M': 10.5, 'A': 12.799999999999999}
Also note that defaultdict is a subclass of dict, so the dict() cast is not needed. It is only used here to make the printed output cleaner.
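As a quick check of the subclass point, here is a minimal sketch (the single 'P' entry is just sample data for illustration) showing that a defaultdict passes an isinstance check against dict and compares equal to a plain dict, while only its printed representation differs:

```python
from collections import defaultdict

d = defaultdict(float)
d['P'] += 2.1  # a missing key starts at float() == 0.0

print(isinstance(d, dict))  # True
print(d == {'P': 2.1})      # True
print(d)                    # defaultdict(<class 'float'>, {'P': 2.1})
print(dict(d))              # {'P': 2.1}
```

So the dict() call only matters if you want the plain {...} representation or need to hand over an object that no longer creates missing keys on lookup.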
You could also sort by the first item and then apply itertools.groupby:
from itertools import groupby
list1 = [['P', '2.1'], ['C', '4.1'], ['S', '4.2'], ['M', '5.3'], ['A', '4.1'], ['C', '3.8'], ['C', '3.9'], ['M', '5.2'], ['A', '4.3'], ['P', '2.3'], ['C', '4.2'], ['A', '4.4']]
print({k: [float(x[1]) for x in g] for k, g in groupby(sorted(list1), key=lambda x: x[0])})
# {'A': [4.1, 4.3, 4.4], 'C': [3.8, 3.9, 4.1, 4.2], 'M': [5.2, 5.3], 'P': [2.1, 2.3], 'S': [4.2]}
print({k: sum(float(x[1]) for x in g) for k, g in groupby(sorted(list1), key=lambda x: x[0])})
# {'A': 12.799999999999999, 'C': 16.0, 'M': 10.5, 'P': 4.4, 'S': 4.2}
Both solutions above show how to group and get the sum of the floats. The groupby solution is slower for grouping because it relies on an O(N log N) sort, whereas the defaultdict groups in O(N) time. We could also replace key=lambda x: x[0] with operator.itemgetter(0), which is slightly faster. More information about the speed difference between the two can be found in this answer.
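For reference, a sketch of the groupby version with operator.itemgetter(0) swapped in. The names first and sums are only illustrative. Note one difference from the code above: here the sort also uses the key, so items within a group keep their original order (sorted() is stable), which can change the last digits of the float sums.

```python
from itertools import groupby
from operator import itemgetter

list1 = [['P', '2.1'], ['C', '4.1'], ['S', '4.2'], ['M', '5.3'], ['A', '4.1'], ['C', '3.8'],
         ['C', '3.9'], ['M', '5.2'], ['A', '4.3'], ['P', '2.3'], ['C', '4.2'], ['A', '4.4']]

first = itemgetter(0)  # equivalent to lambda x: x[0]

# Sort by the first item only, then sum the second item of each group.
sums = {k: sum(float(v) for _, v in g)
        for k, g in groupby(sorted(list1, key=first), key=first)}
print(sums)
```

The keys come out in sorted order ('A', 'C', 'M', 'P', 'S'), as in the lambda version.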