3

First of all, this question is related but does not solve my issue: "Python sum dict values based on keys".

I have a dict (DICT) like this:

{
...
"httpXYZ_ACTION1": [10, 0],
"http123_ITEM1": [0.055, 0.0875],
"http456_ACTION1": [0.01824, 0.066667],
"httpABC_ITEM2": [1214.666667, 1244.195833],
"http999_ACTION2": [null, 213],
...
}

My desired outcome is a dict like this:

{
...
"_ACTION1": [summed up values for all _ACTION1 on any http]
"_ITEM1": [summed up values for all _ITEM1 on any http]
...
}

and so on :-)

This is what I tried:

from itertools import chain
sum(filter(None, chain(*[value for key, value in DICT.items() if key.endswith(('_ACTION1', '_ACTION2', '_ITEM1'))])))

but it obviously just sums everything up into one single number.

Obre1

5 Answers

1
inDict = {
    "httpXYZ_ACTION1": [10, 0],
    "http123_ITEM1": [0.055, 0.0875],
    "http456_ACTION1": [0.01824, 0.066667],
    "httpABC_ITEM2": [1214.666667, 1244.195833],
    "http999_ACTION2": [None, 213],
}
# Collect the distinct suffixes, e.g. '_ACTION1' (assumes one underscore per key)
outDictKeys = set('_' + x.split('_')[1] for x in inDict)
outDict = {}
for outKey in outDictKeys:
    total = 0
    for inKey in inDict:
        if inKey.endswith(outKey):
            # Skip None entries so they count as zero
            total = total + sum(x for x in inDict[inKey] if x is not None)
    outDict[outKey] = total
print(outDict)

Output when run in Python 3:

{'_ITEM1': 0.1425, '_ITEM2': 2458.8625, '_ACTION2': 213, '_ACTION1': 10.084907}

Note that I treated your null value as None and skipped it, so it counts as zero. It's up to you how it should be summed.

Emilio M Bumachar
  • 1
    Thank you all for your help. This answer solved my issue; the others did not because importing defaultdict is not possible on the server. – Obre1 Nov 20 '15 at 11:12
  • My next question: where in this construct can I multiply every value of every key by 60? – Obre1 Nov 20 '15 at 11:31
  • Change the next-to-last line to "outDict[outKey]=total*60". This produces {'_ACTION2': 12780, '_ACTION1': 605.09442, '_ITEM1': 8.549999999999999, '_ITEM2': 147531.75} – Emilio M Bumachar Nov 20 '15 at 11:35
  • Thank you for the input once again! But I found that out by myself just after I asked here. Python is cool :-) – Obre1 Nov 20 '15 at 14:34
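Since the comments above mention that defaultdict cannot be imported on the server and ask about scaling by 60, here is a minimal single-pass sketch that uses only a plain dict and dict.get. It assumes the suffix starts at the first underscore and that None means "no value"; the names sum_by_suffix and factor are made up for this illustration and are not part of the answer above.

```python
def sum_by_suffix(data, factor=1):
    """Sum the values of all keys sharing a suffix (from the first '_' on).

    None entries are skipped; each total is multiplied by `factor`.
    `sum_by_suffix` and `factor` are hypothetical names for this sketch.
    """
    totals = {}
    for key, values in data.items():
        suffix = key[key.find("_"):]  # "httpXYZ_ACTION1" -> "_ACTION1"
        subtotal = sum(x for x in values if x is not None)
        # dict.get supplies 0 the first time a suffix is seen
        totals[suffix] = totals.get(suffix, 0) + subtotal
    return {suffix: total * factor for suffix, total in totals.items()}


in_dict = {
    "httpXYZ_ACTION1": [10, 0],
    "http999_ACTION2": [None, 213],
}
print(sum_by_suffix(in_dict))             # {'_ACTION1': 10, '_ACTION2': 213}
print(sum_by_suffix(in_dict, factor=60))  # {'_ACTION1': 600, '_ACTION2': 12780}
```

This visits each key once instead of once per suffix. Keys without any underscore are not handled specially here and would need their own check.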
0
from collections import defaultdict

# Named data rather than input, to avoid shadowing the built-in input()
data = {
  "httpXYZ_ACTION1": [10, 0],
  "http123_ITEM1": [0.055, 0.0875],
  "http456_ACTION1": [0.01824, 0.066667],
  "httpABC_ITEM2": [1214.666667, 1244.195833],
  "http999_ACTION2": [None, 213],
}

output = defaultdict(float)
for k, v in data.items():
  # Keep everything after the first underscore and restore the leading '_'
  key = '_' + k.partition('_')[2]
  # Add only numeric entries; None is skipped
  output[key] += sum(float(val) for val in v if isinstance(val, (int, float)))

print(output)
gahooa
  • 2
    why `partition('_')[2]` instead of `split('_')[1]`? They're obviously equivalent, but the latter seems simpler to read, and more idiomatic, since you're just throwing away the underscore anyway. – reynoldsnlp Nov 19 '15 at 18:28
  • I think the OP wants to keep the underscore. – NotAnAmbiTurner Nov 19 '15 at 18:29
  • @bebop: you are not correct. `partition` is exactly what we are doing. `split()` requires a 2nd argument of the number of splits if you want to limit it, while partition does that implicitly. In python it's often about using the most clear tool for the job, and in this case, this is the kind of use case `partition` was created for. – gahooa Nov 19 '15 at 18:32
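For the keys in the question, which contain exactly one underscore, partition and split give the same suffix. They differ once a key has more than one underscore or none at all. A small sketch with made-up keys (not taken from the question) shows the difference:

```python
key = "http_proxy_ACTION1"  # hypothetical key with two underscores

# partition cuts once, at the first separator, and always returns a 3-tuple
print(key.partition("_"))        # ('http', '_', 'proxy_ACTION1')
print(key.partition("_")[2])     # proxy_ACTION1

# split without maxsplit cuts at every underscore, so index 1 loses the tail
print(key.split("_"))            # ['http', 'proxy', 'ACTION1']
print(key.split("_")[1])         # proxy

# split with maxsplit=1 matches partition when the separator is present
print(key.split("_", 1)[1])      # proxy_ACTION1

# with no underscore, partition returns an empty tail,
# whereas split("_")[1] would raise IndexError
print("httpXYZ".partition("_"))  # ('httpXYZ', '', '')
```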
0

I don't know where null comes from, but you can use str.find to slice off the suffix starting at the first underscore, and a defaultdict to accumulate repeated keys (d is the dict from the question):

from collections import defaultdict
dd = defaultdict(float)

for k, v in d.items():
    dd[k[k.find("_"):]] += sum(v)

print(dd)

defaultdict(<class 'float'>, {'_ITEM1': 0.1425, '_ACTION1': 10.084907, '_ACTION2': 213.0, '_ITEM2': 2458.8625})

If null is actually None, filter it out:

dd[k[k.find("_"):]] += sum(filter(None, v))

Or keep only numbers:

import numbers

dd[k[k.find("_"):]] += sum(i for i in v if isinstance(i, numbers.Number))
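The two filters are not interchangeable for arbitrary input. A short sketch with a made-up value list (not from the question) shows what each one keeps:

```python
import numbers

v = [None, 0, 2.5, "7", True]  # made-up values to show the differences

# filter(None, ...) drops every falsy item: None, but also 0 (harmless in a sum).
# The string "7" is kept, so calling sum() on this result would raise TypeError.
print(list(filter(None, v)))                               # [2.5, '7', True]

# The isinstance check keeps only numbers; bool counts as a Number (True == 1).
print([i for i in v if isinstance(i, numbers.Number)])     # [0, 2.5, True]
print(sum(i for i in v if isinstance(i, numbers.Number)))  # 3.5

# An explicit None test drops only None and keeps everything else.
print([i for i in v if i is not None])                     # [0, 2.5, '7', True]
```

For lists that hold only numbers and None, as in the question, all three approaches give the same sum.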
Brent Washburne
Padraic Cunningham
0

That's a job for collections.defaultdict: an entry such as sums['_ACTION1'] must already hold an initialized float before we can += something to it. A defaultdict creates that entry on first access, whereas a built-in dict would need an explicit check or initialization for every key.

#!/usr/bin/env python3
from collections import defaultdict

DICT = {
    "httpXYZ_ACTION1": [10, 0],
    "http123_ITEM1": [0.055, 0.0875],
    "http456_ACTION1": [0.01824, 0.066667],
    "httpABC_ITEM2": [1214.666667, 1244.195833],
    "http999_ACTION2": [None, 213],
}

sums = defaultdict(lambda: 0.)

# with python2 better use
# for (k, l) in DICT.iteritems():
for (k, l) in DICT.items():
    sums[k[k.find("_"):]] += sum(x for x in l if x is not None)

for pair in sums.items():
    print(pair)

output:

('_ITEM1', 0.1425)
('_ITEM2', 2458.8625)
('_ACTION2', 213.0)
('_ACTION1', 10.084907)
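To illustrate the point about initialization, here is a minimal sketch of what happens with a plain dict compared with a defaultdict. It uses defaultdict(float), which yields the same 0.0 default as the lambda above:

```python
from collections import defaultdict

plain = {}
try:
    plain["_ACTION1"] += 10  # += reads the missing key first, so this fails
except KeyError as exc:
    print("KeyError:", exc)  # KeyError: '_ACTION1'

sums = defaultdict(float)    # float() returns 0.0, same default as lambda: 0.
sums["_ACTION1"] += 10       # missing key is created as 0.0, then incremented
print(sums["_ACTION1"])      # 10.0
print(dict(sums))            # {'_ACTION1': 10.0}
```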
decltype_auto
-1

A solution:

null = 0  # null is not defined in Python; bind it to 0 so it adds nothing to the sums

data = {
    "httpXYZ_ACTION1": [10, 0],
    "http123_ITEM1": [0.055, 0.0875],
    "http456_ACTION1": [0.075, 0.066667],
    "httpABC_ITEM2": [14.666667, 12.195833],
    "http999_ACTION2": [null, 2],
}

# The category is the part after the last underscore, e.g. "ACTION1"
categories = set(c.split('_')[-1] for c in data)
sums = {k: 0 for k in categories}

for k, v in data.items():
    key = k.split('_')[-1]
    sums[key] += sum(v)  # add this key's values to its category total

print(sums)
Dan