0

As an example I have a dictionary as follows:

mydict = {'A':['asdasd',2,3], 'B':['asdasd',4,5], 'C':['rgerg',9,10]}

How can I get a single number that is the weighted sum over all the keys in the dict (in this example A, B, C), where each list's last number (list[-1]) is weighted by its second number (list[-2]) divided by the sum of those second numbers?

So for instance, this would be the sum of:

(2/15)*3 + (4/15)*5 + (9/15)*10 = 7.73

Where 15 is the sum of 2,4,9.

At the moment I'm creating too many variables that just iterate through the dictionary. I'm sure there's a quick efficient Python way to do it.

double-beep
FancyDolphin

2 Answers

2

Since (2/15)*3 + (4/15)*5 + (9/15)*10 is the same as (2*3 + 4*5 + 9*10)/15, you don't have to pre-calculate the total. You can therefore do this in one pass over the dict with reduce, though it is perhaps less readable.
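A quick way to check that identity is to compute both forms on the example data. This is only a sketch, assuming Python 3 division and string keys; the names `weighted_terms` and `single_division` are illustrative and not from the question.

```python
d = {'A': ['asdasd', 2, 3], 'B': ['asdasd', 4, 5], 'C': ['rgerg', 9, 10]}

# Sum of the weights (the middle number of each list): 2 + 4 + 9
total = sum(w for _, w, v in d.values())

# Form 1: normalise each weight first, then multiply by the value
weighted_terms = sum((w / total) * v for _, w, v in d.values())

# Form 2: accumulate weight * value, then divide once at the end
single_division = sum(w * v for _, w, v in d.values()) / total

print(weighted_terms, single_division)
```

The two results can differ in the last floating-point digit because the first form rounds at every term, which is one more reason to prefer the single division.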

Turning the dict into a series of tuples:

>>> d = {'A':['asdasd',2,3], 'B':['asdasd',4,5], 'C':['rgerg',9,10]}
>>> from functools import reduce    # Py3
>>> x,y = reduce(lambda t, e: (t[0]+e[0], t[1]+e[1]), ((a*b, a) for _,a,b in d.values()))
>>> x/y   # float(y) in Py2
7.733333333333333

Or without the intermediate generator and using an initial value (probably the fastest):

>>> x,y = reduce(lambda t, e: (t[0]+e[1]*e[2], t[1]+e[1]), d.values(), (0,0))
>>> x/y   # float(y) in Py2
7.733333333333333

Or you could zip up the results and sum:

>>> x,y = (sum(x) for x in zip(*((a*b, a) for _,a,b in d.values())))
>>> x/y   # float(y) in Py2
7.733333333333333

Or you could do what @ranlot suggests though I would avoid the intermediate lists:

>>> sum(a*b for _,a,b in d.values())/sum(a for _,a,b in d.values())
7.733333333333333

Though this seems to be fractionally slower than either of the ones above.
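If you want to compare the variants on your own data, a rough `timeit` harness along these lines should do. It is a sketch only: the function names are made up for illustration, and the relative timings will depend on the Python version and on the size of the dict, so treat any ranking as specific to your setup.

```python
from functools import reduce
import timeit

d = {'A': ['asdasd', 2, 3], 'B': ['asdasd', 4, 5], 'C': ['rgerg', 9, 10]}

def reduce_pairs(d):
    # reduce over (weight*value, weight) tuples built by a generator
    return reduce(lambda t, e: (t[0] + e[0], t[1] + e[1]),
                  ((a * b, a) for _, a, b in d.values()))

def reduce_initial(d):
    # reduce directly over the lists, starting from (0, 0)
    return reduce(lambda t, e: (t[0] + e[1] * e[2], t[1] + e[1]),
                  d.values(), (0, 0))

def zip_sum(d):
    # transpose the tuples with zip, then sum each column
    return tuple(sum(col)
                 for col in zip(*((a * b, a) for _, a, b in d.values())))

def two_sums(d):
    # two separate passes over the values
    return (sum(a * b for _, a, b in d.values()),
            sum(a for _, a, b in d.values()))

variants = [reduce_pairs, reduce_initial, zip_sum, two_sums]
for f in variants:
    seconds = timeit.timeit(lambda: f(d), number=10_000)
    print(f"{f.__name__:15s} {seconds:.4f}s")
```

Each function returns the pair (numerator, denominator), so the weighted average is the first element divided by the second.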

AChampion
  • Thank you. I really like the first option you gave me; it was the quickest method with the data I'm using. This is what I was after. ranlot gave an approach similar to the one I had gone with, just done a lot more succinctly. I'll repeat the question I asked ranlot: I'm creating the dictionary on the fly for other purposes and I've just added these variables. For the sake of speed, do you think I should be saving them somewhere other than in the dict? – FancyDolphin Jan 07 '16 at 09:12
  • Without more details it's hard to tell. You could perhaps avoid adding the values to the dict and then calculating, by just calculating the results on the fly. – AChampion Jan 07 '16 at 14:22
  • Out of curiosity, I've been playing with it and can't figure it out. Syntax-wise, how would I include an if statement in your reduce lambda method, say for e[0]=='asdasd'? – FancyDolphin Jan 08 '16 at 01:52
  • I've tried something like: x,y = reduce(lambda t, e: True if e[0]=='asdasd' (t[0]+e[1]*e[2], t[1]+e[1]) else False, dict.values(), (0,0)) – FancyDolphin Jan 08 '16 at 02:04
  • I'm not quite sure what you are trying to achieve, it might be worth asking a separate question describing the problem and your expected output. Are you just trying to `filter` on 'asdasd'? – AChampion Jan 08 '16 at 06:27
  • Yeah, I just wanted to filter, but if you think I should ask another question I will. With ranlot's example it would be the equivalent of: `normConst = sum(dict[x][-2] for x in dict if dict[x][0]=='asdasd')` followed by `print sum(dict[x][-1]*dict[x][-2] / float(normConst) for x in dict if dict[x][0]=='asdasd')` – FancyDolphin Jan 08 '16 at 06:47
  • `x, y = reduce(lambda v, e: (v[0]+e[1]*e[2], v[1]+e[1]), filter(lambda e: e[0] == 'asdasd', d.values()), (0,0))` – AChampion Jan 08 '16 at 06:51
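For readers following the filter discussion in the comments above, here is one way it could be wrapped up as a function. This is a sketch, not the answerer's code: the helper name `weighted_average` and its `label` parameter are hypothetical, and it uses a generator expression in place of `filter`, which should behave the same here.

```python
from functools import reduce

d = {'A': ['asdasd', 2, 3], 'B': ['asdasd', 4, 5], 'C': ['rgerg', 9, 10]}

def weighted_average(rows, label=None):
    """Average of row[2] weighted by row[1].

    If label is given, only rows whose row[0] equals it are used.
    """
    kept = (row for row in rows if label is None or row[0] == label)
    num, den = reduce(lambda t, e: (t[0] + e[1] * e[2], t[1] + e[1]),
                      kept, (0, 0))
    if den == 0:
        # Nothing matched the label, or the weights cancel out
        raise ValueError("no matching rows, or weights sum to zero")
    return num / den

# Only 'A' and 'B' carry the label: (2*3 + 4*5) / (2 + 4)
print(weighted_average(d.values(), 'asdasd'))
```

The explicit zero check matters once you filter, because a label that matches nothing would otherwise end in a `ZeroDivisionError`.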
1
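myDict = {'A':['asdasd', 2, 3], 'B':['asdasd', 4, 5], 'C':['rgerg', 9, 10]}
# Normalisation constant: the sum of the weights (second-to-last item of each list)
normConst = sum([myDict[x][-2] for x in myDict])
# Weighted sum of the last items
sum([myDict[x][-1]*myDict[x][-2] / float(normConst) for x in myDict])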
ranlot
  • you could use `sum(..)` instead of `sum([..])`. You could use `for v in some_dict.values()`, to get values. Don't use `dict` as a variable name, it shadows the builtin. – jfs Jan 07 '16 at 06:34
  • A question about this: in terms of speed, I'm still essentially having to iterate over the dictionary, like when I use iteritems for example. Since I'm creating the dictionary, should I be storing the 2 and 3's in a vector as I read down the file that is creating the dictionary/list, or is there another way, like a running tally, that would be faster? – FancyDolphin Jan 07 '16 at 06:38
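The running tally asked about in the last comment could look roughly like this. It is a sketch under loud assumptions: the input format (one `key,label,weight,value` record per line) and the in-memory `sample` file are hypothetical, since the thread never shows the real file.

```python
import io

# Hypothetical input standing in for the real file:
# one "key,label,weight,value" record per line.
sample = io.StringIO("A,asdasd,2,3\nB,asdasd,4,5\nC,rgerg,9,10\n")

mydict = {}
weighted_sum = 0   # running sum of weight * value
weight_total = 0   # running sum of the weights

for line in sample:
    key, label, weight, value = line.strip().split(",")
    weight, value = int(weight), int(value)
    mydict[key] = [label, weight, value]   # still build the dict for other uses
    weighted_sum += weight * value
    weight_total += weight

print(weighted_sum / weight_total)
```

This avoids a second pass over the dict, but it assumes each key appears only once in the input; if a key can be overwritten, the tallies would need to subtract the old entry first.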