Weighted averaging a list

Question

Thanks for your responses. Yes, I was looking for the weighted average.

rate = [14.424, 14.421, 14.417, 14.413, 14.41]

amount = [3058.0, 8826.0, 56705.0, 30657.0, 12984.0]

I want the weighted average of the top list based on each item of the bottom list.

So, if the first bottom-list item is small (such as 3,058 compared to the total 112,230), then the first top-list item should have less of an effect on the top-list average.
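To make the intended weighting concrete, here is a small sketch that converts each amount into its share of the total and lets every rate contribute in proportion to that share. The names `total`, `shares`, and `weighted` are introduced only for illustration and are not part of the original post.

```python
rate = [14.424, 14.421, 14.417, 14.413, 14.41]
amount = [3058.0, 8826.0, 56705.0, 30657.0, 12984.0]

total = sum(amount)  # 112230.0

# Each amount's share of the total; the shares add up to 1 (up to rounding).
shares = [a / total for a in amount]

# Each rate contributes in proportion to its share of the total amount.
weighted = sum(r * s for r, s in zip(rate, shares))

print(round(shares[0], 4))  # share of the first item: 0.0272
```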

Here is some of what I have tried. It gives me an answer that looks right, but I am not sure if it follows what I am looking for.

for g in range(len(rate)):
    rate[g] = rate[g] * (amount[g] / sum(amount))
rate = sum(rate)

EDIT: After comparing the other responses with my code, I decided to use the zip version to keep it as short as possible.

Do you mean [weighted average](http://en.wikipedia.org/wiki/Weighted_arithmetic_mean)? — , Mar 29 '15 at 15:11
@Pyson None of these lists seem to have a sum of 100 percent, so I'm not sure about that. — Malik Brahimi, Mar 29 '15 at 15:14
If you are looking for a weighted average as @Pyson mentioned, a good idea is to normalise the second vector, and apply the w.a algorithm — srj, Mar 29 '15 at 15:18

score 52 · Answer 1 · answered Mar 29 '15 at 15:28

You could use numpy.average to calculate weighted average.

In [13]: import numpy as np

In [14]: rate = [14.424, 14.421, 14.417, 14.413, 14.41]

In [15]: amount = [3058.0, 8826.0, 56705.0, 30657.0, 12984.0]

In [17]: weighted_avg = np.average(rate, weights=amount)

In [19]: weighted_avg
Out[19]: 14.415602815646439

answered Mar 29 '15 at 15:28

Akavall


Thanks, but I am trying to use included 2.7.9 libraries. – Rontron Mar 29 '15 at 15:40
NumPy is a _de facto_ standard library – Mayou36 Nov 18 '20 at 18:21
*De facto* is subjective. NumPy is **not** part of the standard library. – John Scolaro Oct 13 '22 at 03:54

score 28 · Accepted Answer · edited May 13 '19 at 14:48

for g in range(len(rate)):
    rate[g] = rate[g] * amount[g] / sum(amount)
rate = sum(rate)

is the same as:

sum(rate[g] * amount[g] / sum(amount) for g in range(len(rate)))

which is the same as:

sum(rate[g] * amount[g] for g in range(len(rate))) / sum(amount)

which is the same as:

sum(x * y for x, y in zip(rate, amount)) / sum(amount)

Result:

14.415602815646439
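The rewrites above rely on `sum(amount)` being a constant that can be factored out of the sum. The following sketch, using the question's data, checks that the loop form and the `zip` form agree up to floating-point rounding. The names `total`, `loop_form`, and `zip_form` are assumptions made for this illustration, and the loop accumulates into a separate variable so that `rate` is not overwritten.

```python
rate = [14.424, 14.421, 14.417, 14.413, 14.41]
amount = [3058.0, 8826.0, 56705.0, 30657.0, 12984.0]

total = sum(amount)  # computed once instead of on every iteration

# Loop form from the question, accumulating into a separate variable
# instead of overwriting the entries of `rate`.
loop_form = 0.0
for g in range(len(rate)):
    loop_form += rate[g] * (amount[g] / total)

# zip form: sum of products, divided once by the total weight.
zip_form = sum(x * y for x, y in zip(rate, amount)) / total

# The two results may differ in the last few bits, so compare with a tolerance.
print(abs(loop_form - zip_form) < 1e-12)  # True
```

Computing the total once also avoids calling `sum(amount)` on every pass through the loop.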

edited May 13 '19 at 14:48

Paul


answered Mar 29 '15 at 15:19

JuniorCompressor


Thanks, this worked. The one highlighted in yellow gave me a syntax error though. – Rontron Mar 29 '15 at 15:41
I tried it again, it worked this time. I probably copied something extra on the page accidentally. I will use your yellow-highlighted code. Thanks! – Rontron Mar 29 '15 at 15:46
I would highly discourage this answer in favor of the other proposed `np.average`: numpy belongs de facto to the standard library. And then, if we already have an implementation, let's not reinvent the wheel (not even talking about the speed) – Mayou36 Nov 18 '20 at 18:25

score 9 · Answer 3 · answered Mar 29 '15 at 15:17

This looks like a weighted average.

values = [1, 2, 3, 4, 5]
weights = [2, 8, 50, 30, 10]

s = 0
for x, y in zip(values, weights):
    s += x * y

average = s / float(sum(weights))  # float() avoids integer division on Python 2
print(average)  # 3.38

This outputs 3.38, which indeed tends more toward the values with the highest weights.

edited Dec 07 '15 at 05:05

answered Mar 29 '15 at 15:17

maahl


score 2 · Answer 4 · answered Mar 29 '15 at 15:25

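Let's use Python's built-in zip function. The description below is from the Python 2 documentation; in Python 3, zip returns an iterator rather than a list.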

zip([iterable, ...])

This function returns a list of tuples, where the i-th tuple contains the i-th element from each of the argument sequences or iterables. The returned list is truncated in length to the length of the shortest argument sequence. When there are multiple arguments which are all of the same length, zip() is similar to map() with an initial argument of None. With a single sequence argument, it returns a list of 1-tuples. With no arguments, it returns an empty list.

weights = [14.424, 14.421, 14.417, 14.413, 14.41]
values = [3058.0, 8826.0, 56705.0, 30657.0, 12984.0]
weighted_average = sum(weight * value for weight, value in zip(weights, values)) / sum(weights)

You have the weights and values swapped. I want the 14.000 values to be weighted based on thousand values. — Rontron, Mar 29 '15 at 15:27
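As the comment points out, the rates are the values being averaged and the amounts are the weights. A sketch with the roles matched to the question follows. The length check is an added assumption, prompted by the fact that `zip` silently truncates to the shortest input, and `weighted_mean` is a hypothetical helper name that does not appear in the thread.

```python
def weighted_mean(values, weights):
    """Weighted mean of `values`; hypothetical helper, not from the thread."""
    # zip() stops at the shortest input, so a length mismatch would
    # otherwise go unnoticed and produce a misleading result.
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    # float() keeps the division exact on Python 2 when inputs are integers.
    return sum(v * w for v, w in zip(values, weights)) / float(sum(weights))


values = [14.424, 14.421, 14.417, 14.413, 14.41]      # the rates being averaged
weights = [3058.0, 8826.0, 56705.0, 30657.0, 12984.0]  # the amounts used as weights

print(weighted_mean(values, weights))
```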

score 0 · Answer 5 · answered Nov 22 '21 at 23:20

As a documented and tested function:

def weighted_average(values, weights=None):
    """
    Returns the weighted average of `values` with weights `weights`
    Returns the simple arithmetic average if `weights` is None.
    >>> weighted_average([3, 9], [1, 2])
    7.0
    >>> 7 == (3*1 + 9*2) / (1 + 2)
    True
    """
    if weights is None:
        weights = [1] * len(values)
    normalization = 0
    val = 0
    for value, weight in zip(values, weights):
        val += value * weight
        normalization += weight
    return val / normalization

For completeness, here is another version in which the values and weights are stored as tuples:

def weighted_average(values_and_weights):
    """
    The input is expected in the form:
        [(value_1, weight_1), (value_2, weight_2), ..., (value_n, weight_n)]
    >>> weighted_average([(3,1), (9,2)])
    7.0
    >>> 7 == (3*1 + 9*2) / (1 + 2)
    True

    """
    normalization = 0
    val = 0
    for value, weight in values_and_weights:
        val += value * weight
        normalization += weight
    return val / normalization
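Both versions divide by the accumulated weight, so an empty input or weights that sum to zero raise `ZeroDivisionError`. One possible way to surface that case more explicitly is sketched below. The name `checked_weighted_average` is hypothetical, and raising `ValueError` is a design assumption rather than something the answer specifies.

```python
def checked_weighted_average(values_and_weights):
    """Variant of the tuple-based version with an explicit zero-weight check.

    Hypothetical helper for illustration; expects (value, weight) pairs.
    """
    normalization = 0
    val = 0
    for value, weight in values_and_weights:
        val += value * weight
        normalization += weight
    if normalization == 0:
        # Covers both an empty input and weights that cancel out to zero.
        raise ValueError("weights sum to zero; weighted average is undefined")
    return val / float(normalization)


print(checked_weighted_average([(3, 1), (9, 2)]))  # 7.0
```

Because the function only iterates over pairs, the two parallel lists from the question can be passed as `zip(rate, amount)`.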
