Implementing a weighted average on the fly in Python

Question

I have a stream of incoming data and I want to compute a moving average on the fly. If all the elements in the moving average have the same weight, it is fairly easy to implement using a queue, but I want the most recent elements to have higher weights, and the weights should decrease linearly (not exponentially).

For example, if the moving average is of length 5, the current value should have weight '1', the previous one should have weight '0.8' and so on until the fifth element in the queue which should have weight '0.2'; so the weight vector is: [0.2, 0.4, 0.6, 0.8, 1.0].

I was wondering if anybody knows how to implement this in Python. If there is a faster way to do this, please recommend it; efficiency is important for my use case.

Just to be sure, do you want an estimate of the average as soon as an element is available, updated each time a new element arrives? — Uncle Ben Ben, Jun 27 '18 at 17:35
Yes. I wanted to output the updated moving average as soon as it arrives. — user3658421, Jun 29 '18 at 03:47

score 0 · Accepted Answer · answered Jul 03 '18 at 13:18

If you want to keep a weighting vector as described (linearly decreasing weights), you will need to keep the past values that the weights apply to (the last N values for a window of length N). I quickly tried to sketch a recurrence that would avoid keeping the past scalars in memory, without success. This is where exponential weighting has a powerful advantage:

average_(t) = (1 - a)*x_(t) + a*average_(t-1),  with 0 < a < 1

You only need to keep two variables in memory: the previous average and the decay factor a.
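As a minimal sketch of that recurrence, assuming the normalized form average_(t) = (1 - a)*x_(t) + a*average_(t-1) with 0 < a < 1: the class name ExponentialAverage and the choice to seed the average with the first value are illustrative assumptions, not something from the question.

```python
class ExponentialAverage:
    """Exponentially weighted running average using O(1) memory."""

    def __init__(self, a):
        # a is the decay factor; larger a gives more weight to the past
        if not 0.0 < a < 1.0:
            raise ValueError("a must be strictly between 0 and 1")
        self.a = a
        self.average = None  # no data seen yet

    def update(self, x):
        if self.average is None:
            # Assumption: seed the average with the first observed value
            self.average = float(x)
        else:
            self.average = (1.0 - self.a) * x + self.a * self.average
        return self.average


ema = ExponentialAverage(a=0.5)
for value in (20, 40, 60):
    print(ema.update(value))
# prints 20.0, 30.0, 45.0
```

Only self.average and self.a survive between updates, which is the memory advantage over a linearly weighted window.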

Anyway, if memory is not a constraint, your problem boils down to a dot product between the stored values and the weight vector. Therefore, I would suggest using the numpy library [1] [2]. See the example below (you may find a more efficient one):

import numpy as np

stream = np.array([20, 40])
n = len(stream)

latest_scalar = 60
stream = np.append(stream, latest_scalar)

# n is the length of the stream; tracking it by hand avoids calling len(),
# but it must be kept in sync with the array (a possible source of bugs)
n += 1
weights = np.arange(1, n + 1)
# [1, 2, 3]: the most recent value gets the largest weight
average = np.dot(stream, weights) / (n * (n + 1) / 2)
# n * (n + 1) / 2 is the sum of the weights
print(average)
# output: 46.666...
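The example above weights the whole stream, while the question asks for a fixed window. A possible sketch for a window of length N follows, using only the standard library. It relies on the fact that weights 1, 2, ..., N divided by N(N+1)/2 are the same normalized weights as [0.2, 0.4, 0.6, 0.8, 1.0] for N = 5. The class name LinearWeightedAverage and its attributes are hypothetical. The last N values are still stored, but each update costs O(1): when the window is full, every stored value loses one weight step, so the weighted sum changes by N*x minus the old plain sum.

```python
from collections import deque


class LinearWeightedAverage:
    """Moving average over the last `size` values with weights
    1, 2, ..., size (oldest to newest), updated in O(1) per value."""

    def __init__(self, size):
        if size < 1:
            raise ValueError("size must be at least 1")
        self.size = size
        self.window = deque()  # values currently inside the window
        self.total = 0.0       # plain sum of the window
        self.numerator = 0.0   # weighted sum of the window

    def update(self, x):
        if len(self.window) == self.size:
            # Full window: each stored value drops one weight step
            # (subtract the old plain sum) and x enters with weight `size`.
            self.numerator += self.size * x - self.total
            self.total += x - self.window.popleft()
        else:
            # Window still filling: x takes the next available weight.
            self.numerator += (len(self.window) + 1) * x
            self.total += x
        self.window.append(x)
        k = len(self.window)
        return self.numerator / (k * (k + 1) / 2)


lwa = LinearWeightedAverage(size=3)
for value in (20, 40, 60, 80):
    print(round(lwa.update(value), 3))
# prints 20.0, 33.333, 46.667, 66.667
```

Because the sums are updated incrementally, floating-point rounding can accumulate over a very long stream; recomputing both sums from self.window every so often is one way to limit that.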
