Calculating a mean with more recent observations with greater importance

Question

I am building an algorithm to predict the outcomes of sporting events using the performance of previous games. For example, I might have two lists that look like this:

# list of game numbers
game_number = [1, 2, 3, 4, 5, 6, 7] 

# list of points scored   
points_scored = [100, 106, 99, 106, 89, 94, 113]

I can easily calculate a mean using:

import numpy as np

# calculate mean
mean_points_scored = np.mean(points_scored)

However, I want the more recent games to be weighted more heavily in the calculation of the mean. Does anyone have experience doing this?

[numpy weighted average](https://docs.scipy.org/doc/numpy-1.14.2/reference/generated/numpy.ma.average.html) — Kurtis Streutker, Aug 21 '19 at 18:32

score 4 · Accepted Answer · answered Aug 21 '19 at 18:24

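You can compute a weighted average with np.average. Passing the game numbers as the weights makes each later game count more than the one before it: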

mean_points_scored = np.average(points_scored, weights=game_number)
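
To make the np.average call above concrete, here is a minimal sketch that computes the same weighted mean by hand as sum(w_i * x_i) / sum(w_i), using the lists from the question. The names manual_mean, numpy_mean and plain_mean are introduced only for this illustration.

```python
import numpy as np

game_number = [1, 2, 3, 4, 5, 6, 7]
points_scored = [100, 106, 99, 106, 89, 94, 113]

# Weighted mean by hand: sum(w_i * x_i) / sum(w_i)
weights = np.asarray(game_number, dtype=float)
points = np.asarray(points_scored, dtype=float)
manual_mean = (weights * points).sum() / weights.sum()

# The same quantity computed by NumPy
numpy_mean = np.average(points_scored, weights=game_number)

# Unweighted mean, for comparison
plain_mean = np.mean(points_scored)

print(plain_mean)   # 101.0
print(manual_mean)  # 2833 / 28, roughly 101.18
print(numpy_mean)   # matches manual_mean
```

With these weights the most recent game (113 points) counts seven times as much as the first one, which pulls the result slightly above the plain mean.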

wilkben

657
3
12

score 3 · Answer 2 · answered Aug 21 '19 at 18:30

I think the weights have to be defined in a separate array:

weights_define = [1, 1, 1, 1, 1, 2, 3]
mean_points_scored = np.average(points_scored, weights=weights_define)

In my view, the way wilkben defined the weights is not accurate: it exaggerates the recent games too much and has no mathematical justification.

You can check an Excel explanation of how the math really works (code is basically math, don't forget!): Excel Debunk
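
One way to compare the two weighting schemes in this thread is to normalise each set of weights so it sums to 1 and look at the share of the total that each game receives. The sketch below does that for the question's data. The dictionary names and labels are made up for this illustration, and which scheme is preferable remains a modelling choice.

```python
import numpy as np

points_scored = [100, 106, 99, 106, 89, 94, 113]

# Labels are illustrative only
schemes = {
    "linear (game_number)": [1, 2, 3, 4, 5, 6, 7],
    "hand-picked": [1, 1, 1, 1, 1, 2, 3],
}

results = {}
for name, weights in schemes.items():
    w = np.asarray(weights, dtype=float)
    shares = w / w.sum()  # fraction of the total weight given to each game
    results[name] = np.average(points_scored, weights=w)
    print(name, np.round(shares, 3), round(results[name], 3))
```

For these seven games the linear weights give the last game 7/28 = 0.25 of the total weight, while the hand-picked weights give it 3/10 = 0.3, so the two weighted means come out at roughly 101.18 and 102.7 respectively.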

RAGHHURAAMM · Answer 3 · 2019-08-22T05:04:05.450

The weights can be defined according to whatever criteria you choose, as below. The factors multiplying x can be changed, and the number of parts in the weights list can vary, depending on the requirement. This example assumes three parts a, b, c and 15 data points. The factors of x are larger toward the end of the weights list, because more recent games should be given heavier weights.

import numpy as np

# weights grow within each part, and the factor increases from part to part
a = [3 * x for x in range(1, 6)]    # games 1-5
b = [4 * x for x in range(6, 11)]   # games 6-10
c = [7 * x for x in range(11, 16)]  # games 11-15

weights_define = a + b + c

game_number = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
points_scored = [100, 106, 99, 106, 89, 94, 113, 112, 109, 111, 97, 95, 102, 107, 103]

mean_points_scored = np.average(points_scored, weights=weights_define)
print(mean_points_scored)

Output:

102.77878787878788
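
Since the factors and the number of parts may vary, the construction of a, b and c above could be wrapped in a small function. This is only a sketch: piecewise_weights is a hypothetical helper (not part of NumPy), and it assumes the game numbers are split into consecutive, nearly equal parts using np.array_split.

```python
import numpy as np

def piecewise_weights(n_games, factors):
    """Hypothetical helper: split game numbers 1..n_games into len(factors)
    consecutive parts and multiply each part by its factor."""
    parts = np.array_split(np.arange(1, n_games + 1), len(factors))
    return np.concatenate([f * part for f, part in zip(factors, parts)])

points_scored = [100, 106, 99, 106, 89, 94, 113, 112, 109, 111, 97, 95, 102, 107, 103]

# Factors 3, 4, 7 over 15 games reproduce the a + b + c weights from this answer
weights = piecewise_weights(len(points_scored), factors=(3, 4, 7))
weighted_mean = np.average(points_scored, weights=weights)

print(weights)
print(weighted_mean)
```

Changing factors, or passing a different number of them, changes how sharply the recent games are emphasised without rewriting the list comprehensions.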
