3

I am building an algorithm to predict the outcomes of sporting events using the performance of previous games. For example, I might have two lists that look like this:

# list of game numbers
game_number = [1, 2, 3, 4, 5, 6, 7] 

# list of points scored   
points_scored = [100, 106, 99, 106, 89, 94, 113]

I can easily calculate a mean using:

# calculate mean
mean_points_scored = np.mean(points_scored)

However, I want the more recent games to be weighted more heavily in the calculation of the mean. Does anyone have experience doing this?

martineau
  • 119,623
  • 25
  • 170
  • 301
Aaron England
  • 1,223
  • 1
  • 14
  • 26

3 Answers3

4

You can do weighted averages with np.average

mean_points_scored = np.average(points_scored, weights=game_number)
wilkben
  • 657
  • 3
  • 12
3

I think the weigh has to be define in a different array :

weights_define = [1, 1, 1, 1, 1, 2, 3]
mean_points_scored = np.average(points_scored, weights=weights_define)  

Because the way that wilkben defined it is not accurate and too much exagerate and not mathematic at all !

You can check an Excel explaination which explained how maths really works (codes is basically math don't forget !) --> Excel Debunk

veksor
  • 302
  • 1
  • 7
1

Definition of weights may be based on certain defined criteria as below. The factors of x may be changed or number of parts of weights list may vary depending on requirement. Assuming three parts a,b,c and 15 data points,( the factors of x are assumed to be larger towards end part of weights list, as it is given for recent games more heavy weights be given)

a = [(3*x) for x in range(1,6)]
b = [(4*x) for x in range(6,11)]
c = [(7*x) for x in range(11,16)]

weights_define = a+b+c

game_number = [1, 2, 3, 4, 5, 6, 7,8,9,10,11,12,13,14,15]
points_scored = [100, 106, 99, 106, 89, 94, 113, 112,109,111,97,95,102,107,103]

 mean_points_scored = np.average(points_scored, weights=weights_define)  
 print(mean_points_scored)

Output:

102.77878787878788
RAGHHURAAMM
  • 1,099
  • 7
  • 15