I used the Solver function in Excel to minimize a set of variables, and it worked. Here is how I set it up:
- First, we have a table of matches with the columns "Away Team", "Away Pts", "Home Team", "Home Pts", "Game Total", "Home MoV":
- Then, we have the variables that need to be minimized for each team:
- Here is the setup of the Solver function that I want to reproduce in Python:
In Python, I imported the table with the "Away Team", "Away Pts", "Home Team", "Home Pts", "Game Total" and "Home MoV" columns, then I created two more columns for the variables to be minimized. The first added column holds the Home Team Strength variables (ht1, ht2, ..., ht18), equivalent to column AB in Excel; the second holds the Away Team Strength variables (at1, at2, ..., at18), equivalent to column AC in Excel.
They are used in the formulas that calculate the Parameter Estimate for the home and away teams, the same way as in Excel:
- PEhome = Hadj + ht[i] - at[i];
- PEaway = Aadj + ht[i] - at[i]
Where Hadj is "Home Adjustment" and Aadj is "Away Adjustment".
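To make the two formulas concrete, here is a minimal NumPy sketch. It assumes all unknowns are packed into one flat vector laid out as [ht..., at..., Hadj, Aadj], and that each game is described by the 0-based index of its home and away team. The function name `parameter_estimates`, the three-team schedule, and the numbers are made up for illustration.

```python
import numpy as np

# Hypothetical layout of the parameter vector for 3 teams:
# [ht1, ht2, ht3, at1, at2, at3, Hadj, Aadj]
n_teams = 3
params = np.array([0.5, 0.2, -0.1, 0.3, 0.0, -0.4, 0.1, -0.1])

# Made-up schedule: 0-based team index of the home and away side in each game
home_idx = np.array([0, 1, 2])
away_idx = np.array([1, 2, 0])

def parameter_estimates(params, home_idx, away_idx, n_teams):
    """Return (PEhome, PEaway) for every game at once."""
    ht = params[:n_teams]                 # home strengths
    at = params[n_teams:2 * n_teams]      # away strengths
    hadj, aadj = params[-2], params[-1]   # home and away adjustments
    # Fancy indexing picks the right strength for each row
    diff = ht[home_idx] - at[away_idx]
    return hadj + diff, aadj + diff

pe_home, pe_away = parameter_estimates(params, home_idx, away_idx, n_teams)
print(pe_home)  # PEhome per game
print(pe_away)  # PEaway per game
```

Indexing the vector with integer arrays avoids replacing labels in the data frame on every evaluation of the objective.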
- How can I solve for Hadj, Aadj and all ht[i] and at[i] with an OLS minimization?
My biggest challenge is knowing how to set this up in Python, whether I really need to add the columns with the variables, and how to organize everything so that I can use scipy.optimize.minimize.
Note: I have solved for variables before using "np.linalg.lstsq", which is much easier and simpler, but this time the variables change from row to row of a pandas data frame. My question is how to set all this up to get these values solved. I tried to apply scipy.optimize.minimize to this task, without success, and I don't think the code is organized the right way. Code below.
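On the question of whether the extra label columns are needed: one possible alternative, sketched here under assumptions, is to turn the team names into integer codes once and use those codes to index a parameter vector. The team names and the four-row table below are invented; only the column names "Home Team" and "Away Team" come from the table described above.

```python
import pandas as pd

# Made-up games table; only the two team-name columns matter here
games = pd.DataFrame({
    "Home Team": ["Lions", "Bears", "Hawks", "Lions"],
    "Away Team": ["Bears", "Hawks", "Lions", "Hawks"],
})

# One shared, sorted list of teams so both columns use the same numbering
teams = sorted(set(games["Home Team"]) | set(games["Away Team"]))

# Integer code of each team per row (0-based position in `teams`)
home_idx = pd.Categorical(games["Home Team"], categories=teams).codes
away_idx = pd.Categorical(games["Away Team"], categories=teams).codes

print(teams)
print(home_idx, away_idx)
```

With this numbering, the strength of team `teams[k]` would live at position `k` of the home-strength block and position `k` of the away-strength block.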
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# Variables to be minimized:
# take the unique labels in columns 'HT' and 'AT', sort them by team number,
# and append the two extra variables 'Hadj' and 'Aadj' (home and away adjustments).
var = np.concatenate([games['HT'].unique(), games['AT'].unique()])
var = sorted(var, key=lambda x: int("".join(c for c in x if c.isdigit())))
var = np.append(var, ['Hadj', 'Aadj'])
# Initial values for the variables to be minimized
initial = np.repeat(10., var.size)
# Sum of squared errors of the estimated home and away points
def func(coeffs, var, games):
    # Map each variable name (ht1, ..., at18, Hadj, Aadj) to its current value
    lookup = dict(zip(var, coeffs))
    ht = games['HT'].map(lookup)
    at = games['AT'].map(lookup)
    # Parameter estimates:
    #   home = Home Adjustment + home team - away team
    #   away = Away Adjustment + home team - away team
    # Hadj and Aadj are scalars taken from coeffs, not columns of games
    peH = lookup['Hadj'] + ht - at
    peA = lookup['Aadj'] + ht - at
    # EXP function; np.exp works element-wise on a Series, math.exp does not
    expFH = np.exp(peH) / (1 + np.exp(peH))
    expFA = np.exp(peA) / (1 + np.exp(peA))
    # Z score
    homeZs = norm.cdf(expFH)
    awayZs = norm.cdf(expFA)
    # Estimated points (avgHP, stdevHP, avgAP, stdevAP must be defined beforehand)
    estHP = avgHP + (homeZs * stdevHP)
    estAP = avgAP + (awayZs * stdevAP)
    # Squared errors for each side, summed over all games
    homeEsq = ((estHP - games['Home Pts']) ** 2).sum()
    awayEsq = ((estAP - games['Away Pts']) ** 2).sum()
    return homeEsq + awayEsq
res = minimize(func, x0=initial, args=(var, games))
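For reference, here is a self-contained sketch of how the whole setup could be run end to end with `scipy.optimize.minimize`. It is not the Excel model itself: the schedule and scores are random synthetic data, the team count is reduced to 4, and the names `total_sq_error`, `home_idx` and `away_idx` are assumptions. The objective mirrors the formulas in the code above (logistic function, then `norm.cdf`, then scaling by the mean and standard deviation of the points).

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit   # expit(x) = exp(x) / (1 + exp(x))
from scipy.stats import norm

rng = np.random.default_rng(0)
n_teams, n_games = 4, 40

# Synthetic schedule and scores (made up for illustration only)
home_idx = rng.integers(0, n_teams, n_games)
# Shift by 1..n_teams-1 so a team never plays itself
away_idx = (home_idx + rng.integers(1, n_teams, n_games)) % n_teams
home_pts = rng.normal(100, 10, n_games)
away_pts = rng.normal(97, 10, n_games)

avg_hp, std_hp = home_pts.mean(), home_pts.std()
avg_ap, std_ap = away_pts.mean(), away_pts.std()

def total_sq_error(params):
    """Sum of squared errors; params = [ht..., at..., Hadj, Aadj]."""
    ht = params[:n_teams]
    at = params[n_teams:2 * n_teams]
    hadj, aadj = params[-2], params[-1]
    diff = ht[home_idx] - at[away_idx]
    est_hp = avg_hp + norm.cdf(expit(hadj + diff)) * std_hp
    est_ap = avg_ap + norm.cdf(expit(aadj + diff)) * std_ap
    return np.sum((est_hp - home_pts) ** 2) + np.sum((est_ap - away_pts) ** 2)

x0 = np.zeros(2 * n_teams + 2)    # one start value per unknown
res = minimize(total_sq_error, x0, method="BFGS")

print(res.fun)                     # minimized sum of squared errors
print(res.x[-2], res.x[-1])        # fitted Hadj and Aadj
```

Because only the difference between a home strength and an away strength enters the formulas, adding the same constant to every ht and at value leaves the error unchanged. The individual fitted values may therefore differ from what Solver reports even when the minimized error is the same.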