Minimizing Least Squares Subject To Constraint

Question

I am trying to rate NFL teams by minimizing the sum of squared errors subject to a constraint. My data looks like:

import numpy as np
import pandas as pd

dat = {"Home_Team": ["KC Chiefs", "LA Chargers", "Baltimore Ravens"],
       "Away_Team": ["Houston Texans", "Miami Dolphins", "KC Chiefs"],
       "Home_Score": [34, 20, 20],
       "Away_Score": [20, 23, 34],
       "Margin": [14, -3, -14]
      }
df = pd.DataFrame(dat)
df

    Home_Team        Away_Team      Home_Score  Away_Score  Margin
0   KC Chiefs        Houston Texans 34          20          14
1   LA Chargers      Miami Dolphins 20          23          -3
2   Baltimore Ravens KC Chiefs      20          34          -14

The Margin is Margin = Home_Score - Away_Score. My goal is to come up with a numerical rating for each team such that the average of all the teams' ratings is zero. Hence, if the Chiefs have a rating of 3.0, then they are 3 points better than the average team.

Given these ratings, we generate predictions in this way: the home team's predicted margin of victory is Home_Edge + Home_Rating - Away_Rating, where Home_Edge is the home field advantage (a constant for all home teams), Home_Rating is the home team's rating, and Away_Rating the away team's rating.
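As a small illustration of that prediction rule, the sketch below computes the predicted margin for each game from one home edge and one rating per team. The numbers in `home_edge` and `ratings` are made up for the example (they are not fitted values), and the name `predicted` is only illustrative.

```python
import pandas as pd

games = pd.DataFrame({
    "Home_Team": ["KC Chiefs", "LA Chargers", "Baltimore Ravens"],
    "Away_Team": ["Houston Texans", "Miami Dolphins", "KC Chiefs"],
})

# Hypothetical values, chosen only so that the ratings average to zero
home_edge = 2.0
ratings = {
    "KC Chiefs": 3.0,
    "LA Chargers": 0.5,
    "Baltimore Ravens": 1.0,
    "Houston Texans": -2.5,
    "Miami Dolphins": -2.0,
}

# Predicted home margin = Home_Edge + Home_Rating - Away_Rating, one value per game
predicted = (
    home_edge
    + games["Home_Team"].map(ratings)
    - games["Away_Team"].map(ratings)
)
print(predicted.tolist())  # [7.5, 4.5, 0.0]
```

Note that each team has a single rating, which is looked up whether the team plays at home or away.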

The error in a prediction is prediction - Margin, and I want to minimize the sum of squares of these errors. I am trying to do this using scipy.optimize in the following way:

# Our objective function, where x is our array of parameters, 
# x[0] is the home edge, x[1] the home rating, and x[2] the away rating
# Y is the true, observed margin
def obj_fun(x, Y):
    y = x[0] + x[1] - x[2]
    return np.sum((y - Y)**2)

# Define the constraint function. We have that the ratings average to 0
def con(x):
    return np.mean(x[1])

# Constraint dictionary
cons = {'type': 'eq', 'fun': con}

# Minimize sum of squared errors
from scipy import optimize

# Initial guesses (numbers I randomly thought of in my head)
home_edge = 0.892
home_ratings = np.array([1.46, 9.67, -0.82])
away_ratings = np.array([-3.10, -6.57, 1.46])
x_init = [np.repeat(home_edge, 3), home_ratings, away_ratings]

# Minimize
results = optimize.minimize(fun = obj_fun, args = (df["Margin"]), 
x0 = x_init, constraints = cons)

print(results.x)
[-2.9413615   0.          4.72534244  1.46        9.67       -0.82
 -3.1        -6.57        1.46      ]

I want my output to have 6 values, not 9: one for the home edge and one for each of the five teams. What's going wrong? Thank you!
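One thing that can be checked without running the optimizer is the shape of the initial guess. The sketch below rebuilds `x_init` from the code above as an explicit NumPy array (`x_flat` is a name introduced here for illustration). It holds nine numbers, which matches the nine entries in the printed result. It also seems consistent with only the first three entries having moved: `obj_fun` and `con` only ever read `x[0]`, `x[1]` and `x[2]`.

```python
import numpy as np

home_edge = 0.892
home_ratings = np.array([1.46, 9.67, -0.82])
away_ratings = np.array([-3.10, -6.57, 1.46])

# Same construction as x_init in the question, made explicit as an array
x_init = np.array([np.repeat(home_edge, 3), home_ratings, away_ratings])
print(x_init.shape)  # (3, 3)

# Flattened, this is a vector of nine numbers, not six
x_flat = x_init.ravel()
print(x_flat.size)  # 9

# In the flat vector, x[0], x[1] and x[2] are three copies of the home edge,
# not a home edge, a home rating and an away rating
print(x_flat[:3])
```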

The text, the code and the expectations do not fit together imho. The dimensions are completely off with regard to the task, I would say. Your objective uses just three variables. Even if you wanted to optimize over 33 vars, your code would never do that. *There is a very basic ingredient missing: generating the pairwise prediction-errors.* (at least assuming you didn't already duplicate/expand the pd-df; hard to read from the question) `x_init`, which must be `1-d` (or else scipy produces garbage with high probability), is probably problematic too, but I can't infer that without executing it. — sascha, Jan 08 '21 at 18:04
I see. So I need to be generating pairwise prediction errors in the objective function, or where? How would you go about solving some of these problems, and would it be more helpful if I created an example that you could execute? — Jake, Jan 08 '21 at 18:08
Depends on what you actually want to optimize. I guess you don't want to assign some value to `T_1` without evaluating the effects on all other teams `T_x` with regard to the observations. Most importantly, you should improve the description. I don't get it, and then it's hard to reason about it. I already got the feeling you are showing columns which aren't used at all (rating vs. scores?). So yeah: there might be theory as well as implementation problems (the latter, here on SO, nearly always boil down to not assuming a 1-d vector of variables, which is a must!). — sascha, Jan 08 '21 at 18:11
The procedure I'm following is described on pages 285-288 of this pdf (not the literal page numbers in the book!): http://stavochka.com/files/Mathletics.pdf The author uses Excel, I am essentially trying to do the same in Python. — Jake, Jan 08 '21 at 18:29
@sascha I have created a reproducible example and edited the text. I hope things are clearer for you and any other reader now. — Jake, Jan 08 '21 at 20:26
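Putting the comments above together, one possible setup is sketched below, under a few assumptions. The parameter vector is 1-d with six entries (home edge first, then one rating per team). Each game's error is built by looking up the home and away team's rating by index, and the constraint is the mean of the ratings only. The helper names (`teams`, `index`, `sum_sq_errors`, `mean_rating`), the all-zeros starting point and the choice of `method="SLSQP"` are illustrative choices, not something taken from the book or from the question.

```python
import numpy as np
import pandas as pd
from scipy import optimize

df = pd.DataFrame({
    "Home_Team": ["KC Chiefs", "LA Chargers", "Baltimore Ravens"],
    "Away_Team": ["Houston Texans", "Miami Dolphins", "KC Chiefs"],
    "Home_Score": [34, 20, 20],
    "Away_Score": [20, 23, 34],
})

# Observed margin, as defined in the question: Home_Score - Away_Score
margin = (df["Home_Score"] - df["Away_Score"]).to_numpy(dtype=float)

# One rating slot per distinct team, whether it appears at home or away
teams = sorted(set(df["Home_Team"]) | set(df["Away_Team"]))
index = {team: i for i, team in enumerate(teams)}
home_idx = df["Home_Team"].map(index).to_numpy()
away_idx = df["Away_Team"].map(index).to_numpy()

def sum_sq_errors(x):
    # x[0] is the home edge, x[1:] holds one rating per team
    home_edge, ratings = x[0], x[1:]
    predicted = home_edge + ratings[home_idx] - ratings[away_idx]
    return np.sum((predicted - margin) ** 2)

def mean_rating(x):
    # Equality constraint: the team ratings (not the home edge) average to zero
    return np.mean(x[1:])

x0 = np.zeros(1 + len(teams))  # 1-d initial guess with six entries
result = optimize.minimize(
    sum_sq_errors,
    x0,
    method="SLSQP",
    constraints={"type": "eq", "fun": mean_rating},
)

home_edge_hat = result.x[0]
ratings_hat = pd.Series(result.x[1:], index=teams)
print(home_edge_hat)
print(ratings_hat)
```

With only three games and six parameters the fit is not unique, so the specific numbers returned here mean little. The point of the sketch is the shape of the problem: six parameters, one error per game, one constraint over the five ratings.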
