I am given data consisting of X and Y points (x_1, ..., x_n; y_1, ..., y_n).
I want to fit Y to X using two basis functions: max(x, mu_1) and min(x, mu_2).
In other words, I want to estimate the following equation:
y_i = a_1*max(x_i, mu_1) + a_2*min(x_i, mu_2)
I want to find the mu_1 and mu_2 for which this fit is the best possible, that is, for which the sum of squared residuals is minimized when I fit Y to X.
Put another way, I need a_1, a_2, mu_1, and mu_2 such that the sum of squared residuals for the fit above is minimized.
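To make the objective concrete, here is a minimal sketch of how the sum of squared residuals could be computed for fixed mu_1 and mu_2. It assumes the model has no intercept, exactly as the equation is written (note that sklearn's LinearRegression fits an intercept by default). The helper name ssr_for_knots and the demo data are hypothetical, introduced only for illustration.

```python
import numpy as np

def ssr_for_knots(x, y, mu_1, mu_2):
    """Return (SSR, [a_1, a_2]) of the least-squares fit for fixed mu_1, mu_2."""
    # Design matrix with the two basis functions as columns.
    # No intercept column, matching the equation as written (an assumption).
    design = np.column_stack((np.maximum(x, mu_1), np.minimum(x, mu_2)))
    # For fixed mu_1 and mu_2 the model is linear in a_1 and a_2,
    # so ordinary least squares gives the best coefficients directly.
    coef, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    return float(residuals @ residuals), coef

# Hypothetical noise-free demo data generated from known parameters.
x_demo = np.linspace(0.0, 10.0, 50)
y_demo = 2.0 * np.maximum(x_demo, 3.0) + 0.5 * np.minimum(x_demo, 7.0)

ssr_true, coef_true = ssr_for_knots(x_demo, y_demo, 3.0, 7.0)
ssr_off, _ = ssr_for_knots(x_demo, y_demo, 5.0, 5.0)
print(ssr_true, coef_true, ssr_off)
```

Since a_1 and a_2 are solved for inside the helper, only mu_1 and mu_2 remain as free parameters for an outer search.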
I tried the following: I wrote a function of two arguments (mu_1 and mu_2) that returns the quality of the fit of Y to X, and then tried to minimize that function with scipy.optimize.minimize. Here is the code:
import numpy as np
from scipy.optimize import minimize
from sklearn.linear_model import LinearRegression

# Create X and Y
X = np.random.normal(10, 1, size=10000)
Y = np.random.normal(20, 1, size=10000)

# Function that estimates the quality of the fit
def func(mu_1, mu_2):
    # Basis functions
    regressor_1 = np.maximum(X, mu_1).reshape(-1, 1)
    regressor_2 = np.minimum(X, mu_2).reshape(-1, 1)
    x_train = np.hstack((regressor_1, regressor_2))
    model = LinearRegression().fit(x_train, Y)
    # I didn't find how to extract the sum of squared residuals, but I can
    # get R squared, so I thought that minimizing SSR is the same as
    # maximizing R squared, which is the same as minimizing -R^2.
    objective = model.score(x_train, Y)
    return -1 * objective

# Now I want to find the mu_1 and mu_2 that minimize func
minimum = minimize(func, 0, 0)
minimum.x
It doesn't work. I would really appreciate any help.
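One possible direction, offered as a minimal sketch rather than a definitive fix: scipy.optimize.minimize expects a function of a single array argument plus an initial guess x0, and its third positional parameter is args, so minimize(func, 0, 0) passes 0 as mu_2 instead of as a second starting value. The sketch below packs mu_1 and mu_2 into one array. Several things here are assumptions: the synthetic data with known parameters (the X and Y above are independent noise, so no choice of mu_1 and mu_2 can fit them well), the omission of an intercept, the quantile-based starting point, and the choice of the derivative-free Nelder-Mead method, picked because the objective is not smooth in mu_1 and mu_2.

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic data with known parameters (an assumption, for illustration only).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=2000)
y = (2.0 * np.maximum(x, 3.0) + 0.5 * np.minimum(x, 7.0)
     + rng.normal(0.0, 0.1, size=x.size))

def objective(mu):
    """SSR of the best linear fit for mu = [mu_1, mu_2] (no intercept)."""
    mu_1, mu_2 = mu
    design = np.column_stack((np.maximum(x, mu_1), np.minimum(x, mu_2)))
    # a_1 and a_2 are solved by least squares for the given mu_1, mu_2.
    coef, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    return float(residuals @ residuals)

# Start inside the range of the data; a start far outside it makes the
# objective locally flat, because max/min no longer depend on mu there.
start = np.array([np.quantile(x, 0.25), np.quantile(x, 0.75)])
result = minimize(objective, x0=start, method="Nelder-Mead")
print(result.x, result.fun)
```

The search is local, so the result can depend on the starting point; trying several starts, or a coarse grid over mu_1 and mu_2 first, may be worth considering.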