I'm trying to figure out how to predict values with LASSO regression without using the .predict function that Sklearn provides. This is basically just to broaden my understanding of how LASSO works internally. I asked a question on Cross Validated about how LASSO regression works, and one of the comments mentioned that the predict function works the same way as in Linear Regression. Because of this, I wanted to try writing my own function to do this.
I was able to successfully recreate the predict function in simpler examples, but when I try to use it in conjunction with RobustScaler, I keep getting different outputs. With this example, I'm getting the prediction as 4.33 with Sklearn, and 6.18 with my own function. What am I missing here? Am I not inverse transforming the prediction correctly at the end?
import pandas as pd
from sklearn.preprocessing import RobustScaler
from sklearn.linear_model import Lasso
import numpy as np
df = pd.DataFrame({'Y':[5, -10, 10, .5, 2.5, 15], 'X1':[1., -2., 2., .1, .5, 3], 'X2':[1, 1, 2, 1, 1, 1],
'X3':[6, 6, 6, 5, 6, 4], 'X4':[6, 5, 4, 3, 2, 1]})
X = df[['X1','X2','X3','X4']]
y = df[['Y']]
#Scaling
transformer_x = RobustScaler().fit(X)
transformer_y = RobustScaler().fit(y)
X_scal = transformer_x.transform(X)
y_scal = transformer_y.transform(y)
#LASSO
lasso = Lasso()
lasso = lasso.fit(X_scal, y_scal)
#LASSO info
print('Score: ', lasso.score(X_scal,y_scal))
print('Raw Intercept: ', lasso.intercept_.round(2)[0])
intercept = transformer_y.inverse_transform([lasso.intercept_])[0][0]
print('Unscaled Intercept: ', intercept)
print('\nCoefficients Used: ')
coeff_array = lasso.coef_
inverse_coeff_array = transformer_x.inverse_transform(lasso.coef_.reshape(1,-1))[0]
for i,j,k in zip(X.columns, coeff_array, inverse_coeff_array):
    if j != 0:
        print(i, j.round(2), k.round(2))
#Predictions
example = [[3,1,1,1]]
pred = lasso.predict(example)
pred_scal = transformer_y.inverse_transform(pred.reshape(-1, 1))
print('\nRaw Prediction where X1 = 3: ', pred[0])
print('Unscaled Prediction where X1 = 3: ', pred_scal[0][0])
#Predictions without using the .predict function
def lasso_predict_value_(X1,X2,X3,X4):
    print('intercept: ', intercept)
    print('coef: ', inverse_coeff_array[0])
    print('X1: ', X1)
    preds = intercept + inverse_coeff_array[0]*X1
    print('Your predicted value is: ', preds)

lasso_predict_value_(3,1,1,1)
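For comparison, here is a minimal sketch of one way a manual prediction could be lined up with .predict, assuming the model is fitted on scaled data as above. Two things seem worth checking in the code above: the example row is passed to .predict without first going through transformer_x, and inverse_transform (which multiplies by the scale and adds the center) is meant for data points, so applying it to coefficients or to the intercept on its own may not give a meaningful "unscaled" model. The sketch keeps everything in scaled space and only inverse transforms the final prediction. The names scaler_x, scaler_y and manual_predict are made up for this illustration, and y is flattened to 1-D here so that intercept_ is a plain scalar.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import RobustScaler

# Same data as in the question, written as plain arrays (columns X1..X4).
X = np.array([[1., 1, 6, 6], [-2., 1, 6, 5], [2., 2, 6, 4],
              [.1, 1, 5, 3], [.5, 1, 6, 2], [3., 1, 4, 1]])
y = np.array([5, -10, 10, .5, 2.5, 15]).reshape(-1, 1)

# Fit the scalers, then fit LASSO on the scaled data.
scaler_x = RobustScaler().fit(X)
scaler_y = RobustScaler().fit(y)
lasso = Lasso().fit(scaler_x.transform(X), scaler_y.transform(y).ravel())

def manual_predict(rows):
    """Hypothetical helper: linear prediction done by hand in scaled space."""
    # 1. Scale the new rows with the scaler that was fitted on X.
    rows_scaled = scaler_x.transform(np.asarray(rows, dtype=float))
    # 2. Apply the fitted linear model: intercept + sum(coef_j * x_j).
    pred_scaled = lasso.intercept_ + rows_scaled @ lasso.coef_
    # 3. Only the prediction is mapped back to the original units of Y.
    return scaler_y.inverse_transform(pred_scaled.reshape(-1, 1)).ravel()

example = [[3, 1, 1, 1]]
manual_pred = manual_predict(example)
sklearn_pred = scaler_y.inverse_transform(
    lasso.predict(scaler_x.transform(example)).reshape(-1, 1)).ravel()

print('Manual prediction: ', manual_pred[0])
print('Sklearn prediction:', sklearn_pred[0])
```

With this arrangement the two printed values should agree, because both paths scale the input, apply the same intercept and coefficients, and inverse transform only the result.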