
I am trying to use sklearn's Lasso in my code. To test it out, I decided to run it with alpha = 0. This should by definition yield the same result as LinearRegression, but it does not.
Here is the code:

import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.linear_model import LinearRegression

# Don't worry about this. It is made so that we can work with the same dataset.
df = pd.read_csv('http://web.stanford.edu/~oleg2/hse/Credit.csv').dropna()
df['Asian'] = df.Ethnicity=='Asian'
df['Caucasian'] = df.Ethnicity=='Caucasian'
df['African American'] = df.Ethnicity=='African American'
df = df.drop(['Ethnicity'],axis=1).replace(['Yes','No','Male','Female',True,False],[1,0,1,0,1,0])
# End of unimportant part

x = df.drop('Balance', axis=1)  # predictors: every column except the target (x was not defined in the snippet)

ft = Lasso(alpha=0).fit(x, df.Balance)
print(ft.intercept_)
ft = LinearRegression().fit(x, df.Balance)
print(ft.intercept_)

Output:

-485.3744897927978
-480.89071679937786

The coef_ values are also all different.

What am I doing wrong?

user9102437

1 Answer


Indeed, running your code reproduces the different results, but it also raises the following warning:

ft = Lasso(alpha=0).fit(X, y)
print(ft.intercept_)
ft = LinearRegression().fit(X, y)
print(ft.intercept_)

-485.3744897927984
-480.89071679937854 

UserWarning: With alpha=0, this algorithm does not converge well. You are advised to use the LinearRegression estimator

This is letting you know that with alpha=0 the penalty term vanishes and you are left with an ordinary linear regression, but Lasso's iterative solver does not converge well in that setting. That is why you see a difference in the intercept, and presumably somewhat worse metrics.
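To see this concretely, here is a minimal sketch (assuming X is the predictor matrix and y is the Balance column, as above) that captures the warning programmatically and quantifies how far apart the two solutions end up; the exact numbers will depend on the data and solver settings:

import warnings
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

# Fit Lasso with alpha=0 while recording any warnings it raises.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    lasso = Lasso(alpha=0).fit(X, y)
print([str(w.message) for w in caught])  # includes the "does not converge well" warning

# Fit ordinary least squares and measure the gap between the two solutions.
ols = LinearRegression().fit(X, y)
print(abs(lasso.intercept_ - ols.intercept_))    # difference in intercepts
print(np.max(np.abs(lasso.coef_ - ols.coef_)))   # largest difference in coefficients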

yatu
  • Can you tell us what the "coordinate descent" used in Lasso is? – Sergey Bushmanov Nov 05 '20 at 18:50
  • Not sure what the actual optimization algorithm used in lasso is, no @Sergey – yatu Nov 05 '20 at 18:53
  • They are different because LinearRegression is least-squares matrix algebra whereas Lasso is coordinate descent. This is why they advise using LinearRegression (for numerical-convergence reasons) when α=0. What is more interesting is what coordinate descent is and why it's so different – Sergey Bushmanov Nov 05 '20 at 18:56
  • @yatu, as far as I know from theory, the value should be exact, since the penalty is zero. But it doesn't matter anyway. The coefficients it gives with other alphas are also all over the place: https://ibb.co/gDRfgnr – user9102437 Nov 05 '20 at 19:03
  • This is also indicated in the [docs](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html#sklearn.linear_model.Lasso): "For numerical reasons, using `alpha = 0` with the `Lasso` object is not advised. Given this, you should use the `LinearRegression` object." – desertnaut Nov 05 '20 at 19:10
  • Hmm. Also, the algorithms have to be different, since linear regression is solved via the closed-form OLS solution, whereas Lasso has to be optimized iteratively. So at best it will be as good. However, there seem to be other reasons why it isn't advised in this case – yatu Nov 05 '20 at 19:10
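To make the distinction drawn in these comments concrete, here is a minimal sketch (again assuming x and df.Balance from the question; the explicit intercept column is added here only for illustration) that computes the closed-form least-squares solution directly and compares it with the two estimators:

import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

X = np.asarray(x, dtype=float)           # predictor matrix from the question (assumed)
y = np.asarray(df.Balance, dtype=float)

# Closed-form OLS: prepend an intercept column and solve the least-squares problem directly.
X1 = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
print("closed-form intercept:", beta[0])

# LinearRegression solves the same least-squares problem via matrix algebra,
# so its intercept should agree with the closed-form value.
print("LinearRegression:", LinearRegression().fit(X, y).intercept_)

# Lasso uses iterative coordinate descent; with alpha=0 it can stop short of the
# exact least-squares optimum, which is where the discrepancy comes from.
print("Lasso(alpha=0):", Lasso(alpha=0).fit(X, y).intercept_)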