I am aware of the standard process of finding the optimal value of alpha/lambda using cross-validation via the `GridSearchCV` class in `sklearn.model_selection`. Here's my code to find that:
```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV, RepeatedKFold

alphas = np.arange(0.0001, 0.01, 0.0005)
cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=100)
hyper_param = {'alpha': alphas}
model = Lasso()
model_cv = GridSearchCV(estimator=model,
                        param_grid=hyper_param,
                        scoring='r2',
                        cv=cv,
                        verbose=1,
                        return_train_score=True)
model_cv.fit(X_train, y_train)

# checking the best parameters
model_cv.best_params_
```
This gives me `alpha=0.01`.
Now, looking at `LassoCV`, my understanding is that this class builds the model by selecting the optimal alpha from the passed `alphas` list. Please note that I have used the same cross-validation scheme for both approaches. But here is what happens when I try `sklearn.linear_model.LassoCV` with the `RepeatedKFold` cross-validation scheme:
```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.model_selection import RepeatedKFold

alphas = np.arange(0.0001, 0.01, 0.0005)
cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=100)
# alphas must be passed by keyword; the first positional parameter is eps
ls_cv_m = LassoCV(alphas=alphas, cv=cv, n_jobs=1, verbose=True, random_state=100)
ls_cv_m.fit(X_train_reduced, y_train)
print('Alpha Value %d' % ls_cv_m.alpha_)
print('The coefficients are {}'.format(ls_cv_m.coef_))
```
I get `alpha=0` for the same data, and this alpha value is not present in the list of decimal values passed in the `alphas` argument. This has confused me about the actual implementation of `LassoCV`.
My doubts are:

- Why do I get the optimal alpha as `0` in `LassoCV` when the list passed to the argument does not have zero in it?
- What is the difference between `LassoCV` and `Lasso` then, if I have to find the most suitable alpha from `GridSearchCV` anyway?
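In case it helps, here is a minimal self-contained version of both routes side by side. The `make_regression` dataset is a stand-in assumption, since my real `X_train`/`y_train` are not shown here; everything else mirrors the code above.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LassoCV
from sklearn.model_selection import GridSearchCV, RepeatedKFold

# Synthetic stand-in for my real training data (assumption, not my actual dataset)
X, y = make_regression(n_samples=200, n_features=10, noise=10.0,
                       random_state=100)

alphas = np.arange(0.0001, 0.01, 0.0005)
cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=100)

# Route 1: GridSearchCV over a plain Lasso estimator
gs = GridSearchCV(estimator=Lasso(), param_grid={'alpha': alphas},
                  scoring='r2', cv=cv)
gs.fit(X, y)

# Route 2: LassoCV with the same candidate alphas and the same CV scheme
lcv = LassoCV(alphas=alphas, cv=cv, random_state=100)
lcv.fit(X, y)

print('GridSearchCV alpha:', gs.best_params_['alpha'])
print('LassoCV alpha:', lcv.alpha_)  # lcv.alpha_ is a float
```

On this synthetic data, both `gs.best_params_['alpha']` and `lcv.alpha_` come from the `alphas` grid, so any difference between the two routes should show up here too.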