
In An Introduction to Statistical Learning, James and colleagues state:

"In contrast, the ridge regression coefficient estimates can change substantially when multiplying a given predictor by a constant. Therefore, it is best to apply ridge regression after standardizing the predictors."

I am using the glmnet package to conduct ridge and lasso regression; however, none of the predictors that were highly significant in a backwards stepwise regression has a non-zero coefficient when I use the `glmnet()` and `cv.glmnet()` functions. I am willing to accept that the stepwise regression may have delivered spurious results (there are MANY posts warning against it), but I just wanted to make certain that the lack of even a single non-zero predictor in the lasso was due to the flaws of stepwise regression rather than some scaling error on my part.

I have read that `glmnet()` standardizes the predictors internally and then returns the coefficients on the original scale automatically, 'under the hood' as it were. Can anyone verify this?
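To make the question concrete, here is a minimal sketch of the kind of call I am making. The data below are simulated stand-ins, not my actual predictors:

```r
library(glmnet)

# Simulated stand-in data; my real predictors are on very different scales
set.seed(1)
x <- matrix(rnorm(100 * 10), nrow = 100, ncol = 10)
y <- rnorm(100)

# By default glmnet standardizes x internally (standardize = TRUE) and
# reports the coefficients back on the original scale of the predictors
cv_fit <- cv.glmnet(x, y, alpha = 1)   # alpha = 1 is the lasso
coef(cv_fit, s = "lambda.min")         # coefficients shrunk exactly to zero print as "."

# For comparison, with the internal scaling turned off
cv_raw <- cv.glmnet(x, y, alpha = 1, standardize = FALSE)
coef(cv_raw, s = "lambda.min")
```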

llewmills
  • This is a question more suited for Cross Validated. In general, when there are many correlated predictors, lm can't function properly: coefficients get inflated, and so can their significance. If the glmnet model predicts better than the lm model (in validation/cross-validation) I wouldn't worry. The part about glmnet scaling and then un-scaling the predictors is true. One can turn it off, but it is not recommended. – missuse Oct 23 '17 at 07:59
  • Thank you @missuse. Yes I did think about posting in CV but it seemed (ever so) slightly more software-related. Ok so it's not the scaling in `glmnet()` that's causing the absence of predictors in the lasso, probably just the shortcomings of stepwise regression. – llewmills Oct 23 '17 at 08:57
  • One can turn off the scaling with `standardize = FALSE` in the glmnet call; I suggest you check that as well, but I do not think it is the cause of the discrepancy in selected variables. When variables are categorical, or when they are on the same scale, `standardize = FALSE` might be a good idea. I would also check the correlation between the predictors (a sketch of that check appears after these comments). – missuse Oct 23 '17 at 09:02
  • Some are highly correlated but conceptually distinct so it's difficult to justify removing them from the model. What does one do in that situation? – llewmills Oct 23 '17 at 09:06
  • That is a tough question to answer, and the answer would probably be opinion-based. What I tend to do is build the model that performs best (in terms of prediction on unseen data); everything else should go in the commentary about the model. – missuse Oct 23 '17 at 09:12
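Following up on the suggestion in the comments to check correlations among the predictors, here is a quick sketch of that check, assuming the predictors are in a numeric matrix `x` as in the example above:

```r
# Pairwise correlations among predictors; highly correlated columns are the
# usual suspects when the lasso keeps only one of a conceptually related pair
cors <- cor(x)
cors[upper.tri(cors, diag = TRUE)] <- NA   # keep each pair only once
which(abs(cors) > 0.8, arr.ind = TRUE)     # pairs with |r| above 0.8 (threshold is arbitrary)
```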

0 Answers