
I want to perform LASSO for a Cox PH model in R for variable selection. I found this code somewhere and did my analysis with it, but elsewhere I saw the same code described as elastic net. Can someone please confirm that I am using the right code?

lasso<- cv.glmnet(xmat, ysurv, alpha = 1, family = 'cox', nfolds = 30)
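For context, here is a minimal end-to-end sketch of how `xmat` and `ysurv` would typically be built (the data are simulated and purely illustrative; it assumes the `glmnet` and `survival` packages are installed):

```r
## Hypothetical reproducible setup -- variable names match the question,
## but the data are simulated, not the asker's real data.
library(glmnet)
library(survival)

set.seed(1)
n <- 200; p <- 10
xmat <- matrix(rnorm(n * p), n, p)       # numeric predictor matrix
time   <- rexp(n)                        # simulated survival times
status <- rbinom(n, 1, 0.7)             # 1 = event, 0 = censored
ysurv  <- Surv(time, status)            # survival response for family = "cox"

## alpha = 1 requests the pure lasso penalty
lasso <- cv.glmnet(xmat, ysurv, alpha = 1, family = "cox", nfolds = 30)
coef(lasso, s = "lambda.min")           # nonzero rows = selected variables
```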
Basheer
    I'm not 100% sure what you're asking, but `alpha=1` does specify LASSO (a special case of elastic net). – Ben Bolker Feb 05 '23 at 14:24
  • @BenBolker, yes, I wanted to ask this for confirmation. Someone used alpha = 0.95; what would that mean then? – Basheer Feb 05 '23 at 15:24
  • @BenBolker, I also want to know: most researchers use 10-fold cross-validation. Is using more or fewer than 10 folds a good idea or not? – Basheer Feb 05 '23 at 15:29

1 Answer


The help page for cv.glmnet() (type ?cv.glmnet in R or go through the help system in RStudio) isn't useful here, because the alpha parameter is passed through to glmnet(); its meaning is documented in ?glmnet:

alpha: The elasticnet mixing parameter, with 0<=alpha<= 1. The penalty is defined as

(1 - alpha)/2 ||beta||_2^2 + alpha ||beta||_1.

‘alpha=1’ is the lasso penalty, and ‘alpha=0’ the ridge penalty.

So alpha=1 is lasso (as described there), and alpha=0.95 is a mixture that is mostly lasso (L1) with a little bit of ridge (L2) mixed in.
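As a sketch (again with simulated data, and assuming the `glmnet` and `survival` packages are installed), the two settings differ only in the `alpha` argument:

```r
library(glmnet)
library(survival)

set.seed(1)
xmat  <- matrix(rnorm(200 * 10), 200, 10)       # simulated predictors
ysurv <- Surv(rexp(200), rbinom(200, 1, 0.7))   # simulated survival response

## alpha = 1: pure lasso (L1 penalty only)
fit_lasso <- glmnet(xmat, ysurv, alpha = 1, family = "cox")

## alpha = 0.95: elastic net, 95% lasso (L1) + 5% ridge (L2)
fit_enet <- glmnet(xmat, ysurv, alpha = 0.95, family = "cox")
```

The small ridge component in alpha = 0.95 mainly stabilizes the fit when predictors are correlated, while keeping most of the lasso's variable-selection behavior.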

I doubt there's much of a difference between 10-fold and 30-fold cross-validation: the reasons you might want to choose different numbers of folds are (1) computational efficiency (computation goes up with number of folds unless there is some trick for computing CV score without refitting the model, as is often the case for LOOCV) and (2) bias-variance tradeoff; see section 5.1.4 of Introduction to Statistical Learning with R.
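If you want to check this on your own data, you can simply run the cross-validation with both fold counts and compare the selected penalty (simulated data below; assumes `glmnet` and `survival` are installed):

```r
library(glmnet)
library(survival)

set.seed(1)
xmat  <- matrix(rnorm(200 * 10), 200, 10)
ysurv <- Surv(rexp(200), rbinom(200, 1, 0.7))

## same model, different numbers of CV folds
cv10 <- cv.glmnet(xmat, ysurv, alpha = 1, family = "cox", nfolds = 10)
cv30 <- cv.glmnet(xmat, ysurv, alpha = 1, family = "cox", nfolds = 30)

## the chosen lambdas are typically close; 30-fold just costs more refits
c(cv10 = cv10$lambda.min, cv30 = cv30$lambda.min)
```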

Follow-up questions that are more statistical or data-sciencey than computational should probably go to CrossValidated.

Ben Bolker