I'm attempting to use LASSO for a slightly different purpose than it was originally designed for. I have 22 different tasks on a test which, when averaged, produce a final score. I want to see which combination of a limited number of tasks would best predict the overall score, in the hope of creating a short form of this test.
I'm using glmnet to run the LASSO, and it runs as expected. I can then easily find the model at a given lambda with
coef(cvfit, s = s)
However, I am wondering whether it is possible to specify the number of predictors with non-zero coefficients, rather than the penalization parameter.
I've set up a very inefficient way of doing this, shown below, which extracts the models from a grid of test lambdas, but I was wondering if there is a more efficient way of doing this.
lambdas <- (1:20000) / 20000 ##Grid of test lambdas
nvar <- integer(length(lambdas))
coeffs <- vector("list", length(lambdas))
for (j in seq_along(lambdas)) {
  coeffs[[j]] <- coef(cvfit, s = lambdas[j]) ##Get coefficient list at given lambda
  nvar[j] <- sum(as.vector(coeffs[[j]]) != 0) - 1 ##Count number of variables with non-zero coeff and subtract one because intercept is always non-zero
}
getlamda <- function(numvar = 4) {
  idx <- which(nvar == numvar) ##Grid positions that resulted in the given number of non-zero coefficients
  if (length(idx) == 0) return(NULL) ##No lambda on the grid produced that count
  coeffs[[idx[which.min(lambdas[idx])]]] ##Coefficients at the smallest such lambda
}
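One possibly cheaper variant, as a rough sketch only: as far as I can tell, the object returned by cv.glmnet already stores the lambda sequence it fitted (cvfit$lambda) and the number of non-zero coefficients at each of those lambdas (cvfit$nzero), so the lookup could be done against that path instead of a 20,000-point grid. The helper name lambda_for_nvar and the toy vectors below are made up for illustration and are not part of glmnet.

```r
## Hypothetical helper (not part of glmnet): given a lambda sequence and the
## number of non-zero coefficients at each lambda, return the smallest lambda
## whose model has exactly `numvar` predictors, or NA if no lambda does.
lambda_for_nvar <- function(lambdas, nzero, numvar = 4) {
  hits <- lambdas[nzero == numvar]
  if (length(hits) == 0) return(NA_real_)
  min(hits)
}

## Toy path, invented purely to exercise the helper.
toy_lambda <- c(0.50, 0.40, 0.30, 0.20, 0.10)
toy_nzero  <- c(1, 2, 4, 4, 7)
s_toy <- lambda_for_nvar(toy_lambda, toy_nzero, numvar = 4)

## Usage with a real fit, assuming `cvfit` comes from cv.glmnet():
## s <- lambda_for_nvar(cvfit$lambda, cvfit$nzero, numvar = 4)
## if (!is.na(s)) coef(cvfit, s = s)
```

This only sees the lambdas on the fitted path, and the path can jump over a count (for example from 3 to 5 predictors), in which case the helper returns NA and a finer lambda sequence would be needed.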