How to calculate the cross-validated R2 on a LASSO regression?

Question

I am using this code to fit a model using LASSO regression.

library(glmnet)
IV1 <- data.frame(IV1 = rnorm(100))
IV2 <- data.frame(IV2 = rnorm(100))
IV3 <- data.frame(IV3 = rnorm(100))
IV4 <- data.frame(IV4 = rnorm(100))
IV5 <- data.frame(IV5 = rnorm(100))
DV <- data.frame(DV = rnorm(100))

data<-data.frame(IV1,IV2,IV3,IV4,IV5,DV)

x <-model.matrix(DV~.-IV5 , data)[,-1]
y <- data$DV

AB<-glmnet(x=x, y=y, alpha=1)
plot(AB,xvar="lambda")

lambdas = NULL
for (i in 1:100)
{
  fit <- cv.glmnet(x,y)
  errors = data.frame(fit$lambda,fit$cvm)
  lambdas <- rbind(lambdas,errors)
}

lambdas <- aggregate(lambdas[, 2], list(lambdas$fit.lambda), mean)


bestindex = which(lambdas[2]==min(lambdas[2]))
bestlambda = lambdas[bestindex,1]


fit <- glmnet(x,y,lambda=bestlambda)

I would like to calculate some sort of R2 using the training data. I assume that one way to do this is using the cross-validation that I performed in choosing lambda. Based off of this post it seems like this can be done using

r2<-max(1-fit$cvm/var(y))

However, when I run this, I get this warning and r2 ends up as -Inf:

Warning message:
In max(1 - fit$cvm/var(y)) :
no non-missing arguments to max; returning -Inf

Can anyone point me in the right direction? Is this the best way to compute R2 based off of the training data?

Carlos Santillan · Answer 1 · 2018-06-08T20:15:53.140


glmnet() does not return a cvm component, so on your final fit fit$cvm is NULL and max() has nothing to work with.

?glmnet
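A quick way to see where the warning comes from, without glmnet at all. This is only a sketch: the list below is a hypothetical stand-in for a glmnet fit, used because it also has no cvm element.

```r
y <- rnorm(100)

# Hypothetical stand-in for a glmnet fit: a list with no `cvm` element
fit <- list(lambda = 0.1)

fit$cvm                # NULL
1 - fit$cvm / var(y)   # numeric(0)

# max() over a zero-length vector warns and returns -Inf
r2 <- suppressWarnings(max(1 - fit$cvm / var(y)))
r2
```

So the -Inf is just max() applied to an empty vector, not a result from the model.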

What you want to do is use cv.glmnet

?cv.glmnet

The following works (note that you must pass more than one lambda, or omit lambda and let cv.glmnet choose the sequence itself)

fit <- cv.glmnet(x,y,lambda=lambdas[,1])

r2<-max(1-fit$cvm/var(y))
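Written out, the line above computes roughly the following. The notation is mine, not glmnet's: cvm(λ) is the mean cross-validated error at λ (mean squared error for the default Gaussian family) and var(y) is the sample variance of the response.

```latex
R^2_{\mathrm{CV}}(\lambda) = 1 - \frac{\mathrm{cvm}(\lambda)}{\mathrm{var}(y)},
\qquad
r2 = \max_{\lambda} R^2_{\mathrm{CV}}(\lambda)
```

Treat it as an approximate R2: var(y) divides by n - 1 and uses the overall mean, while cvm averages held-out errors, so the two are not exactly comparable. The value can also come out negative when the model predicts worse than the mean.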

I'm not sure I understand what you are trying to do. Maybe do this?

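lambdas <- NULL          # reset, otherwise rbind() fails on the aggregated data frame
r2 <- numeric(100)       # best CV R2 found in each run
for (i in 1:100)
{
  fit <- cv.glmnet(x,y)
  errors = data.frame(fit$lambda,fit$cvm)
  lambdas <- rbind(lambdas,errors)
  r2[i]<-max(1-fit$cvm/var(y))
}

# mean CV error per lambda across the 100 runs
lambdas <- aggregate(lambdas[, 2], list(lambdas$fit.lambda), mean)


bestindex = which(lambdas[2]==min(lambdas[2]))
bestlambda = lambdas[bestindex,1]
# r2 is indexed by run, not by lambda, so use the averaged CV error at bestlambda
1 - lambdas[bestindex,2]/var(y)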
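If what you are after is the R2 of the model at the lambda that ends up being chosen, here is a minimal sketch for a single cv.glmnet run. The simulated data, the seed and the names cvfit, cv_r2 and train_r2 are only illustrative and not from your code.

```r
library(glmnet)

set.seed(1)                              # illustrative seed
x <- matrix(rnorm(100 * 4), ncol = 4)    # illustrative predictors
y <- x[, 1] + rnorm(100)                 # illustrative response with some signal

cvfit <- cv.glmnet(x, y, alpha = 1)

# Cross-validated R2 at the selected lambda:
# 1 - (mean CV squared error at lambda.min) / var(y)
idx <- which(cvfit$lambda == cvfit$lambda.min)
cv_r2 <- 1 - cvfit$cvm[idx] / var(y)

# Training (in-sample) R2 at the same lambda, from the fitted values
pred <- as.numeric(predict(cvfit, newx = x, s = "lambda.min"))
train_r2 <- 1 - sum((y - pred)^2) / sum((y - mean(y))^2)

cv_r2
train_r2
```

The training R2 is usually the more optimistic of the two, since it is computed on the same observations the coefficients were fit to.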

edited Jun 08 '18 at 20:15

answered Jun 08 '18 at 19:39


Thanks very much for the reply! Do you mind elaborating on how this works with two lambdas? Does the eventual model have two different lambdas that get optimized separately? – Dave Jun 08 '18 at 19:43
Thanks again for the reply. What I want to do is measure the R2 for the model that ends up eventually getting fit. Do you mind explaining what is happening in your code? It seems like it calculates R2 for each iteration, but still with one lambda? Do you mind explaining why this works in getting an R2? And is this essentially the R2 for the model with the lambda that eventually gets chosen? – Dave Jun 08 '18 at 21:23
