if (any(const_vars)) missing value where TRUE/FALSE needed error while running Lasso in R

Question

When I tried to run a Lasso regression with my 51st variable as the response and the other 50 variables as predictors, I got the following error message:

lasso_now=cv.glmnet(x=as.matrix(scaledData[,-51]),y=as.matrix(scaledData[,51]),alpha=1,nfolds = 5,type.measure="mse",family = binomial(link = "logit"))

Error in if (any(const_vars)) { : missing value where TRUE/FALSE needed

My response variable is either 0 or 1, so I used logistic regression. My x contains both categorical and numerical variables.

Does anyone know why this happened, or is there a way to validate the data for this issue? Thanks in advance!

score 1 · Answer 1 · answered Nov 18 '21 at 08:28

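Check whether you have NA values. You get the error because glmnet checks whether any of your columns has a standard deviation of zero; with an NA in a column, that column's standard deviation is NA, so the condition in the if statement can evaluate to NA instead of TRUE or FALSE. For example, we set the first entry of the fourth column to NA in the following dataset: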

library(glmnet)

scaledData = data.frame(v1 = rnorm(100),v2=rnorm(100),
v3 = rbinom(100,1,0.5),v4 = rbinom(100,1,0.7))

scaledData[1,4] = NA

You can check:

glmnet:::weighted_mean_sd(as.matrix(scaledData[,-3]))
$mean
        v1         v2         v4 
0.03979154 0.14547529         NA 

$sd
       v1        v2        v4 
0.8544635 1.0815797        NA
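
If you would rather not call an internal glmnet function, a similar check can be sketched in base R. Note that check_predictors below is a hypothetical helper, not part of glmnet, and the toy data is made up for illustration. It counts NA values per column and flags columns whose standard deviation is zero:

```r
# Hypothetical helper (not part of glmnet): report NA counts and
# constant columns for a numeric predictor matrix or data frame.
check_predictors <- function(x) {
  x <- as.matrix(x)
  na_count <- colSums(is.na(x))
  # Standard deviation of the non-missing values in each column;
  # this is NA when a column has fewer than two non-missing values.
  col_sd <- apply(x, 2, sd, na.rm = TRUE)
  data.frame(
    column    = colnames(x),
    n_missing = na_count,
    constant  = !is.na(col_sd) & col_sd == 0,
    row.names = NULL
  )
}

# Toy data, assumed for illustration only
set.seed(1)
toy <- data.frame(v1 = rnorm(100), v2 = rnorm(100),
                  v4 = rbinom(100, 1, 0.7))
toy[1, 3] <- NA

report <- check_predictors(toy)
print(report)
```

Any column with n_missing greater than zero would trigger the error above.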

Running cv.glmnet on this data gives the same error:

lasso_now=cv.glmnet(x=as.matrix(scaledData[,-3]),
y=as.matrix(scaledData[,3]),
alpha=1,nfolds = 5,type.measure="mse",
family = binomial(link = "logit"))

Error in if (any(const_vars)) { : missing value where TRUE/FALSE needed

One way to remove the rows that contain NA values is like this:

scaledData = scaledData[complete.cases(scaledData),]
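
Before dropping rows, it can be worth seeing how many you would lose. A minimal sketch on made-up data (toy, keep and toy_clean are names chosen for illustration, not from the question):

```r
# Toy data, assumed for illustration only
set.seed(1)
toy <- data.frame(v1 = rnorm(100), v2 = rnorm(100),
                  v3 = rbinom(100, 1, 0.5), v4 = rbinom(100, 1, 0.7))
toy[1, 4] <- NA

# TRUE for rows with no NA in any column
keep <- complete.cases(toy)
cat("Dropping", sum(!keep), "of", nrow(toy), "rows\n")

toy_clean <- toy[keep, ]
# Confirm that no missing values remain
stopifnot(!anyNA(toy_clean))
```

If a large share of the rows would be dropped, it may be better to look at which columns hold the NA values first.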

Then run it again. Note that for a binomial model you should not use "mse"; you can use "deviance", "class" or "auc".

lasso_now=cv.glmnet(x=as.matrix(scaledData[,-3]),
y=as.matrix(scaledData[,3]),
alpha=1,nfolds = 5,type.measure="deviance",
family = binomial(link = "logit"))

lasso_now

Call:  cv.glmnet(x = as.matrix(scaledData[, -3]), 
y = as.matrix(scaledData[,3]), 
type.measure = "deviance", nfolds = 5, alpha = 1, 
family = binomial(link = "logit")) 

Measure: GLM Deviance 

     Lambda Index Measure      SE Nonzero
min 0.07643     1   1.427 0.01681       0
1se 0.07643     1   1.427 0.01681       0

Thanks so much, especially for the validation part! – Edison Lin, Nov 18 '21 at 14:58
