I am trying to fit a LASSO logistic regression with the glmnet package, and I need to force the model to include certain variables. However, I get an error:
> cv.lasso = cv.glmnet(x,y,family="binomial",alpha = 1,penalty.factor = penalty)
Error: Matrices must have same number of columns in rbind2(.Call(dense_to_Csparse, x), y)
In addition: Warning messages:
1: from glmnet Fortran code (error code -1); Convergence for 1th lambda value not reached after maxit=100000 iterations; solutions for larger lambdas returned
2: In getcoef(fit, nvars, nx, vnames) :
an empty model has been returned; probably a convergence issue
x has 95 predictors, all of them binary (0 or 1). I have to force 3 of them into the model, so I set their penalty.factor to 0:
> penalty
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[38] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[75] 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1
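For reference, a vector of this shape can be built along the following lines. This is only a sketch: the positions 52, 87 and 88 are what the zeros in the printout above appear to correspond to, and the matrix here is simulated stand-in data, not the real x. The final check is there because penalty.factor needs one entry per column of x, so a length mismatch is worth ruling out.

```r
# Sketch with simulated stand-in data; n and forced_idx are assumptions.
set.seed(1)
n <- 200                      # hypothetical number of observations
p <- 95                       # number of predictors, as in the question
x <- matrix(rbinom(n * p, 1, 0.5), nrow = n, ncol = p)

forced_idx <- c(52, 87, 88)   # positions of the zeros, as read off the printout

penalty <- rep(1, p)          # default: every coefficient is penalized
penalty[forced_idx] <- 0      # 0 means no shrinkage, so the variable always stays in

# penalty.factor must have exactly one entry per column of x
stopifnot(length(penalty) == ncol(x))
which(penalty == 0)
```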
If I remove penalty.factor, it works, but I have to force those three variables to be included. If I keep penalty.factor and remove family = "binomial", it runs, but it is no longer a binary logistic regression. Does anyone know how to fix this?
Edit: Since I don't have a solution yet and am under pressure to show results ASAP, I chose to take the variables selected by LASSO, combine them with the three mandatory variables, and run a regular logistic regression on that set. I suspect there is an issue with doing this...
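To make the workaround concrete, here is a minimal sketch of the two-step approach. The data are simulated, and the column names v1 to v20 and the mandatory variables v5, v6, v7 are hypothetical placeholders; choosing lambda.min rather than lambda.1se is also just an assumption.

```r
library(glmnet)

# Simulated stand-in data (not the real x and y)
set.seed(1)
n <- 500
p <- 20
x <- matrix(rbinom(n * p, 1, 0.5), nrow = n, ncol = p)
colnames(x) <- paste0("v", seq_len(p))
y <- rbinom(n, 1, plogis(-0.5 + x[, 1] - x[, 2]))

forced <- c("v5", "v6", "v7")   # hypothetical mandatory variables

# Step 1: plain LASSO logistic regression, without penalty.factor
cv_fit <- cv.glmnet(x, y, family = "binomial", alpha = 1)
cf <- as.matrix(coef(cv_fit, s = "lambda.min"))[, 1]
selected <- setdiff(names(cf)[cf != 0], "(Intercept)")

# Step 2: unpenalized logit on the LASSO-selected plus mandatory variables
keep <- union(selected, forced)
refit <- glm(y ~ ., data = data.frame(y = y, x[, keep, drop = FALSE]),
             family = binomial)
summary(refit)
```

The standard errors and p-values that summary() reports for the refit do not account for the selection step, which may be the issue suspected above.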
Thank you!