My data set contains many continuous independent variables and a binary (dummy) dependent variable, observed for individuals in given years. I want to perform feature selection using a Logistic Random Effects Lasso or a Logistic Fixed Effects Lasso. However, by default glmnet treats my data as cross-sectional rather than as panel data, so it fits an ordinary (pooled) Logistic Lasso instead of the Logistic Random Effects Lasso/Logistic Fixed Effects Lasso model I want.
In the example code below, I therefore want to tell R that I am using a panel data set, that ID identifies my individuals (cross-sectional units), and that year gives the years in which each ID is observed. As the code stands, all individuals are pooled, and I even get coefficients for ID (and year) in this Logistic Lasso estimation, because both enter the model as ordinary numeric predictors. How can I estimate a Logistic Random Effects Lasso/Logistic Fixed Effects Lasso model in R?
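library(glmnet)
# Three continuous predictors (toy data); the columns of df are left unnamed
df <- cbind(c(1,546,2,56,6,73,4234,436,647,567,87,2,5,76,5,456,6756,6,132,78,32), c(2,3546,26,568,76,873,234,36,67,57,887,29,50,736,51,56,676,62,32,782,322), 10:30)
year <- rep(1:3, times = 7)  # 3 yearly observations per individual
ID <- rep(1:7, each = 3)     # 7 individuals
# ID and year enter as plain numeric columns, so all individuals are pooled
x <- as.matrix(cbind(ID, year, df))
# Binary outcome: 18 zeros followed by 3 ones
y <- as.matrix(rep(c(0, 1), each = 18)[1:21])
fit <- glmnet(x, y, alpha = 1, family = "binomial")
lambdamin <- min(fit$lambda)
# Coefficients at the smallest lambda on the path; predict() dispatches to the glmnet method
predict(fit, s = lambdamin, type = "coefficients")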
                        1
(Intercept) -8.309211e+01
ID           1.281220e+01
year         .
            -2.339904e-04
             .
             .