
I'm trying to understand why my code has taken several days to run and how I can improve the next iteration. I'm on my third day, and each step continues to produce only marginal improvements in AIC. The last few AIC values have been 18135.38, 18187.43, and 18243.13. I currently have 33 covariates in the model. In the step output, the "none" option is 12th from the bottom, so there are still many covariates left to add.

The data has ~610K observations and ~1600 variables. The outcome variables and covariates are mostly binary. I chose my covariates by running univariate logistic regressions and adjusting the p-values with the Holm procedure (alpha = 0.05). No interaction terms are included.

The code I've written is here:

intercept_only <- glm(outcome ~ 1, data = data, family = "binomial")
# "157 covariates" is a placeholder for the 157 selected covariate names joined by "+"
full.model <- glm(outcome ~ 157 covariates, data = data, family = "binomial")
forward_step_model <- step(intercept_only, direction = "forward", scope = formula(full.model))
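For readers who want to reproduce the setup on a small scale, here is a minimal sketch of the same forward search on simulated data. It is not the code from the question: the data frame `X`, the vector `covariates`, and the formula `upper` are hypothetical names, and the data are made up. It illustrates two things. First, `scope` accepts a formula (or a list with `lower` and `upper` formulas), so the full model does not have to be fitted just to define the search space. Second, forward selection evaluates every remaining candidate at each step, so the number of `glm` fits grows quickly with the number of covariates.

```r
set.seed(1)
n <- 2000
p <- 10

# Simulated binary covariates x1..x10 (hypothetical stand-ins for the real data)
X <- as.data.frame(matrix(rbinom(n * p, 1, 0.3), n, p))
names(X) <- paste0("x", seq_len(p))

# In this toy example only x1 and x2 truly affect the outcome
X$outcome <- rbinom(n, 1, plogis(-1 + 1.5 * X$x1 - 1.2 * X$x2))

# Build the upper scope from a character vector of covariate names
covariates <- paste0("x", seq_len(p))
upper <- reformulate(covariates, response = "outcome")

intercept_only <- glm(outcome ~ 1, data = X, family = "binomial")

# scope is given as formulas, so no full-model fit is needed up front
fit <- step(intercept_only, direction = "forward",
            scope = list(lower = ~1, upper = upper), trace = 0)
selected <- attr(terms(fit), "term.labels")
print(selected)

# Rough count of candidate fits over the first 33 forward steps
# when starting from 157 candidates: 157 + 156 + ... + 125
candidate_fits <- sum(157 - 0:32)
print(candidate_fits)
```

With ~610K rows, each of those candidate fits is itself an iteratively reweighted least squares run, which is one plausible reason the search takes days. The documentation for `step` describes it as a minimal implementation and points to `MASS::stepAIC` for a wider range of object classes, so switching between the two seems unlikely to change the run time much.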

I'm hoping to run the same code on a different outcome variable, with double the number of covariates identified in the same way as above, but I'm worried it will take even longer to run. I see that both the step and stepAIC functions perform stepwise regression. Is there an appreciable difference between them? Are there any other ways of doing this? Is there any way to speed up the processing?

thou
  • Why not ridge/lasso/elasticnet with `glmnet`? It will be **much** faster and probably work better too ... – Ben Bolker Apr 07 '22 at 01:57
  • The thinking was that stepwise would be more intuitive for explaining what we're doing. Also, I was told that ridge/lasso/elasticnet are uncommon ways of doing things in the field, so I didn't want to raise too many eyebrows when publishing comes around – thou Apr 07 '22 at 13:26
  • Since stepwise is inferior to many other approaches, I don't know how much effort people will have put into making it efficient. You could try https://cran.r-project.org/web/packages/bigstep/vignettes/bigstep.html ? – Ben Bolker Apr 07 '22 at 13:41
  • Thank you, Ben! I found some time to chat with some faculty, and there was more willingness to move towards lasso. I'll post a question if I find that challenging! I'll take a look at bigstep as well. – thou Apr 07 '22 at 17:39
  • On a side note, is there a Stack Overflow way of dealing with this type of question (e.g. not a good question, or one without a clear answer)? – thou Apr 07 '22 at 17:41
  • I don't think it's a *terrible* question. You could delete it if you don't think it's useful to future readers/likely to be answered. Or it could hang around and *maybe* attract an answer, or attract more close votes and get closed/deleted eventually. – Ben Bolker Apr 07 '22 at 20:08
  • Hi @BenBolker, I'm not sure if you can help, but I posted a question about LASSO regarding an issue I'm having. – thou Apr 20 '22 at 15:47
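The lasso route suggested in the comments could look roughly like the sketch below. It is an illustration on simulated data, not a tested recipe for the data in the question: the matrix `x`, the response `y`, and the choice of `nfolds = 5` and `s = "lambda.1se"` are assumptions made for the example. It uses `cv.glmnet` from the `glmnet` package with `family = "binomial"` and `alpha = 1` (lasso), then reads off which coefficients are non-zero at the chosen penalty.

```r
library(glmnet)

set.seed(1)
n <- 5000
p <- 20

# Simulated binary covariates (hypothetical stand-ins for the real data)
x <- matrix(rbinom(n * p, 1, 0.3), n, p,
            dimnames = list(NULL, paste0("x", seq_len(p))))

# In this toy example only x1 and x2 truly affect the outcome
y <- rbinom(n, 1, plogis(-1 + 1.5 * x[, "x1"] - 1.2 * x[, "x2"]))

# Cross-validated lasso-penalised logistic regression
cv_fit <- cv.glmnet(x, y, family = "binomial", alpha = 1, nfolds = 5)

# Coefficients at the more conservative penalty; non-zero ones are "selected"
coefs <- as.matrix(coef(cv_fit, s = "lambda.1se"))[, 1]
selected <- setdiff(names(coefs)[coefs != 0], "(Intercept)")
print(selected)
```

A single call fits the whole penalty path, rather than refitting one model per candidate per step. `glmnet` also accepts sparse matrices, which may help with mostly binary covariates at this scale, although whether it does depends on how sparse the real data are.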
