I use R to fit a lot of GLMs on medium-large data sets. Typically these have 500k-1M rows and up to 50 factors in the models (prior to simplification: banding or dropping factors that aren't predictive, etc.).
Base R's glm() doesn't seem to cope well with this size of problem. I can and do use RevoScaleR::rxGlm() instead, which is much better in this respect, but it has its own problems (patchy documentation, inability to use other R functions designed to work with glm objects, etc.).
Are there any alternatives that I'm not aware of? What's currently the preferred glm package for this sort of thing?
(I do need to stick to the GLM framework for the moment. I may at some point branch out into other modelling techniques, of which there are plenty of course, but that's one for later on.)
Thanks.