
I am weighing the efficacy of using one monolithic model versus splitting out into two different models (a split model) on about 100,000 rows of data. To do so, I am getting results from my split model like so:

# one predict() call per row, choosing the model by the value of col
preds <- numeric(nrow(DF))
for (i in 1:nrow(DF))
{
  if (DF[i,]$col == condition)
  {
    preds[i] <- predict(glm1, DF[i,])
  }
  else
  {
    preds[i] <- predict(glm2, DF[i,])
  }
}

For whatever reason, this seems to be going extremely slowly, especially when compared to just getting predictions for an entire data frame like so:

preds <- predict(glm1,DF)

Do you have any ideas on how I can optimize the first snippet?

user1775655
  • I'm not at all surprised it's slow. Seems as though you could get this with two 'predict' calls by using an appropriate pair of 'newdata' arguments. – IRTFM Feb 04 '15 at 05:37
  • As I mentioned in another comment, I need to preserve the ordering to be the same as that of the data frame so that I can do things like examine the ROC. – user1775655 Feb 04 '15 at 16:20

1 Answer

preds1 <- predict(glm1, DF[DF$col == condition, ])
preds2 <- predict(glm2, DF[DF$col != condition, ])

If you want them in the same vector, just use c().
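Note that c(preds1, preds2) stacks the two subsets, so the combined vector no longer follows DF's original row order. As a minimal sketch (assuming DF, glm1, glm2, and condition are defined as in the question), you can keep the original ordering by filling a preallocated vector with logical indexing instead of looping row by row:

# preallocate one slot per row of DF
preds <- numeric(nrow(DF))

# fill each subset in place; the row order of DF is preserved
preds[DF$col == condition] <- predict(glm1, DF[DF$col == condition, ])
preds[DF$col != condition] <- predict(glm2, DF[DF$col != condition, ])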

If you want to build a dataframe with the actual and predicted values stratified by condition, first make a structure that holds the 'actual' and condition variables. Some of these are not yet named or attached to any particular structure, so I will assume they live in a dataframe named DF with the column name "actual":

 compare.df <- data.frame(act = DF$actual, cond = DF$col, pred = NA)
 compare.df[DF$col == condition, 'pred'] <-
        predict(glm1, DF[DF$col == condition, ])
 compare.df[DF$col != condition, 'pred'] <-
        predict(glm2, DF[DF$col != condition, ])
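
Because compare.df keeps the rows in the same order as DF, the predicted and actual values line up and can be compared directly, e.g. for an ROC curve. One illustrative sketch, assuming the pROC package (any ROC routine would do) and that pred holds scores or fitted probabilities:

library(pROC)

# ROC of predictions against actual outcomes, in the original DF row order
roc_obj <- roc(compare.df$act, compare.df$pred)
auc(roc_obj)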
IRTFM
  • The main issue here is that if I want to compare the predicted values to the actual values, I've now lost the original ordering of DF. – user1775655 Feb 04 '15 at 13:50