
I am running regressions on data by group (level), one regression for each level. My code looks very similar to this:

library('dplyr')
regressions <- mtcars %>%
  group_by(cyl) %>%
  do(mod = lm(mpg ~ wt, .))

How do I extract predictions on a smaller newdata set that has the same groups? That is, the levels are the same but the sample is smaller. I want to get the predictions on the new data set for all the levels at once. I know "augment" in broom gives you predictions, but I don't know how to apply it to all levels at once.

The code I used looks like this.

library('broom')
aa <- as.data.frame(augment(regressions, newdata=test))

I also tried

aa <- as.data.frame(augment(regressions, mod, newdata=test))

which gave fitted values, but it duplicated my newdata to match the dimensions of the original data set, and the predictions for each duplicated observation were different. My data has 44 levels, and I have more of these to do. Thanks a bunch for the help.

  • do you want `regressions %>% augment(mod, newdata=test)` ? – rawr Apr 28 '16 at 19:15
  • Yes, I tried that, but it duplicates my data to have the same dimensions as the original dataset, and the predictions for the duplicates are all different. For example, x = 10 appears three times with predictions 5, 15 and 9. Is there a way to specify that it shouldn't do that? I want just the predicted values for the new data, so technically the dimensions of my results should be the same as those of the new data, right? – Alice Work Apr 28 '16 at 22:56
  • the result should be the size of your test data set times 3, right? you have three linear model objects (`regressions`) and each of those is being used to `predict.lm` on the test data. if you only want one, `predict(lm(mpg ~ wt, mtcars), newdata = test)` should work. if you do that three times with mtcars subset by cyl, you should get the same results as when using augment and grouping by cyl--it just creates some subsets and runs predict on each – rawr Apr 29 '16 at 01:53
  • Thanks for all your help. I have to do close to 500 predictions using a newdata set, and I am trying to find a way to do them all at once. The example above was just to illustrate what I need to do. The problem I need to solve is how to avoid running 500 regressions and predictions one by one. I know how to do the regressions by group; now I want to be able to predict by group. My test data have the same levels as my training dataset. – Alice Work May 11 '16 at 15:28
  • I'd like to figure out how to do that, too. I've got n subjects, data grouped by two sets of levels, and newdata that is also grouped by two sets of levels, and each of those levels can take on two values. My code gets answers that are roughly four times as big as they should be (4 = 2 x 2). I'd love to be able to do the equivalent of a full outer join between the models and the newdata. – Bill Oct 03 '16 at 02:41
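
The comments above describe two behaviours: applying every group's model to the whole of the new data (which multiplies the rows), and what is actually wanted, namely predicting each row only with the model for its own level. Below is a minimal base R sketch of both, not a confirmed broom solution. The `test` data frame is a hypothetical stand-in (a random sample of `mtcars`), and `models`, `test_by_cyl`, `all_pairs` and `preds` are illustrative names that do not appear in the question.

```r
# Hypothetical stand-in for the asker's `test`: a smaller sample with the same levels
set.seed(1)
test <- mtcars[sample(nrow(mtcars), 10), ]

# Fit one model per level of cyl (base R equivalent of the group_by/do step)
models <- lapply(split(mtcars, mtcars$cyl), function(d) lm(mpg ~ wt, data = d))

# Applying every model to all of `test` gives one prediction per (model, row) pair,
# which is the duplication described in the comments
all_pairs <- lapply(models, predict, newdata = test)
sum(lengths(all_pairs)) == length(models) * nrow(test)

# Instead, split the new data by the same grouping variable and
# predict each subset only with the model fitted for that level
test_by_cyl <- split(test, test$cyl)
preds <- do.call(rbind, lapply(names(test_by_cyl), function(lv) {
  d <- test_by_cyl[[lv]]
  d$.fitted <- predict(models[[lv]], newdata = d)
  d
}))

# One prediction per row of the new data
nrow(preds) == nrow(test)
```

Because the models are looked up by level name, the same pattern should carry over to 44 or 500 levels, provided every level in the new data also occurs in the training data. For two grouping variables, as in the last comment, splitting on `interaction()` of both variables would presumably play the same role, though that is not tested here.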

0 Answers