library(mboost)
### a simple two-dimensional example: cars data
cars.gb <- gamboost(dist ~ speed, data = cars, dfbase = 4,
                    control = boost_control(mstop = 50))
set.seed(1)
cars_new <- cars + rnorm(nrow(cars))
> predict(cars.gb, newdata = cars_new$speed)
Error in check_newdata(newdata, blg, mf) : 
  ‘newdata’ must contain all predictor variables, which were used to specify the model.

I fit a model using the example on the help(gamboost) page. I want to use this model to predict on a new dataset, cars_new, but encountered the above error. How can I fix this?

Adrian
  • You defined `dist ~ speed` in your `gamboost` function. Shouldn't the prediction be `predict(cars.gb, newdata = cars_new)`? – Martin Gal Sep 08 '21 at 23:45
  • In my experience the `newdata` option only requires the predictor (i.e., speed) and not both the predictor and outcome (speed and dist). – Adrian Sep 08 '21 at 23:47
  • In this case, try `cars_new["speed"]` instead of `cars_new$speed`. The first version keeps the data.frame structure, the latter returns a vector. – Martin Gal Sep 08 '21 at 23:48

1 Answer

The `predict` function looks up the predictor by name, here `speed`. Subsetting with `$` returns a bare vector, so the column name is lost and `predict` cannot match the values to the model's predictor.
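To make the difference concrete, here is a minimal base R sketch. It does not need mboost, and the names `as_vector` and `as_frame` are made up for illustration. It only shows what each subsetting style returns, not how `predict` checks `newdata` internally.

```r
# Base R only; uses the built-in `cars` data set, as in the question.
set.seed(1)
cars_new <- cars + rnorm(nrow(cars))

as_vector <- cars_new$speed      # `$` extracts the column as a bare numeric vector
as_frame  <- cars_new["speed"]   # `[` with a name keeps a one-column data.frame

is.data.frame(as_vector)  # FALSE
is.data.frame(as_frame)   # TRUE
names(as_vector)          # NULL: nothing left that says "speed"
names(as_frame)           # "speed"
```

Only the second object still carries the name `speed`, which is what a name-based lookup needs.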

So this variant works, because it rebuilds a data frame with a column named `speed`:

predict(cars.gb, newdata = data.frame(speed = cars_new$speed))

Or subset with `[`, which keeps the data frame structure and the original column name:

predict(cars.gb, newdata = cars_new['speed'])
Samet Sökel