
I would like to evaluate the performance of a GAM at predicting novel data using five-fold cross-validation. The model is trained on a random 80% subset of the data, with the remaining 20% held out as a test set. I can calculate the mean squared prediction error (MSPE) between predictions and the test data, but I am uncertain how to implement this across k folds. Below is my code for creating the training and test datasets and calculating MSPE. I have not included sample data, but can do so.

library(mgcv)

set.seed(1)  # for reproducibility
indexes <- sample(1:nrow(data), size = 0.2 * nrow(data))
testP  <- data[indexes, ]   # 20% test set
trainP <- data[-indexes, ]  # 80% training set
# fit on the training data (not the full data); x ~ 1 is an intercept-only placeholder
gam0 <- gam(x ~ 1, family = quasibinomial(link = "logit"),
            data = trainP, gamma = 1.4)
pv <- predict(gam0, newdata = testP, type = "response")
diff  <- pv - testP$x  # predicted - observed
diff2 <- diff^2        # (predicted - observed)^2
mspegam0 <- mean(diff2)  # MSPE on the held-out set
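For reference, here is one way I imagine the k-fold version could look: a sketch, not a definitive implementation. Since I have not included my data, the data frame below (columns `z` and `x`) and the smooth term `s(z)` are synthetic stand-ins for the real covariates and model formula.

```r
library(mgcv)

set.seed(42)
# synthetic stand-in for the (not included) data: binary response x, covariate z
data <- data.frame(z = runif(100))
data$x <- rbinom(100, 1, plogis(2 * data$z - 1))

# five-fold CV: every observation is held out exactly once
k <- 5
fold_id <- sample(rep(1:k, length.out = nrow(data)))  # random fold assignment

mspe <- numeric(k)
for (i in 1:k) {
  testP  <- data[fold_id == i, ]   # hold out fold i
  trainP <- data[fold_id != i, ]   # train on the remaining k-1 folds
  fit <- gam(x ~ s(z), family = quasibinomial(link = "logit"),
             data = trainP, gamma = 1.4)
  pv <- predict(fit, newdata = testP, type = "response")
  mspe[i] <- mean((pv - testP$x)^2)  # MSPE for this fold
}
mspe        # per-fold MSPE
mean(mspe)  # overall cross-validated MSPE
```

The random `fold_id` vector partitions the rows once up front, so each fold's test set is disjoint from the others; averaging the per-fold MSPEs gives the cross-validated estimate.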
Gavin Simpson
akbreezo
