I'm trying to run a simple GBM classification model to benchmark performance against random forests and SVMs, but I'm having trouble getting the model to score correctly. It's not throwing an error, but the predictions are all NaN. I'm using the breast cancer data from mlbench. Here's the code:
library(gbm)
library(mlbench)
library(caret)
library(plyr)
library(ada)
library(randomForest)
data(BreastCancer)
bc <- BreastCancer
rm(BreastCancer)
bc$Id <- NULL
bc$Class <- as.factor(mapvalues(bc$Class, c("benign", "malignant"), c("0","1")))
index <- createDataPartition(bc$Class, p = 0.7, list = FALSE)
bc.train <- bc[index, ]
bc.test <- bc[-index, ]
model.gbm <- gbm(Class ~ ., data = bc.train, n.trees = 500)
pred.gbm <- predict(model.gbm, bc.test, n.trees = 500, type = "response")
Can anyone point out what I'm doing wrong? Also, will I have to transform the output of the predict function? I've read that this can be an issue with GBM predictions. Thanks.
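One possible cause, offered as a sketch rather than a confirmed diagnosis: gbm's bernoulli loss expects a numeric 0/1 response, and a factor `Class` may be what produces the NaN predictions in some versions of the package. The example below assumes that is the issue. It recodes `Class` as numeric, sets `distribution = "bernoulli"` explicitly, and predicts on `bc.test`. The seed, the 0.5 cutoff, and the use of `sample()` in place of `createDataPartition()` are illustrative choices, not part of the code above.

```r
library(gbm)
library(mlbench)

data(BreastCancer)
bc <- BreastCancer
bc$Id <- NULL

# Assumption: the bernoulli loss needs a numeric 0/1 response, not a factor
bc$Class <- as.numeric(bc$Class == "malignant")

set.seed(42)  # hypothetical seed, only to make the split reproducible
index <- sample(nrow(bc), size = floor(0.7 * nrow(bc)))
bc.train <- bc[index, ]
bc.test <- bc[-index, ]

# State the distribution explicitly instead of letting gbm guess it
model.gbm <- gbm(Class ~ ., data = bc.train,
                 distribution = "bernoulli", n.trees = 500)

# type = "response" returns predicted probabilities of class 1 (malignant)
pred.gbm <- predict(model.gbm, newdata = bc.test,
                    n.trees = 500, type = "response")

# Illustrative 0.5 cutoff to turn probabilities into 0/1 class labels
pred.class <- as.integer(pred.gbm > 0.5)
```

With `type = "response"` a bernoulli model returns probabilities, so the only further transformation needed is thresholding them into class labels. With `type = "link"` the predictions would be on the log-odds scale instead.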