Classification with gbm() - errors

Question

cancer <- read.csv('breast-cancer-wisconsin.data', header = FALSE, na.strings="?")
cancer <- cancer[complete.cases(cancer),]
names(cancer)[11] <- "class"
cancer[, 11] <- factor(cancer[, 11], labels = c("benign", "malignant"))
library(gbm)

First, I remove the rows with NA values using complete.cases() and convert the eleventh column, "class", to a factor. I want to use "class" as the response variable and all other columns except the first one as predictors.

On my first attempt, I typed in:

boost.cancer <- gbm(class ~ .-V1, data = cancer, distribution = "bernoulli") 

Error in gbm.fit(x, y, offset = offset, distribution = distribution, w = w,  : 
Bernoulli requires the response to be in {0,1}

Then I tried using the contrasts of class as the response instead of class itself:

boost.cancer <- gbm(contrasts(class) ~ .-V1, distribution = "bernoulli", data = cancer)

Error in model.frame.default(formula = contrasts(class) ~ . - V1, data = cancer,  : 
variable lengths differ (found for 'V1')

How do I correct these errors? I'm sure there is something wrong with my method.

Julián Urbano · Accepted Answer · 2014-06-02T13:36:56.730

As the error says, the response has to be in {0,1}, and a factor is not. Skip the factor() step and recode the raw class column, which is coded 2 for benign and 4 for malignant:

> cancer$class <- (cancer$class -2)/2

> boost.cancer <- gbm(class ~ .-V1, data = cancer, distribution = "bernoulli")
> boost.cancer
gbm(formula = class ~ . - V1, distribution = "bernoulli", data = cancer)
A gradient boosted model with bernoulli loss function.
100 iterations were performed.
There were 9 predictors of which 4 had non-zero influence.
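
If the factor() step from the question has already been run, the 0/1 response can also be recovered from the factor. The sketch below uses a small made-up vector in place of the real class column (the names raw_class, class_factor and so on are just for illustration). It also shows why the contrasts() attempt failed: contrasts() returns one row per factor level, not one per observation.

```r
# Toy stand-in for the class column (2 = benign, 4 = malignant); not the real data.
raw_class <- c(2, 4, 4, 2, 2)

# Recode used in the answer: 2 -> 0, 4 -> 1.
y_numeric <- (raw_class - 2) / 2

# If the factor was already created, recover 0/1 from it.
class_factor <- factor(raw_class, labels = c("benign", "malignant"))
y_from_factor <- as.integer(class_factor == "malignant")

# contrasts() gives a matrix with one row per level (2 here), so its length
# cannot match the number of rows in the data frame.
contrast_rows <- nrow(contrasts(class_factor))

print(y_numeric)
print(y_from_factor)
print(contrast_rows)
```

Either y_numeric or y_from_factor would be an acceptable response for distribution = "bernoulli".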

score 0 · Answer 2 · answered Apr 09 '17 at 10:40

You can also use:

boost.cancer <- gbm((unclass(class)-1) ~ .-V1, data = cancer, distribution = "bernoulli")
summary(boost.cancer)

Apply the same 0/1 recoding when you call predict() and when you build the confusion matrix and compute the accuracy.
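
A minimal sketch of that last step, in base R only. The probabilities and labels here are made up for illustration; with a fitted model they would come from predict() as noted in the comment (n.trees = 100 matches the 100 iterations reported for the fit above). The 0.5 cutoff is an assumption, not something gbm chooses for you.

```r
# Hypothetical predicted probabilities; with a fitted model they would come from
#   predict(boost.cancer, newdata = cancer, n.trees = 100, type = "response")
prob_malignant <- c(0.10, 0.85, 0.60, 0.30, 0.05)

# Toy true labels, stored as the same kind of factor as in the question.
truth <- factor(c("benign", "malignant", "malignant", "malignant", "benign"),
                levels = c("benign", "malignant"))

# Same recoding idea as unclass(class) - 1: benign -> 0, malignant -> 1.
truth01 <- as.integer(truth) - 1L

# Threshold the probabilities at 0.5 (an arbitrary choice for this sketch).
pred01 <- as.integer(prob_malignant > 0.5)

# Confusion matrix and overall accuracy on the 0/1 scale.
confusion <- table(predicted = pred01, actual = truth01)
accuracy <- mean(pred01 == truth01)

print(confusion)
print(accuracy)
```

Keeping both the predictions and the true labels on the 0/1 scale avoids comparing numbers against factor labels.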

wackyanil

Please format your posts using Markdown or HTML. See https://stackoverflow.com/help/formatting for more information. – 0xJoKe Apr 09 '17 at 11:07
Thanks. I was unaware of that. – wackyanil Apr 10 '17 at 14:07
