I am trying to apply Gradient Boosting to the MNIST dataset. This is my code:

library(dplyr)
library(caret)
mnist <- snedata::download_mnist()
mnist_num <- as.data.frame(lapply(mnist[1:10000,], as.numeric)) %>%
mutate(id = row_number())
mnist_num <- mnist_num[,sapply(mnist_num, function(x){max(x) - min(x) > 0})]

mnist_train <- sample_frac(mnist_num, .70)
mnist_test <- anti_join(mnist_num, mnist_train, by = 'id')

set.seed(5000)
library(gbm)
boost_mnist<-gbm(Label~ .,data=mnist_train, distribution="bernoulli", n.trees=70, 
interaction.depth=4, shrinkage=0.3)

It shows the following error:

"Error in gbm.fit(x = x, y = y, offset = offset, distribution = distribution, : Bernoulli requires the response to be in {0,1}"

What is wrong here? Can anyone show me the code to correctly do it?

1 Answer

The error

Error in gbm.fit(x = x, y = y, offset = offset, distribution = distribution, : Bernoulli requires the response to be in {0,1}

is due to the choice of distribution. Use distribution="multinomial" instead of "bernoulli": the Bernoulli distribution only works with a dichotomous (two-class) response coded as 0/1, whereas the MNIST label takes ten values (1 to 10 after the as.numeric conversion in your code).
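
Since you asked for code, here is a minimal sketch of how the script from the question could be adapted. It rests on a few assumptions that are not in the original: Label is kept as a factor rather than converted to numeric, the helper id column is dropped before fitting so it is not used as a predictor, and set.seed() is moved ahead of the sampling step so the train/test split is reproducible. The names pixel_cols, keep, probs and pred_class are introduced only for illustration. I have not tuned n.trees or shrinkage; they are copied from the question.

```r
library(dplyr)
library(gbm)

set.seed(5000)  # seed before sampling so the split is reproducible

mnist <- snedata::download_mnist()

# First 10000 rows: pixels become numeric, Label stays a factor (assumption)
mnist_num <- mnist[1:10000, ]
pixel_cols <- setdiff(names(mnist_num), "Label")
mnist_num[pixel_cols] <- lapply(mnist_num[pixel_cols], as.numeric)
mnist_num$Label <- factor(mnist_num$Label)

# Drop constant pixel columns, as in the question
keep <- sapply(mnist_num[pixel_cols], function(x) max(x) - min(x) > 0)
mnist_num <- mnist_num[, c(pixel_cols[keep], "Label")]
mnist_num$id <- seq_len(nrow(mnist_num))

# 70/30 split on the helper id column
mnist_train <- sample_frac(mnist_num, 0.70)
mnist_test  <- anti_join(mnist_num, mnist_train, by = "id")

# Multi-class boosting; id is removed so it is not treated as a feature
boost_mnist <- gbm(Label ~ ., data = select(mnist_train, -id),
                   distribution = "multinomial", n.trees = 70,
                   interaction.depth = 4, shrinkage = 0.3)

# Class probabilities come back as an (n x classes x 1) array
probs <- predict(boost_mnist, newdata = mnist_test,
                 n.trees = 70, type = "response")
pred_class <- dimnames(probs)[[2]][apply(probs[, , 1], 1, which.max)]

# Test-set accuracy
mean(pred_class == as.character(mnist_test$Label))
```

Keeping Label as a factor means the predicted class names match the original digit labels, so no remapping from 1..10 back to 0..9 is needed.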

Alessio