I have installed the mxnet package in R version 3.4.3 using:

cran <- getOption("repos")

cran["dmlc"] <- "https://s3-us-west-2.amazonaws.com/apache-mxnet/R/CRAN/"

options(repos = cran)

install.packages("mxnet")

However, I run into a problem when training a neural network.

In the code below, I use the breast cancer dataset available at https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Original%29.

BC_data <- read.csv("Data_breastcancer.csv", sep = ";")

# generate a train and test set
trainIndex = sample(1:nrow(BC_data), size = round(0.8*nrow(BC_data)), replace=FALSE)
train_data <- BC_data[trainIndex,]
test_data <- BC_data[-trainIndex,]
X_train <- train_data[,c(-1,-11)]
y_train <- train_data[,11]   

# estimate neural network
model = mx.mlp(as.matrix(X_train), as.numeric(y_train), hidden_node = 10, out_node = 2, out_activation = "softmax", learning.rate = 0.1, num.round = 20)

However, instead of iteratively returning accuracy values, the only output I get is

Start training with 1 devices
Warning message:
In mx.model.select.layout.train(X, y) :
Auto detect layout of input matrix, use rowmajor..

so it appears that the iterative process hasn't started at all.

Does anybody know how to solve this problem?

Esmee

1 Answer

You need to make a few adjustments:

  1. MXNet doesn't work with arbitrary class labels. In your case, the class labels are "2" and "4". You need to convert them to consecutive numbers starting from 0 (see the example in the code below).
  2. You need to pass accuracy as a metric to mx.mlp. There is an error in the documentation: it says that the parameter name is "eval_metric", but it is actually "eval.metric", as in the mx.model.FeedForward.create function.
  3. The original dataset uses "?" to mark missing values. mx.mlp doesn't work with NAs, so you need to replace them with something. I have chosen 0, but you can use something more domain-specific if needed.
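As a minimal sketch of points 1 and 3 in isolation, the base R snippet below reads a tiny inline sample, treats "?" as NA, imputes zeros, and remaps the class codes to 0-based labels. The three rows are illustrative values in the same 11-column layout, not real records, and the names raw_text, sample_data, class_codes and labels are hypothetical. Using match() instead of ifelse() is just one option; it would also cover more than two classes.

```r
# Illustrative rows in the layout: ID, nine features, class (2 or 4).
# These are made-up values, not records from the real dataset.
raw_text <- "1,5,1,1,1,2,1,3,1,1,2
2,8,4,5,1,2,?,7,3,1,4
3,8,10,10,8,7,10,9,7,1,4"

sample_data <- read.csv(
  text = raw_text,
  header = FALSE,
  colClasses = rep("numeric", 11),
  na.strings = c("", "?")   # read "?" as NA instead of failing on it
)

na_count <- sum(is.na(sample_data))   # missing cells before imputation
sample_data[is.na(sample_data)] <- 0  # simple placeholder imputation

# Map the original class codes (2, 4) to consecutive labels starting at 0.
class_codes <- sort(unique(sample_data[, 11]))
labels <- match(sample_data[, 11], class_codes) - 1
```

Here labels can be passed as the label argument in place of the raw class column.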

Here is the code that works on breast-cancer-wisconsin.data file, which I found from your link:

library(mxnet)
BC_data <- read.csv(
  file = "breast-cancer-wisconsin.data", 
  sep = ",", 
  header = FALSE, 
  colClasses = c(rep("numeric", 11)), 
  na.strings = c("", "?") # Few records of the 7th column contain "?" - treat "?" as NA
)

BC_data[is.na(BC_data)] <- 0 # Replace NA with zeroes

# generate a train and test set
trainIndex = sample(1:nrow(BC_data), size = round(0.8*nrow(BC_data)), replace=FALSE)
train_data <- BC_data[trainIndex,]
test_data <- BC_data[-trainIndex,]
X_train <- train_data[,c(-1,-11)]
y_train <- train_data[,11]   

# estimate neural network
model = mx.mlp(
    data = as.matrix(X_train), 
    label = as.numeric(ifelse(y_train == 2, 0, 1)), # Replace classes with 0 and 1
    hidden_node = 10, 
    out_node = 2, 
    out_activation = "softmax", 
    learning.rate = 0.1, 
    num.round = 20,
    array.layout = "rowmajor", # get rid of a nasty warning
    eval.metric=mx.metric.accuracy # set Accuracy as a metric
  )

If I run this code, I get the following output:

Start training with 1 devices
[1] Train-accuracy=0.64453125
[2] Train-accuracy=0.6515625
[3] Train-accuracy=0.65625
[4] Train-accuracy=0.9
[5] Train-accuracy=0.95625
[6] Train-accuracy=0.95
[7] Train-accuracy=0.94375
[8] Train-accuracy=0.9328125
[9] Train-accuracy=0.93125
[10] Train-accuracy=0.9328125
[11] Train-accuracy=0.9375
[12] Train-accuracy=0.9390625
[13] Train-accuracy=0.9484375
[14] Train-accuracy=0.95
[15] Train-accuracy=0.9453125
[16] Train-accuracy=0.946875
[17] Train-accuracy=0.9484375
[18] Train-accuracy=0.95
[19] Train-accuracy=0.9484375
[20] Train-accuracy=0.9515625
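The values above are training accuracy only. To check the held-out test_data, one option is to compute accuracy from the predicted class probabilities. The sketch below assumes that predict() on the model returns a matrix with one row per class and one column per observation, which is how I understand mxnet's R interface to behave; accuracy_from_probs is a hypothetical helper, and the probabilities are made up so the snippet runs without mxnet.

```r
# Hypothetical helper: turn a (classes x observations) probability matrix
# into 0-based predicted labels and compare them with the true labels.
accuracy_from_probs <- function(probs, true_labels) {
  predicted <- max.col(t(probs), ties.method = "first") - 1
  mean(predicted == true_labels)
}

# Made-up probabilities: two classes (rows), four observations (columns).
probs <- matrix(c(0.9, 0.1,
                  0.2, 0.8,
                  0.6, 0.4,
                  0.3, 0.7), nrow = 2)
true_labels <- c(0, 1, 1, 1)
acc <- accuracy_from_probs(probs, true_labels)

# With the fitted model this might look like (assumption, not run here):
# X_test <- as.matrix(test_data[, c(-1, -11)])
# probs  <- predict(model, X_test, array.layout = "rowmajor")
# accuracy_from_probs(probs, ifelse(test_data[, 11] == 2, 0, 1))
```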
Sergei
  • Do you also know if it is possible to generate a plot of the neural network just as is possible in the neuralnet package using the general plot() function? – Esmee Jan 30 '18 at 09:48
  • The only way I know how to visualize the NN is to use mxnet package function: graph.viz(model$symbol). I don't think it is possible to visualize it with the general plot(). – Sergei Jan 31 '18 at 18:53