My question is about the typical feed-forward, single-hidden-layer backprop neural network, as implemented in package nnet and trained with train() in package caret. This is related to this question, but in the context of the nnet and caret packages in R.
I demonstrate the problem with a simple regression example where Y = sin(X) + small error:
raw Y ~ raw X: predicted outputs are uniformly zero where raw Y < 0.
scaled Y (to 0-1) ~ raw X: solution looks great; see code below.
The code is as follows:
library(nnet)
X <- t(t(runif(200, -pi, pi)))
Y <- t(t(sin(X))) # Y ~ sin(X)
Y <- Y + rnorm(200, 0, .05) # Add a little noise
Y_01 <- (Y - min(Y))/diff(range(Y)) # Y linearly transformed to have range 0-1.
plot(X,Y)
plot(X, Y_01)
dat <- data.frame(cbind(X, Y, Y_01)); names(dat) <- c("X", "Y", "Y_01")
head(dat)
plot(dat)
# Fit 1: raw Y ~ raw X (predictions come out as zero wherever Y < 0)
nnfit1 <- nnet(formula = Y ~ X, data = dat, maxit = 2000, size = 8, decay = 1e-4)
nnpred1 <- predict(nnfit1, dat)
plot(X, nnpred1)
# Fit 2: Y scaled to 0-1 ~ raw X (fit looks good)
nnfit2 <- nnet(formula = Y_01 ~ X, data = dat, maxit = 2000, size = 8, decay = 1e-4)
nnpred2 <- predict(nnfit2, dat)
plot(X, nnpred2)
When using train() in caret, there is a preProcess option, but it only scales the inputs. train(..., method = "nnet", ...) appears to be using the raw Y values; see code below.
library(caret)
ctrl <- trainControl(method = "cv", number = 10)
nnet_grid <- expand.grid(.decay = 10^seq(-4, -1, 1), .size = c(8))
nnfit3 <- train(Y ~ X, dat, method = "nnet", maxit = 2000,
                trControl = ctrl, tuneGrid = nnet_grid, preProcess = "range")
nnfit3
nnpred3 <- predict(nnfit3, dat)
plot(X, nnpred3)
Of course, I could linearly transform the Y variable(s) to have a positive range, but then my predictions will be on the wrong scale. Though this is only a minor headache, I'm wondering if there is a better solution for training nnet or avNNet models with caret when the output has negative values.
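For concreteness, the workaround I have in mind would look roughly like the sketch below: scale Y to 0-1, fit on the scaled response, then invert the scaling on the predictions. This is only a minimal illustration, not something either package does for me. The seed, the trace = FALSE setting, and the helper name to_raw_scale() are my own assumptions for the example.

```r
library(nnet)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
X <- runif(200, -pi, pi)
Y <- sin(X) + rnorm(200, 0, 0.05)

# Remember the original location and range so the scaling can be inverted later.
y_min   <- min(Y)
y_range <- diff(range(Y))
dat_01  <- data.frame(X = X, Y_01 = (Y - y_min) / y_range)

# Fit on the 0-1 scaled response, as in nnfit2 above.
fit_01 <- nnet(Y_01 ~ X, data = dat_01, size = 8, decay = 1e-4,
               maxit = 2000, trace = FALSE)

# Hypothetical helper: map predictions on the 0-1 scale back to the raw Y scale.
to_raw_scale <- function(p) p * y_range + y_min

pred_raw <- to_raw_scale(as.numeric(predict(fit_01, dat_01)))
range(pred_raw)  # should now cover negative values as well
```

Presumably the same back-transformation could be applied to predictions from a train() fit on Y_01, but it means carrying y_min and y_range around alongside the model, which is the headache I would like to avoid.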