I am training a neural network in R using the train function in the caret package. I am using some example code found here: Time-series - data splitting and model evaluation.

The output of the network training tells me that it has been re-scaled to [0,1] but when I used the predict function, my predictions are not scaled to [0,1]. First, how do I know if the data has been normalized properly? And second, how do I get the normalized predictions?

Here is my code:

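library(caret)

timeSlices <- createTimeSlices(1:nrow(mytsframe3), initialWindow = 36,
                               horizon = 12, fixedWindow = TRUE)
# trainSlices/testSlices are used below; assumed to be defined as in the linked example
trainSlices <- timeSlices[[1]]
testSlices <- timeSlices[[2]]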

nn <- train(diffREALBRENTSPOT ~ diffF1REALlag + diffF2REALlag, data = mytsframe3[trainSlices[[1]],], method = "mlp"
        , size = 1, preProc = c("range"))

> nn
Multi-Layer Perceptron 

36 samples
 2 predictor

Pre-processing: re-scaling to [0, 1] (2) 
Resampling: Bootstrapped (25 reps) 
Summary of sample sizes: 36, 36, 36, 36, 36, 36, ... 
Resampling results across tuning parameters:

  size  RMSE       Rsquared 
  1     0.7879697  0.2098693
  3     0.7485212  0.2249331
  5     0.7571630  0.2246444

RMSE was used to select the optimal model using  the smallest value.
The final value used for the model was size = 3. 

pred <- predict(nn, mytsframe3[testSlices[[1]],])

str(pred)
 Named num [1:12] 0.0734 -0.0214 0.3264 0.0362 -0.1569 ...
 - attr(*, "names")= chr [1:12] "37" "38" "39" "40" ...

Here is a dput of my data for testing:

structure(list(diffREALBRENTSPOT = c(-0.523999999999999, -0.693, 
0.386999999999999, 0.453000000000001, -0.842000000000001, 0.369999999999999
), diffF1REALlag = c(0.48597655, -1.61485375, 0.60622805, -0.469351210000001, 
0.292303670000001, -0.44088176), diffF2REALlag = c(1.00948236, 
0.48597655, -1.61485375, 0.60622805, -0.469351210000001, 0.292303670000001
)), .Names = c("diffREALBRENTSPOT", "diffF1REALlag", "diffF2REALlag"
), row.names = c(NA, 6L), class = "data.frame")
user111417
  • Maybe `predict` defaults to giving you log-odds, and you need to transform them to 0-1? Did you try including `type = "response"` in the call to `predict`? – ulfelder Nov 02 '16 at 19:13
  • @ulfelder including `type = "response"` produces an error of the form: `Error in predict.train(nn, mytsframe3[testSlices[[1]], ], type = "response") : type must be either "raw" or "prob"` – user111417 Nov 02 '16 at 19:25
  • Well, then, did you try `type = "prob"`? – ulfelder Nov 02 '16 at 19:26
  • @ulfelder, using `type = "prob"` produces `data frame with 0 columns and 12 rows` where the rows are just the row numbers of the timeslices – user111417 Nov 02 '16 at 19:29
  • Can you post data to allow us to replicate your results and test fixes? – ulfelder Nov 02 '16 at 19:35
  • @ulfelder I just made an edit, adding some data so you can replicate. You'll need to adjust some of the parameters in createTimeSlices, and the RMSEs won't be the same. – user111417 Nov 02 '16 at 21:44

1 Answer


The output of the network training tells me that it has been re-scaled to [0,1] but when I used the predict function, my predictions are not scaled to [0,1].

The outcome is numeric and you are fitting a regression model (not classification). The preProc option rescales your predictors to be on [0,1] and does not rescale the outcome or the predictions to be on this range.
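If predictions on a [0,1] scale are wanted, one option is to rescale the outcome yourself before fitting and to map the predictions back afterwards. The following is only a minimal base-R sketch of that idea, not caret's internal method: `to_unit` and `from_unit` are hypothetical helper names, the 4/2 row split is purely illustrative, and `lm()` stands in for the network so the example runs without extra packages. The data are the six rows from the question.

```r
# Data from the question's dput output (floating-point noise trimmed).
df <- data.frame(
  diffREALBRENTSPOT = c(-0.524, -0.693, 0.387, 0.453, -0.842, 0.370),
  diffF1REALlag = c(0.48597655, -1.61485375, 0.60622805,
                    -0.46935121, 0.29230367, -0.44088176),
  diffF2REALlag = c(1.00948236, 0.48597655, -1.61485375,
                    0.60622805, -0.46935121, 0.29230367)
)

# Hypothetical helpers: min-max scaling with a fixed range, and its inverse.
to_unit   <- function(x, lo, hi) (x - lo) / (hi - lo)
from_unit <- function(z, lo, hi) z * (hi - lo) + lo

# Illustrative split; in the question these would come from createTimeSlices().
train_rows <- 1:4
test_rows  <- 5:6

# Take the outcome range from the training rows only, to avoid leaking test data.
y_lo <- min(df$diffREALBRENTSPOT[train_rows])
y_hi <- max(df$diffREALBRENTSPOT[train_rows])

scaled <- df
scaled$diffREALBRENTSPOT <- to_unit(df$diffREALBRENTSPOT, y_lo, y_hi)

# Stand-in model (assumption): lm() instead of train(..., method = "mlp").
fit <- lm(diffREALBRENTSPOT ~ diffF1REALlag + diffF2REALlag,
          data = scaled[train_rows, ])

pred_scaled   <- predict(fit, scaled[test_rows, ])    # on the scaled outcome
pred_original <- from_unit(pred_scaled, y_lo, y_hi)   # back on the original scale
```

Note that the scaled predictions are on the same scale as the rescaled training outcome but are not guaranteed to fall inside [0,1], because new predictor values can push a regression model beyond the training range of the outcome.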

topepo