PreProcessing Scale for predict in R - Not getting same scale for predict

Question

I am training a neural network in R using the train function in the caret package. I am using some example code found here: Time-series - data splitting and model evaluation.

The output of the network training tells me that the data has been re-scaled to [0,1], but when I use the predict function, my predictions are not scaled to [0,1]. First, how do I know whether the data has been normalized properly? And second, how do I get the normalized predictions?

Here is my code:

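library(caret)

timeSlices <- createTimeSlices(1:nrow(mytsframe3), initialWindow = 36,
                               horizon = 12, fixedWindow = TRUE)
trainSlices <- timeSlices[[1]]  # training row indices, one element per slice
testSlices  <- timeSlices[[2]]  # test row indices, one element per slice

nn <- train(diffREALBRENTSPOT ~ diffF1REALlag + diffF2REALlag,
            data = mytsframe3[trainSlices[[1]], ], method = "mlp",
            size = 1, preProc = c("range"))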

> nn
Multi-Layer Perceptron 

36 samples
 2 predictor

Pre-processing: re-scaling to [0, 1] (2) 
Resampling: Bootstrapped (25 reps) 
Summary of sample sizes: 36, 36, 36, 36, 36, 36, ... 
Resampling results across tuning parameters:

  size  RMSE       Rsquared 
  1     0.7879697  0.2098693
  3     0.7485212  0.2249331
  5     0.7571630  0.2246444

RMSE was used to select the optimal model using  the smallest value.
The final value used for the model was size = 3. 

pred <- predict(nn, mytsframe3[testSlices[[1]],])

str(pred)
 Named num [1:12] 0.0734 -0.0214 0.3264 0.0362 -0.1569 ...
 - attr(*, "names")= chr [1:12] "37" "38" "39" "40" ...

Here is a dput of my data for testing:

structure(list(diffREALBRENTSPOT = c(-0.523999999999999, -0.693, 
0.386999999999999, 0.453000000000001, -0.842000000000001, 0.369999999999999
), diffF1REALlag = c(0.48597655, -1.61485375, 0.60622805, -0.469351210000001, 
0.292303670000001, -0.44088176), diffF2REALlag = c(1.00948236, 
0.48597655, -1.61485375, 0.60622805, -0.469351210000001, 0.292303670000001
)), .Names = c("diffREALBRENTSPOT", "diffF1REALlag", "diffF2REALlag"
), row.names = c(NA, 6L), class = "data.frame")

Maybe `predict` defaults to giving you log-odds, and you need to transform them to 0-1? Did you try including `type = "response"` in the call to `predict`? — ulfelder, Nov 02 '16 at 19:13
@ulfelder including `type = "response"` produces an error of the form: `Error in predict.train(nn, mytsframe3[testSlices[[1]], ], type = "response") : type must be either "raw" or "prob"` — user111417, Nov 02 '16 at 19:25
@ulfelder, using `type = "prob"` produces `data frame with 0 columns and 12 rows` where the rows are just the row numbers of the timeslices — user111417, Nov 02 '16 at 19:29
Can you post data to allow us to replicate your results and test fixes? — ulfelder, Nov 02 '16 at 19:35
@ulfelder I just made an edit, adding some data so you can replicate. You'll need to adjust some of the parameters in createTimeSlices, and the RMSEs won't be the same. — user111417, Nov 02 '16 at 21:44

score 2 · Answer 1 · answered Nov 04 '16 at 17:32

The output of the network training tells me that it has been re-scaled to [0,1] but when I used the predict function, my predictions are not scaled to [0,1].

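The outcome is numeric, so you are fitting a regression model, not a classification model. The preProc option rescales only the predictors to [0, 1]; the "(2)" in the printed output is the number of predictors that were rescaled. It does not rescale the outcome, so the predictions come back on the outcome's original scale and are not confined to [0, 1].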
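
If predictions on a [0, 1] scale are really what you want, one option is to rescale the outcome yourself before fitting and to keep the training minimum and maximum so you can map predictions back. The following is only a sketch under stated assumptions: it uses rounded values from the dput above, an arbitrary 4/2 train/test split, two hypothetical helper functions (range_scale and range_unscale), and lm as a stand-in for the mlp model so that it runs with base R alone. With caret, the rescaled data frame could be passed to train in the same way.

```r
# Six rows from the question's dput output (values rounded).
df <- data.frame(
  diffREALBRENTSPOT = c(-0.524, -0.693, 0.387, 0.453, -0.842, 0.370),
  diffF1REALlag = c(0.48597655, -1.61485375, 0.60622805,
                    -0.46935121, 0.29230367, -0.44088176),
  diffF2REALlag = c(1.00948236, 0.48597655, -1.61485375,
                    0.60622805, -0.46935121, 0.29230367)
)

# Arbitrary split for illustration only (assumption, not from the question).
train_rows <- 1:4
test_rows  <- 5:6

# Hypothetical helpers: min-max scaling and its inverse.
range_scale   <- function(x, lo, hi) (x - lo) / (hi - lo)
range_unscale <- function(z, lo, hi) z * (hi - lo) + lo

# Use the training range only, so no test information leaks into the scaling.
y_lo <- min(df$diffREALBRENTSPOT[train_rows])
y_hi <- max(df$diffREALBRENTSPOT[train_rows])

scaled <- df
scaled$diffREALBRENTSPOT <- range_scale(df$diffREALBRENTSPOT, y_lo, y_hi)

# Stand-in model (lm instead of caret's "mlp"), fitted on the rescaled outcome.
fit <- lm(diffREALBRENTSPOT ~ diffF1REALlag + diffF2REALlag,
          data = scaled[train_rows, ])

# Predictions on the rescaled outcome's scale.
pred_scaled <- predict(fit, scaled[test_rows, ])

# Map the predictions back to the outcome's original units.
pred_original <- range_unscale(pred_scaled, y_lo, y_hi)

# The training outcome now spans exactly [0, 1].
print(range(scaled$diffREALBRENTSPOT[train_rows]))
```

Note that even with a rescaled outcome, predictions (and test-set outcomes) can fall outside [0, 1], because the scaling is defined by the training range and a regression model is not bounded.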
