
I'm trying to fit some data with the nnet package in R. After I train the neural network, I want to predict some values, but if I re-train the net and predict again, I get significantly different values.

Here's reproducible code to copy/paste and see what I'm talking about.

# loading required package nnet

if(!require(nnet)){
  install.packages("nnet")
  library(nnet)
}

# reading data

data <- "year  GDP  n.households    GDP.norm    n.households.norm
1950    300.2   48902   -0.959913402290733  -1.64747365536208
1951    347.3   49673   -0.950771933085093  -1.61347968613569
1952    367.7   50474   -0.946812570626599  -1.57816299437132
1953    389.7   51435   -0.942542669936066  -1.53579178240432
1954    391.1   52799   -0.942270948983032  -1.47565199767698
1955    426.2   53557   -0.935458516517682  -1.44223120821706
1956    450.1   54764   -0.930819851676604  -1.38901367143853
1957    474.9   55270   -0.926006509080003  -1.36670375129774
1958    482 56149   -0.924628495675331  -1.32794798093459
1959    522.5   57436   -0.91676799667685   -1.27120318405475
1960    543.3   58406   -0.912730999660347  -1.22843515532636
1961    563.3   59236   -0.908849271759863  -1.19183983177526
1962    605.1   60813   -0.90073646044785   -1.12230871702817
1963    638.6   62214   -0.894234566214539  -1.06053757450397
1964    685.8   63401   -0.885073688369396  -1.00820185275077
1965    743.7   64778   -0.873836086097494  -0.947488888256956
1966    815 66676   -0.859997726132268  -0.863804642353359
1967    861.7   68251   -0.850933891484637  -0.794361709108803
1968    942.5   69859   -0.835251710766681  -0.723463781072457
1969    1019.9  71120   -0.820229423791807  -0.667865343725547
1970    1075.9  72867   -0.80936058567045   -0.590838801263173
1971    1167.8  74142   -0.791524045967725  -0.534623093398533
1972    1282.4  76030   -0.76928174509795   -0.451379755007599
1973    1428.5  77330   -0.740925722784913  -0.394061778361299
1974    1548.8  79108   -0.7175771294635    -0.315668422609668
1975    1688.9  80776   -0.690385625520608  -0.242125049497339
1976    1877.6  82368   -0.653761522779538  -0.171932573481255
1977    2086    83527   -0.613313918056492  -0.120831392763515
1978    2356.6  83918   -0.56079413956294   -0.103591909018359
1979    2632.1  85407   -0.507323337733769  -0.0379407803827123
1980    2862.5  85290   -0.46260583232019   -0.0430993982808793
1981    3210.9  86789   -0.394986132293754  0.0229926378674309
1982    3345    88458   -0.368959146721007  0.0965801017310265
1983    3638.1  89479   -0.31207242433941   0.141596758774005
1984    4040.7  91066   -0.233933241702662  0.211568781033757
1985    4346.7  91124   -0.174542804825252  0.214126044607207
1986    4590.1  92830   -0.127302176276358  0.289344866267659
1987    4870.2  93347   -0.0729385770300762 0.31213978467238
1988    5252.6  94312   0.00128006042718324 0.354687359644441
1989    5657.7  95669   0.0799044590514921  0.414518509112925
1990    5979.6  96391   0.142380869609787   0.446352031527254
1991    6174    96426   0.180111264802494   0.447895207821578
1992    6539.3  97107   0.251011024904839   0.477921009433985
1993    6878.7  98990   0.316883947376057   0.560943894068587
1994    7308.8  99627   0.400360505875972   0.589029702625274
1995    7664.1  101018  0.469319402028075   0.650359937636815
1996    7664.1  102528  0.469319402028075   0.716936972049055
1997    8608.5  103874  0.652614593488942   0.776283123253609
1998    9089.2  104705  0.745911923577082   0.812922537555974
1999    9660.6  108209  0.856812889693918   0.967416529993385
2000    10284.8 NA  0.977961617468032   NA
2001    10621.8 NA  1.04336873259119    NA
2002    10977.5 NA  1.1124052633013 NA
2003    11510.7 NA  1.21589212912822    NA
2004    12274.9 NA  1.36421295220572    NA
2005    13093.7 NA  1.52313089245155    NA
2006    13855.9 NA  1.671063542739  NA
2007    14477.6 NA  1.79172705452556    NA
2008    14718.6 NA  1.83850187572639    NA
2009    14418.6 NA  1.78027595721913    NA
2010    14964.4 NA  1.88620831162334    NA
2011    15517.9 NA  1.99363513126925    NA
2012    16163.2 NA  2.11887908197837    NA
2013    16768.1 NA  2.23628194232852    NA"

df <- read.table(text=data, header=TRUE)

# data for training the net 

input <- data.frame(df[1:50, 4])
output <- data.frame(df[1:50, 5])

# data for predicting new values

new.data <- data.frame(df[, 4])

*************************************************************

# training the neural network

net <- nnet(x=input, y=output, size=3, linout=T)

# predicting

fitted <- predict(net, new.data)

# reconverting to have number of households

house.fitted <- sd(df$n.households, na.rm=T) * fitted + mean(df$n.households, na.rm=T)

# plot of real values against predicted values

plot(df$n.households)
lines(house.fitted, col="blue")

If you re-run the code below the line of asterisks, you can see how the predicted values differ significantly on every run. Here are two plots showing what I mean:

[Plot 1 and Plot 2: linked images omitted]

I tried changing the number of hidden neurons and the maximum number of iterations, but I get the same behaviour.

I'm new to neural networks in R, and to neural networks in general, so I don't know if I'm missing something in the code or in my general approach to the problem. I know that ANNs can get stuck in local minima, but I don't think they should predict such dissimilar values every time.

Please help me understand what I'm doing wrong, because this is just one of many models I'd like to build, and I would really like to understand ANNs.

– Makondo

1 Answer


As you noted correctly, the net can get stuck in local minima. Due to the random initialization of the weights, the final results can differ a lot. One way of minimizing the generalization error is early stopping (i.e. different parameter values for maxit, abstol or reltol). Another way, supported by nnet, is weight decay. With, for example, decay = 0.001 and maxit = 1000 (practically no stopping before convergence), the model already gives more stable results.
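As a minimal sketch of that suggestion (decay and maxit are standard nnet arguments, and input, output and new.data are the data frames from the question):

# weight decay plus a high iteration limit; decay = 0.001 penalizes
# large weights and makes the fit less sensitive to the random start
net <- nnet(x = input, y = output, size = 3, linout = TRUE,
            decay = 0.001, maxit = 1000)
fitted <- predict(net, new.data)

# note: calling set.seed() before nnet() makes a single run reproducible,
# but it does not reduce the variance between differently seeded runs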

To get even more stable results, you could consider using the avNNet model from the caret package. It trains a number (repeats) of neural nets and then averages their results. Example:

# rebuild the training and prediction data with matching column names
input <- data.frame(df[1:50, 4])
colnames(input) <- "input"
output <- data.frame(df[1:50, 5])
colnames(output) <- "output"
new.data <- data.frame(df[, 4])
colnames(new.data) <- "input"

library(caret)

# method = "none" skips resampling: fit once with the parameters in tuneGrid
myTrainControl <- trainControl(method = "none")

# train 15 nets with weight decay and average their predictions
avNNet <- train(y = output$output,
                x = input,
                tuneGrid = expand.grid(.size = 3,
                                       .decay = 0.001,
                                       .bag = FALSE),
                method = "avNNet", repeats = 15,
                maxit = 1000, linout = TRUE,
                trControl = myTrainControl)

fitted <- predict(avNNet, new.data)

# convert the normalized predictions back to numbers of households
house.fitted <- sd(df$n.households, na.rm = TRUE) * fitted + mean(df$n.households, na.rm = TRUE)

plot(df$n.households)
lines(house.fitted, col = "blue")
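
A note on the design choice: with .bag = FALSE, all 15 repeats are trained on the same data and differ only in their random weight initialization, so averaging their predictions smooths out exactly the run-to-run variance described in the question.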
– thie1e
  • Thank you for your help. I now get more stable predictions, but when I try to fit a slightly more complicated model the net gets "stuck" and doesn't fit the data well. [Here](http://i.imgur.com/SKn6o17.png?1) are [two](http://i.imgur.com/cXN9Pt5.png?1) examples. I played around with "decay", "size" and "maxit" but it doesn't change anything. Any idea what is happening now? – Makondo Mar 12 '15 at 17:01
  • Does 'stuck' mean a bad out-of-sample fit here? I thought this was an example application to demonstrate out-of-sample variance. A nnet might not be the best choice for modelling this kind of data (a nonstationary time series). IMO a simpler model could be appropriate here, or if it has to be a nnet, the time series should probably be differenced to make them stationary. If they are cointegrated, a VECM might be a good choice. These questions are off topic on SO and should go to Cross Validated, though. – thie1e Mar 12 '15 at 17:41
  • Thank you. Yes, I think an ANN might not be the best model here. Anyway, I think you solved my initial question and helped me understand a little better how to work with ANNs in R. – Makondo Mar 12 '15 at 17:54