The nnet method you've specified uses iterative optimisation (the BFGS method from the optim() function in base R) to estimate the model's parameters [1]. The optimisation stops when it converges, or when maxit iterations have been performed. If maxit is set too low, the optimisation will stop before the model has converged.
The BFGS method is not guaranteed to converge for all optimisation problems, but it is generally regarded as a good optimisation method. The optimisation surface is data-dependent, so I won't comment on the number or nature of minima in your case. You may have hit a local minimum at 300 iterations, but there is some stochasticity in nnet() (the starting weights are random), so subsequent runs may differ even if all nnet() parameters are identical. Note the difference between the two nnet() runs below with identical parameters: 4.115351 versus 2.112400 at 100 iterations.
library(nnet)
data(iris)
set.seed(42)
nnet(Species ~ ., data=iris, size=10)
# weights: 83
initial value 262.654300
iter 10 value 72.296066
iter 20 value 10.287034
iter 30 value 6.341659
iter 40 value 5.814649
iter 50 value 5.187836
iter 60 value 4.199448
iter 70 value 4.150082
iter 80 value 4.122058
iter 90 value 4.117969
iter 100 value 4.115351
final value 4.115351
stopped after 100 iterations
a 4-10-3 network with 83 weights
inputs: Sepal.Length Sepal.Width Petal.Length Petal.Width
output(s): Species
options were - softmax modelling
# Deliberately not setting seed value before second nnet run
nnet(Species ~ ., data=iris, size=10)
# weights: 83
initial value 201.869745
iter 10 value 67.631035
iter 20 value 11.863275
iter 30 value 6.542750
iter 40 value 5.758701
iter 50 value 5.355368
iter 60 value 3.970210
iter 70 value 2.835171
iter 80 value 2.414463
iter 90 value 2.226375
iter 100 value 2.112400
final value 2.112400
stopped after 100 iterations
a 4-10-3 network with 83 weights
inputs: Sepal.Length Sepal.Width Petal.Length Petal.Width
output(s): Species
options were - softmax modelling
Also note that neither of the nnet() runs above has converged. Here is an example of a converged model:
set.seed(42)
nnet(Species ~ ., data=iris, size=10, maxit=500)
# weights: 83
initial value 262.654300
iter 10 value 72.296066
iter 20 value 10.287034
# I've truncated the output here
iter 360 value 0.000277
iter 370 value 0.000117
final value 0.000097
converged
a 4-10-3 network with 83 weights
inputs: Sepal.Length Sepal.Width Petal.Length Petal.Width
output(s): Species
options were - softmax modelling
Note the "converged" line in the output above.
Unfortunately, it's not possible to tune the maxit parameter using the tune_grid option to the caret train function. It's probably reasonable to set a high value for maxit in the train call, but I won't recommend a specific value because, again, it's data-dependent. For the iris data I'd try a value an order of magnitude, or two, higher than the iteration count at which the model above converged (roughly 375).
Alternatively, you could loop over values for maxit:
num.it <- 500 # max number of training iterations
fit.dat <- matrix(ncol=1, nrow=num.it) # fitting criterion values
for(i in 1:num.it) {
# to monitor progress
cat(i,'\n')
flush.console()
# to ensure same set of random starting weights are used each time
set.seed(42)
# temporary nnet model
mod.tmp <- nnet(Species ~ ., data=iris, size=10, maxit=i, trace=FALSE)
# append fitting criterion value
fit.dat[i,] <- mod.tmp$value
}
# iteration count that gave the lowest fitting criterion value
which.min(fit.dat)
[1] 375
fit.dat[which.min(fit.dat)]
[1] 9.654717e-05
# plot fitting values
plot(fit.dat, type='l')
The above loop tunes maxit but doesn't take over-fitting into account. A better approach would be to use the caret train() function with your current tune_grid and cross-validation settings; you'd then check the caret train() output for convergence.
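For example, although maxit isn't a tuning parameter for caret's "nnet" method, extra arguments to train() are passed through to nnet(), so a high iteration limit can be set that way. A minimal sketch with cross-validation (the size and decay grid values below are illustrative, not recommendations):

```r
library(caret)
library(nnet)
data(iris)

set.seed(42)
fit <- train(
  Species ~ ., data = iris,
  method    = "nnet",
  trControl = trainControl(method = "cv", number = 5),
  tuneGrid  = expand.grid(size = c(5, 10), decay = c(0, 0.1)),
  maxit     = 2000,   # passed through to nnet(); not tuned by caret
  trace     = FALSE
)
fit$bestTune
```

fit$results then reports cross-validated accuracy for each size/decay combination, so over-fitting is handled by resampling rather than by the training-set fitting criterion.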
Also note that caret and other packages can have surprising reproducibility issues with set.seed(); see "R: set.seed() results don't match if caret package loaded".
Finally, it's unlikely to help, but it may be interesting to look at the seeds option to the caret trainControl() function. As the docs say, it's probably only useful when running parallel jobs.
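Per the trainControl() docs, seeds is a list with one integer vector per resample (each of length equal to the number of tuning-grid rows), plus a final single seed for fitting the last model. A sketch assuming 5-fold cross-validation and a 4-row tuning grid (adjust the lengths to your own settings):

```r
library(caret)

set.seed(1)
# 5 resamples, 4 tuning combinations each, plus one final single seed
seeds <- c(lapply(1:5, function(i) sample.int(10000, 4)),
           list(sample.int(10000, 1)))
ctrl <- trainControl(method = "cv", number = 5, seeds = seeds)
```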
[1] https://cran.r-project.org/web/packages/nnet/nnet.pdf