So here is the scenario:
I ran the following code, which creates a neural network (using the neuralnet package in R) to approximate the function f(x) = x^2:
library(neuralnet);
rm(list=ls());
set.seed(2016);
# Prepare Training Data: 50 points of f(x) = x^2 on [-2, 2]
attribute<-as.data.frame(sample(seq(-2,2,length=50), 50, replace=FALSE));
response<-attribute^2;
data <- cbind(attribute,response);
colnames(data)<- c("attribute","response");
# Create DNN with two hidden layers of 3 neurons each
fit<-neuralnet(response~attribute,data=data,hidden = c(3,3),threshold=0.01);
fit$result.matrix;
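(In case it helps to reproduce my numbers: the step counts I quote below come straight out of `result.matrix`, whose rows include `error`, `reached.threshold` and `steps`.)

```r
# Assuming `fit` from the run above:
fit$result.matrix["steps", 1]   # number of training steps until convergence
```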
This worked fine and converged in 3191 steps. Then I made a small change: I changed the function being approximated. Instead of the quadratic function I used a very simple linear function, f(x) = 2x.
This worked fine too, so I tweaked the coefficient of x and conducted multiple runs, e.g.
f(x) = 2x
f(x) = 3x
.
.
f(x) = 19x
Up to this point everything worked. But I noticed that the number of steps required to converge increased dramatically from 2x to 19x; for 19x, for example, it took an astonishing 84099 steps. It seems weird that the DNN needs so many steps to converge for a mere linear function, when the quadratic function f(x) = x^2 took only 3191 steps.
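For reference, the sweep over coefficients can be scripted in one loop (a sketch of what I ran; the harness and variable names are my own):

```r
library(neuralnet)

set.seed(2016)
x <- sample(seq(-2, 2, length = 50), 50, replace = FALSE)

# Fit f(x) = k*x for k = 2..19 and record how many steps each run needed
steps <- sapply(2:19, function(k) {
  data <- data.frame(attribute = x, response = k * x)
  fit <- neuralnet(response ~ attribute, data = data,
                   hidden = c(3, 3), threshold = 0.01)
  fit$result.matrix["steps", 1]
})
names(steps) <- paste0(2:19, "x")
steps
```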
So when I changed the function to f(x) = 20x, it presumably needed even more steps, and I got the following warning:
> set.seed(2016);
> rm(list=ls());
> # Prepare Training Data
> attribute<-as.data.frame(sample(seq(-2,2,length=50), 50 , replace=FALSE ),ncol=1);
> response<-attribute*20;
> data <- cbind(attribute,response);
> colnames(data)<- c("attribute","response");
>
> # Create DNN
> fit<-neuralnet(response~attribute,data=data,hidden = c(3,3),threshold=0.01);
Warning message:
algorithm did not converge in 1 of 1 repetition(s) within the stepmax
> fit$result.matrix;
So I guess I can increase the default stepmax parameter to allow more steps. But the real question is: why should it need so many steps just for a simple linear function like this?
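For concreteness, this is the tweak I have in mind (a sketch; neuralnet's default stepmax is 1e+05, and the larger limit here is just an arbitrary choice of mine):

```r
library(neuralnet)

rm(list = ls())
set.seed(2016)

attribute <- as.data.frame(sample(seq(-2, 2, length = 50), 50, replace = FALSE))
response <- attribute * 20
data <- cbind(attribute, response)
colnames(data) <- c("attribute", "response")

# Same network as before, but allow up to 1e6 steps instead of the default 1e5
fit <- neuralnet(response ~ attribute, data = data,
                 hidden = c(3, 3), threshold = 0.01,
                 stepmax = 1e6)
fit$result.matrix["steps", 1]
```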