
I'm using lasagne and nolearn's NeuralNet to approximate a simple sine function. Neural networks are proven universal approximators, so I wanted to demonstrate that experimentally with lasagne on a simple non-linear function. This is the code:

import lasagne
import numpy as np
from lasagne import layers
from lasagne.updates import nesterov_momentum
from nolearn.lasagne import NeuralNet
import matplotlib.pylab as pylab

x=np.linspace(0,1,1000)
y=np.sin(8*x)

# Fit the dimensions and scale
x=x.reshape(1000,1).astype(np.float32)
y=y.astype(np.float32)
y=(y-np.min(y))/np.max(y)

We get the following function:

pylab.plot(x,y)
pylab.show()

[Plot: the scaled sin function]

Now we create a simple neural net with 100 hidden units to approximate the function:

net = NeuralNet(
    layers=[
        ('input', layers.InputLayer),
        ('hidden', layers.DenseLayer),
        ('output', layers.DenseLayer),
    ],

    input_shape=(None, 1),

    hidden_num_units=100,
    hidden_nonlinearity=lasagne.nonlinearities.rectify,
    hidden_W=lasagne.init.GlorotUniform(),

    output_num_units=1,
    output_nonlinearity=None,
    output_W=lasagne.init.GlorotUniform(),

    update=nesterov_momentum,
    update_learning_rate=0.001,
    update_momentum=0.9,

    regression=True,
    max_epochs=500,
    verbose=0,
)

net=net.fit(x,y)

Now we run the trained net on the same x values to see what we get:

yp=net.predict(x)
pylab.plot(x,yp)  # matplotlib keeps previous plots on the same axes by default
pylab.plot(x,y)
pylab.show()

And this is what we get (plot of the approximated function): it's ridiculous! Increasing the number of hidden neurons or the training epochs changes nothing, and other types of nonlinearities just make it worse. In theory this should work much better. What am I missing?

Thank you very much.

Jorge del Val
  • There are tons of related questions already out there on SO. Have you read through them already? – Julien Aug 18 '16 at 08:38
  • Approximating a simple sin with a neural net seems not to be a good idea, because it's an analytic function for which you could build a Taylor or even faster-converging series. Neural networks are good at approximating data of unknown shape, where even regression approximation won't work well – thodnev Aug 18 '16 at 08:41
  • @JulienBernu There are tons of questions related to using lasagne on large-scale problems such as digit recognition. I have successfully implemented neural nets on similar problems with very good results, but somehow this doesn't seem to work on this simple problem, and there are no questions, to my knowledge, that solve this issue. – Jorge del Val Aug 18 '16 at 08:52
  • @thodnev Of course, and a Fourier series would indeed be perfect. My objective is not to approximate this function with another basis of functions, but to show the power of neural nets at approximating general functions of any shape. If this cannot work on this simple example, how could it work on a complex one? – Jorge del Val Aug 18 '16 at 08:56
  • @JorgedelVal Yes there are: /questions/1565115/approximating-function-with-neural-network /questions/13897316/approximating-the-sine-function-with-a-neural-network for example and many more... – Julien Aug 18 '16 at 09:15
  • @JulienBernu There are questions related to neural networks and function approximation, but not in the python/lasagne/nolearn environment. This one is the closest, but it is not answered (and doesn't use nolearn): /questions/34544320/lasagne-neural-network-does-not-converge-when-modeling-sine . I could go and implement a NN by myself, but my question is specifically about the lasagne environment, since it's the one that I use for larger problems. I don't know if there's a problem with the environment, with how I'm using it, or with deeper concepts. – Jorge del Val Aug 18 '16 at 10:27
  • @JorgedelVal I've been using lasagne and nolearn with no trouble on real-world problems; I doubt your issue is in the framework... it is more likely in your network architecture and so on... – Julien Aug 18 '16 at 14:05

1 Answer


I finally figured out what was happening. I'm posting my guess in case anyone runs into the same problem.

As is known, nolearn's NeuralNet uses mini-batch training. I don't know exactly how it chooses the batches, but it seems to me that it chooses them sequentially. In that case, if the data is not randomized, a batch is not statistically representative of the whole dataset (the data is not stationary along the sample index). In my case I made x=np.linspace(0,1,1000), so the statistical properties of each batch differ because the samples have a natural order.
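
If that hypothesis is right, simply shuffling the ordered data before fitting should already help. Here is a minimal sketch, assuming the x, y and net defined in the question (the permutation-based shuffle and the variable name perm are my illustration, not part of the original code):

import numpy as np

# Assumes x, y and net from the question. Reordering the samples means
# that sequentially drawn mini-batches cover the whole [0, 1] interval
# instead of a narrow slice of it.
perm = np.random.permutation(len(x))
net = net.fit(x[perm], y[perm])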

If you create the data randomly instead, i.e. x=np.random.uniform(size=[1000,1]), each batch will be statistically representative regardless of where it is taken from. Once you do this, you can increase the number of training epochs and improve convergence towards the true optimum. I don't know if my guess is correct, but at least it worked for me. Nevertheless, I will dig into it further.
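
For completeness, a minimal sketch of that variant, again assuming the net and imports from the question (the argsort step is my addition, needed only so the line plot is drawn left to right):

# Randomly drawn inputs instead of an ordered linspace.
x = np.random.uniform(size=[1000, 1]).astype(np.float32)
y = np.sin(8 * x[:, 0]).astype(np.float32)
y = (y - np.min(y)) / np.max(y)

net = net.fit(x, y)

# Sort only for plotting, so the curve is drawn in order of x.
order = np.argsort(x[:, 0])
pylab.plot(x[order], net.predict(x)[order])
pylab.plot(x[order], y[order])
pylab.show()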

Jorge del Val
  • The default `BatchIterator` has a way to automatically shuffle the dataset with each epoch. Pass something like this when you instantiate the `NeuralNet`: `from nolearn.lasagne import BatchIterator; NeuralNet(batch_iterator_train=BatchIterator(batch_size=128, shuffle=True))` – Daniel Nouri Aug 28 '16 at 02:17
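
Expanding that suggestion into a fuller sketch (the batch_size of 128 is just the example value from the comment; the remaining arguments mirror the constructor call already shown in the question, and x, y and the imports are assumed from there):

from nolearn.lasagne import BatchIterator

# Same architecture as in the question, but with per-epoch shuffling
# of the training batches, as suggested in the comment above.
net = NeuralNet(
    layers=[
        ('input', layers.InputLayer),
        ('hidden', layers.DenseLayer),
        ('output', layers.DenseLayer),
    ],
    input_shape=(None, 1),
    hidden_num_units=100,
    output_num_units=1,
    output_nonlinearity=None,
    update=nesterov_momentum,
    update_learning_rate=0.001,
    update_momentum=0.9,
    regression=True,
    max_epochs=500,
    batch_iterator_train=BatchIterator(batch_size=128, shuffle=True),
)

net.fit(x, y)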