I'm trying to put together a simple 3-layer neural network in Lasagne: 30 input neurons, a 10-neuron hidden layer, and a 1-neuron output layer. I'm using the binary_crossentropy loss function and the sigmoid nonlinearity. I want to put L1 regularization on the edges entering the output layer and L2 regularization on the edges from the input layer to the hidden layer. I'm using code very close to the example on the regularization page of the Lasagne documentation.
The L1 regularization seems to work fine, but whenever I add the L2 regularization's penalty term to the loss function, it returns nan. Everything works fine when I remove the term l2_penalty * l2_reg_param from the last line below. Additionally, I'm able to perform L1 regularization on the hidden layer l_hid1 without any issues.
This is my first foray into Theano and Lasagne, so I suspect the error is something pretty simple that I just don't know enough to see.
Here's the net setup code (my symbolic variables and regularization strengths are shown as simple placeholders; the real values differ):
import theano.tensor as T
import lasagne
from lasagne.regularization import regularize_layer_params, l1, l2
# Placeholders standing in for my actual symbolic variables and reg strengths
input_var = T.tensor4('inputs')
target_var = T.matrix('targets')
l1_reg_param, l2_reg_param = 1e-4, 1e-4
# 30 inputs -> 10 sigmoid hidden units -> 1 sigmoid output unit
l_in = lasagne.layers.InputLayer(shape=(942, 1, 1, 30), input_var=input_var)
l_hid1 = lasagne.layers.DenseLayer(l_in, num_units=10, nonlinearity=lasagne.nonlinearities.sigmoid, W=lasagne.init.GlorotUniform())
network = lasagne.layers.DenseLayer(l_hid1, num_units=1, nonlinearity=lasagne.nonlinearities.sigmoid)
prediction = lasagne.layers.get_output(network)
# L2 penalty on the input->hidden weights, L1 penalty on the hidden->output weights
l2_penalty = regularize_layer_params(l_hid1, l2)
l1_penalty = regularize_layer_params(network, l1)
loss = lasagne.objectives.binary_crossentropy(prediction, target_var)
loss = loss.mean()
loss = loss + l1_penalty * l1_reg_param + l2_penalty * l2_reg_param
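In case it helps narrow things down, this is a quick check I can run to see whether the penalty expressions themselves already evaluate to nan before they're added to the loss (just a sketch; the penalties depend only on the layers' shared weight variables, so the compiled function takes no inputs):

import numpy as np
import theano

# Evaluate each penalty term on its own, independent of the data/loss.
check_penalties = theano.function([], [l1_penalty, l2_penalty])
l1_val, l2_val = check_penalties()
print("l1 penalty:", l1_val, "l2 penalty:", l2_val)

# Also inspect the raw hidden-layer weights in numpy, in case the
# values themselves already contain nan/inf.
W_hid = l_hid1.W.get_value()
print("W finite:", np.isfinite(W_hid).all(), "sum of squares:", (W_hid ** 2).sum())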
Any help would be greatly appreciated. Thanks!!