
I have been playing with Lasagne for a while now on a binary classification problem using a Convolutional Neural Network. However, although I get okay(ish) training and validation loss, my validation and test accuracy are always constant (the network always predicts the same class).

I have come across this question from someone who has had the same problem as me with Lasagne. Their solution was to set `regression=True`, as they are using nolearn on top of Lasagne.

Does anyone know how to set this same variable within Lasagne itself (as I do not want to use nolearn)? Further to this, does anyone have an explanation as to why this needs to happen?

Djizeus
mjacuse
  • Is there a particular reason why you don't want to use nolearn ? – P. Camilleri Sep 22 '15 at 12:15
  • No particular reason, other than it just seems to add another layer on top of Lasagne which I'm not sure is necessary. Do you think it adds any more functionality? Also, I feel as though it may be harder to debug? – mjacuse Sep 23 '15 at 13:18
  • IMO nolearn adds nice functionalities, such as BatchIterator which I use a lot for preprocessing (randomly crop data, etc). I have not found debugging harder since I started using nolearn. But that's a personal point-of-view. – P. Camilleri Sep 24 '15 at 14:06

1 Answer


Looking at the code of the `NeuralNet` class from nolearn, the parameter `regression` is used in various places, but most of the time it affects how the output value and the loss are computed.

With `regression=False` (the default), the network outputs the class with the maximum probability and computes the loss with the categorical cross-entropy. With `regression=True`, the network outputs the probabilities of each class and computes the loss with the squared error on the output vector.
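In plain Lasagne, the corresponding choice is between `lasagne.objectives.categorical_crossentropy` and `lasagne.objectives.squared_error` when building your loss expression. As a minimal NumPy sketch of what the two losses compute (the function names here are illustrative, not nolearn's internals):

```python
import numpy as np

def categorical_crossentropy(probs, target_idx):
    # Loss used when regression=False: negative log-probability
    # assigned to the true class.
    return -np.log(probs[target_idx])

def squared_error(probs, target_onehot):
    # Loss used when regression=True: mean squared error between
    # the predicted probability vector and a one-hot target.
    return np.mean((probs - target_onehot) ** 2)

# Softmax output for a two-class problem; true class is 0
probs = np.array([0.6, 0.4])
onehot = np.array([1.0, 0.0])

print(categorical_crossentropy(probs, 0))   # -log(0.6) ~ 0.51
print(squared_error(probs, onehot))         # ((0.4)^2 + (0.4)^2) / 2 = 0.16
```

In Lasagne you would pass the network's output expression and the target to one of those objectives and take the mean before handing it to `lasagne.updates`.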

I am not an expert in deep learning and CNNs, but the reason this may have worked is as follows: with `regression=False`, if the error gradient is small, applying small changes to the network parameters may not change the predicted class or the associated loss, which can lead the algorithm to "think" it has converged. If instead you look at the class probabilities (`regression=True`), small parameter changes will affect the probabilities and the resulting mean squared error, so the network continues down this path, which may eventually change the predictions.
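This argument can be illustrated with a toy example (the numbers are made up, not taken from the question's network): a small nudge to the logits leaves the hard argmax prediction unchanged, while the squared error on the probability vector still moves.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a logit vector
    e = np.exp(z - z.max())
    return e / e.sum()

# Logits before and after a small parameter update
z_before = np.array([2.0, 1.0])
z_after  = np.array([2.0, 1.1])

p_before, p_after = softmax(z_before), softmax(z_after)
onehot = np.array([1.0, 0.0])  # true class is 0

# The hard prediction (argmax) is identical in both cases...
print(np.argmax(p_before), np.argmax(p_after))  # 0 0

# ...but the squared error on the probabilities still changes,
# so the gradient signal does not flatline.
print(np.mean((p_before - onehot) ** 2))
print(np.mean((p_after - onehot) ** 2))
```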

This is just a guess; it is hard to tell without seeing the code and the dataset.

Djizeus