
I am trying to implement a neural network for classification with 5 hidden layers and softmax cross-entropy in the output layer. The implementation is in Java.

For optimization, I am using mini-batch gradient descent (batch size = 100, learning rate = 0.01).
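
For reference, the generic mini-batch update step looks like this (a minimal sketch with illustrative names; the exact code is in the repo linked below):

```java
// Apply one mini-batch gradient descent step: average the gradient
// accumulated over the batch, then move each weight against it.
static void applyUpdate(double[][] weights, double[][] gradSum,
                        double learningRate, int batchSize) {
    for (int i = 0; i < weights.length; i++) {
        for (int j = 0; j < weights[i].length; j++) {
            weights[i][j] -= learningRate * gradSum[i][j] / batchSize;
        }
    }
}
```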

However, after a couple of iterations, the weights become NaN and the predicted values turn out to be the same for every test case.

I am unable to debug the source of this error. Here is the GitHub link to the code (with the test/training files): https://github.com/ahana204/NeuralNetworks


2 Answers


In my case, I forgot to normalize the training data (by subtracting the mean). This was causing the denominator of my softmax equation to become 0. Hope this helps.
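
As a sketch (helper names are mine, not from the linked repo): per-feature mean subtraction, plus the standard max-subtraction trick that keeps the softmax denominator from underflowing to 0 even when inputs are poorly scaled:

```java
// Subtract the per-feature mean from every training example so the
// activations stay in a range where exp() does not underflow.
static void meanNormalize(double[][] data) {
    int n = data.length, d = data[0].length;
    for (int j = 0; j < d; j++) {
        double mean = 0.0;
        for (double[] row : data) mean += row[j];
        mean /= n;
        for (double[] row : data) row[j] -= mean;
    }
}

// Complementary safeguard: a numerically stable softmax that subtracts
// the max logit before exponentiating, so the denominator is always >= 1.
static double[] softmax(double[] logits) {
    double max = Double.NEGATIVE_INFINITY;
    for (double z : logits) max = Math.max(max, z);
    double sum = 0.0;
    double[] out = new double[logits.length];
    for (int k = 0; k < logits.length; k++) {
        out[k] = Math.exp(logits[k] - max);
        sum += out[k];
    }
    for (int k = 0; k < logits.length; k++) out[k] /= sum;
    return out;
}
```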

Mohit Chawla

Assuming the code you implemented is correct, one reason would be a large learning rate. If the learning rate is too large, the weights may not converge and can grow very small or very large, which shows up as NaN. Try lowering the learning rate to see if anything changes.
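
A quick way to catch this early is to check the weights after each mini-batch update and stop as soon as they diverge (a sketch to hook into your training loop):

```java
// Debugging aid: returns true if any weight has diverged. If this
// triggers after a few updates, the step size was likely too large.
static boolean hasNaNOrInf(double[][] weights) {
    for (double[] row : weights)
        for (double w : row)
            if (Double.isNaN(w) || Double.isInfinite(w)) return true;
    return false;
}
```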

Seljuk Gulcan