I am trying to implement Neural Networks for classifcation having 5 hidden layers, and with softmax cross entropy in the output layer. The implementation is in JAVA.
For optimization, I have used MiniBatch gradient descent(Batch size=100, learning rate = 0.01)
However, after a couple of iterations, the weights become "NaN" and the predicted values turn out to be the same for every testcase.
Unable to debug the source of this error. Here is the github link to the code(with the test/training file.) https://github.com/ahana204/NeuralNetworks