I'm training a fully connected neural network to classify the MNIST dataset. The index of the output-layer neuron with the highest activation defines the network's output (a digit from 0 to 9).
I would like to use the tanh() activation function (just for learning purposes).
What is the correct way to represent an image label as a vector (for computing the error vector that will be backpropagated)?
For a sigmoid() activation, this vector could be a vector of zeros with a single 1 in the position of the correct digit. Does that mean that for tanh() it should be a vector of -1s instead of 0s (based on the range of the function)? What is the general guidance?
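
To make the two candidate encodings concrete, here is a minimal sketch of what I mean. It assumes NumPy; the helper name encode_label and its low/high parameters are hypothetical names chosen for illustration, and the squared-error style difference at the end is only an assumption about how the error vector would be formed.

```python
import numpy as np

NUM_CLASSES = 10  # digits 0 to 9


def encode_label(digit, low=0.0, high=1.0, num_classes=NUM_CLASSES):
    """Return a target vector filled with `low`, with `high` at index `digit`."""
    target = np.full(num_classes, low, dtype=np.float64)
    target[digit] = high
    return target


# Sigmoid-style target: zeros with a single 1 at the label position.
sigmoid_target = encode_label(3)

# Candidate tanh-style target: -1s with a single +1 at the label position.
tanh_target = encode_label(3, low=-1.0)

# Made-up output-layer activations, only to show the shape of the error vector
# (assumes the error is simply output minus target).
output = np.tanh(np.linspace(-1.0, 1.0, NUM_CLASSES))
error = output - tanh_target
```

The only difference between the two targets is the "off" value (0 versus -1), which is exactly the choice I am asking about.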