I want to train a neural network with this configuration:
input : a binary (logical) image -> 115*115 + 1 nodes # +1 for bias
output: detected letter -> 24 nodes
hidden: 6 nodes (my guess)
activation function -> tanh()
At the beginning, I initialize the weights with random numbers in [-1, 1].
After training the NN, I get the same result for every sample.
I found the problem: with so many inputs, the pre-activations in the first layer are huge, and applying tanh() to them yields only 1 and -1 (the hidden layer is saturated).
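To illustrate the saturation (a minimal sketch; the random binary image just stands in for one of my samples):

import numpy as np

n_in = 115 * 115 + 1                              # 13226 inputs, including bias
x = np.random.randint(0, 2, n_in).astype(float)   # stand-in "logical image"
W1 = np.random.uniform(-1, 1, (6, n_in))          # hidden weights in [-1, 1]

a1 = W1.dot(x)        # pre-activations on the order of tens (std ~ 47 here)
print(np.tanh(a1))    # essentially +/-1 everywhere: the layer is saturated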
I tried these solutions, but none of them helped:
- Scale the weights to [-0.1, 0.1] (see the fan-in-scaled sketch after this list)
- Replace tanh(x) with tanh(x/N) for N = 1000, 10000
- Scale the input data from [0, 1] to [-1, 1]
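For comparison, this is what fan-in-scaled initialization would look like with my layer sizes (a sketch of the standard Glorot/Xavier uniform rule; glorot_uniform is just a helper name I made up):

import numpy as np

def glorot_uniform(n_out, n_in):
    # standard Glorot/Xavier uniform limit: sqrt(6 / (fan_in + fan_out))
    limit = np.sqrt(6.0 / (n_in + n_out))
    return np.random.uniform(-limit, limit, (n_out, n_in))

W1 = glorot_uniform(6, 115 * 115 + 1)   # limit ~ 0.021, far below my 0.1
W2 = glorot_uniform(24, 6)              # limit ~ 0.45

This keeps the variance of the pre-activations near 1, so tanh stays in its linear range instead of pinning to +/-1.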
Any tips or relevant experience would be appreciated :)
Extra:
I want to use my own code, and for testing I use this function (the same as my Activate() function):
import numpy as np

def test(self, Inp):
    Inp = np.reshape(Inp, (self.m, self.i))   # (m samples, i inputs)
    A1 = Inp.dot(self.W1.transpose())         # hidden pre-activations
    Z1 = np.tanh(A1)                          # hidden activations
    A2 = Z1.dot(self.W2.transpose())          # output pre-activations
    Y = np.tanh(A2)                           # network output
    return Y
Given that the weights are random, why is the output of the test function always the same?
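Here is a self-contained reproduction of the effect (a minimal sketch: the NN class is a stripped-down stand-in for my real class, with the same test() logic):

import numpy as np

class NN:
    def __init__(self, n_in, n_hidden, n_out):
        self.m, self.i = 1, n_in                               # one sample per call
        self.W1 = np.random.uniform(-1, 1, (n_hidden, n_in))   # hidden weights
        self.W2 = np.random.uniform(-1, 1, (n_out, n_hidden))  # output weights

    def test(self, Inp):
        Inp = np.reshape(Inp, (self.m, self.i))
        Z1 = np.tanh(Inp.dot(self.W1.transpose()))   # saturates to ~ +/-1
        return np.tanh(Z1.dot(self.W2.transpose()))

net = NN(115 * 115 + 1, 6, 24)
for _ in range(3):
    img = np.random.randint(0, 2, 115 * 115 + 1).astype(float)
    # Y depends on the input only through the 6 signs in Z1,
    # so at most 2**6 = 64 distinct outputs are possible
    print(net.test(img))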