Since the implementation of the algorithm is correct (I checked it hundreds of times), I think I have misunderstood some theoretical facts.
I suppose that, given that j refers to the hidden-layer side and k to the output layer, ∂E/∂w_jk is calculated by doing:
outputNeuron[k].errInfo = (target[k] - outputNeuron[k].out) * derivative_of_sigmoid(outputNeuron[k].in);
∂E/∂w_jk = outputNeuron[k].errInfo * hiddenNeuron[j].out;
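For concreteness, this is roughly what that step looks like in C (the Neuron struct and derivative_of_sigmoid are illustrative names for this sketch, not necessarily my exact code):

    #include <math.h>

    /* Illustrative types: field names mirror the snippets above. */
    typedef struct { double in, out, errInfo; } Neuron;

    static double derivative_of_sigmoid(double x)
    {
        double s = 1.0 / (1.0 + exp(-x));
        return s * (1.0 - s);
    }

    /* errInfo and the gradient term for every output unit k and hidden
       unit j; grad is a flat p-by-m array indexed as grad[j*m + k].
       Note: with E = 1/2 * sum((target - out)^2), this product is
       actually -dE/dw_jk; whichever sign convention is used, it has to
       match what the RPROP sign test expects. */
    void output_gradients(Neuron *outputNeuron, const Neuron *hiddenNeuron,
                          const double *target, double *grad, int p, int m)
    {
        for (int k = 0; k < m; k++) {
            outputNeuron[k].errInfo = (target[k] - outputNeuron[k].out)
                                    * derivative_of_sigmoid(outputNeuron[k].in);
            for (int j = 0; j < p; j++)
                grad[j*m + k] = outputNeuron[k].errInfo * hiddenNeuron[j].out;
        }
    }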
For ∂E/∂w_ij, where i refers to the input layer and j to the hidden layer, it's a bit longer.
Each hidden unit (Z_j, j = 1, ..., p) sums its delta inputs from the units in the output layer:
errorInfo_in[j] = Σ (k = 1 to m) outputNeuron[k].errInfo * w[j][k]
where m is the number of output units.
Then I calculate the error info of the hidden unit:
hiddenNeuron[j].errInfo = errorInfo_in[j] * derivative_of_sigmoid(hiddenNeuron[j].in);
Finally, ∂E/∂w_ij is:
hiddenNeuron[j].errInfo * x[i]
(where x[i] is the output of an input unit).
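Spelled out, still with the illustrative names from the sketch above:

    /* Hidden-layer errInfo and gradient term for one pattern.
       w is the hidden-to-output weight matrix flattened as w[j*m + k];
       x is the input vector of length n; grad is flat, grad[i*p + j]. */
    void hidden_gradients(Neuron *hiddenNeuron, const Neuron *outputNeuron,
                          const double *w, const double *x, double *grad,
                          int n, int p, int m)
    {
        for (int j = 0; j < p; j++) {
            double errorInfo_in = 0.0;              /* sum of delta inputs */
            for (int k = 0; k < m; k++)
                errorInfo_in += outputNeuron[k].errInfo * w[j*m + k];
            hiddenNeuron[j].errInfo = errorInfo_in
                                    * derivative_of_sigmoid(hiddenNeuron[j].in);
            for (int i = 0; i < n; i++)
                grad[i*p + j] = hiddenNeuron[j].errInfo * x[i];
        }
    }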
I apply RPROP as described here: http://www.inf.fu-berlin.de/lehre/WS06/Musterererkennung/Paper/rprop.pdf
to all the weights, both between the input and hidden layers and between the hidden and output layers.
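My understanding of the per-weight update from that paper is roughly the following sketch (eta+ = 1.2, eta- = 0.5, delta0 = 0.1 and delta_max = 50 are the values suggested there; delta_min = 1e-6 and the variable names are my own choices):

    #include <math.h>

    #define ETA_PLUS   1.2
    #define ETA_MINUS  0.5
    #define DELTA_MAX  50.0
    #define DELTA_MIN  1e-6

    static double sgn(double v) { return (v > 0.0) - (v < 0.0); }

    /* One RPROP update for a single weight.  grad holds dE/dw for the
       current epoch; *grad_prev, *delta (init 0.1) and *dw (init 0)
       persist per weight between epochs. */
    void rprop_update(double *w, double grad,
                      double *grad_prev, double *delta, double *dw)
    {
        if (grad * (*grad_prev) > 0.0) {         /* same sign: grow step */
            *delta = fmin(*delta * ETA_PLUS, DELTA_MAX);
            *dw = -sgn(grad) * (*delta);
            *w += *dw;
            *grad_prev = grad;
        } else if (grad * (*grad_prev) < 0.0) {  /* sign flip: shrink step */
            *delta = fmax(*delta * ETA_MINUS, DELTA_MIN);
            *w -= *dw;                           /* revert the previous step */
            *grad_prev = 0.0;                    /* skip the next sign test */
        } else {                                 /* one factor is zero */
            *dw = -sgn(grad) * (*delta);
            *w += *dw;
            *grad_prev = grad;
        }
    }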
I'm trying to recognize letters made of '#' and '-', on a 9 (rows) × 7 (columns) grid.
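The input encoding I have in mind looks something like this (the exact 1.0/0.0 mapping is an assumption for the sketch; a bipolar -1/+1 encoding would be another common choice with sigmoid-type units):

    /* Encode one 9x7 character grid into a 63-element input vector. */
    void encode_letter(const char grid[9][7], double x[63])
    {
        for (int r = 0; r < 9; r++)
            for (int c = 0; c < 7; c++)
                x[r*7 + c] = (grid[r][c] == '#') ? 1.0 : 0.0;
    }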
The MSE just gets stuck at 172 after a few epochs.
I know that RPROP is a batch learning algorithm, but I'm using online learning because I read that it works anyway.
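In case the online updates are the culprit, the batch version would just accumulate the gradient over the whole training set before each RPROP step, roughly like this (forward_pass, backward_pass and the arrays are illustrative names standing in for my actual routines):

    /* Per epoch: sum dE/dw over all patterns, then one RPROP step
       per weight, using rprop_update from the sketch above. */
    for (int epoch = 0; epoch < max_epochs; epoch++) {
        memset(gradSum, 0, n_weights * sizeof(double));
        for (int pat = 0; pat < n_patterns; pat++) {
            forward_pass(pat);
            backward_pass(pat, gradPattern);   /* dE/dw for this pattern */
            for (int w = 0; w < n_weights; w++)
                gradSum[w] += gradPattern[w];
        }
        for (int w = 0; w < n_weights; w++)
            rprop_update(&weight[w], gradSum[w],
                         &grad_prev[w], &delta[w], &dw[w]);
    }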