Here is the code:
import numpy as np

# sigmoid function (when deriv=True, x is expected to be a sigmoid output)
def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

# input dataset
X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]])

# output dataset
y = np.array([[0, 0, 1, 1]]).T

# seed random numbers to make the calculation
# deterministic (just a good practice)
np.random.seed(1)

# initialize weights randomly with mean 0
syn0 = 2 * np.random.random((3, 1)) - 1

for it in range(10000):
    # forward propagation
    l0 = X
    l1 = nonlin(np.dot(l0, syn0))

    # how much did we miss?
    l1_error = y - l1

    # multiply how much we missed by the
    # slope of the sigmoid at the values in l1
    l1_delta = l1_error * nonlin(l1, True)

    # update weights
    syn0 += np.dot(l0.T, l1_delta)

print("Output After Training:")
print(l1)
Here is the website: http://iamtrask.github.io/2015/07/12/basic-python-network/
On line 36 of the tutorial's code (the line l1_delta = l1_error * nonlin(l1, True) above), the l1 error is multiplied by the derivative of the sigmoid, evaluated at the result of dotting the input with the weights. I have no idea why this is done and have spent hours trying to figure it out. I had just about concluded that it is wrong, but something tells me that's probably not right, considering how many people recommend and use this tutorial as a starting point for learning neural networks.
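For what it's worth, my best reconstruction is that the update is meant to be a gradient-descent step on a squared-error loss, with the derivative factor falling out of the chain rule. Here is a quick finite-difference check I put together to test that guess (the loss function and every name in this snippet are my own assumptions, not from the tutorial):

import numpy as np

def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)  # slope of the sigmoid, written in terms of its output
    return 1 / (1 + np.exp(-x))

X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([[0, 0, 1, 1]], dtype=float).T
np.random.seed(1)
syn0 = 2 * np.random.random((3, 1)) - 1

def loss(w):
    # my guessed loss: half the summed squared error
    return 0.5 * np.sum((y - nonlin(np.dot(X, w))) ** 2)

# what the tutorial computes inside the loop (before the += update)
l1 = nonlin(np.dot(X, syn0))
analytic = np.dot(X.T, (y - l1) * nonlin(l1, True))

# finite-difference gradient of the loss, negated to match the update direction
eps = 1e-6
numeric = np.zeros_like(syn0)
for i in range(syn0.shape[0]):
    w = syn0.copy()
    w[i, 0] += eps
    up = loss(w)
    w[i, 0] -= 2 * eps
    down = loss(w)
    numeric[i, 0] = -(up - down) / (2 * eps)

print(analytic.ravel())
print(numeric.ravel())  # agrees with the line above to ~6 decimal places

The two printouts agree, so the derivative factor does seem to make the update an exact gradient step on that loss. What I still can't follow is the intuition the article gives for it.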
In the article, they say:

"Look at the sigmoid picture again! If the slope was really shallow (close to 0), then the network either had a very high value, or a very low value. This means that the network was quite confident one way or the other. However, if the network guessed something close to (x=0, y=0.5) then it isn't very confident."
I cannot wrap my head around why how high or low the input to the sigmoid is has anything to do with confidence. Surely it doesn't matter how high the input is, because if the prediction is far from the target, the network is badly wrong, not "confident" as they claim just because its output is extreme.
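To make my confusion concrete, here is the slope term on its own (this snippet is mine; it just re-expresses the tutorial's nonlin(l1, True)):

def slope(l1_out):
    # derivative of the sigmoid, written in terms of its output
    return l1_out * (1 - l1_out)

for out in [0.01, 0.1, 0.5, 0.9, 0.99]:
    print(out, slope(out))
# 0.01 -> 0.0099, 0.5 -> 0.25, 0.99 -> 0.0099:
# the slope peaks at 0.5 and shrinks toward 0 and 1, so an extreme output
# always gets a small update, whether that output is right or wrong

So the slope term treats "extreme" as "confident" regardless of what the target actually is, which is the part I find hard to accept.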
Surely it would just be better to cube l1_error if you wanted to emphasize the error?
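Here is the kind of comparison I have in mind (the example numbers are made up by me):

import numpy as np

l1 = np.array([[0.05], [0.50], [0.95]])  # three example predictions
y = np.zeros((3, 1))                     # suppose the target is 0 in every case
err = y - l1

print((err * l1 * (1 - l1)).ravel())  # tutorial's weighting: [-0.002375 -0.125 -0.045125]
print((err ** 3).ravel())             # cubing: [-0.000125 -0.125 -0.857375]

With the tutorial's weighting, the confidently wrong 0.95 prediction gets a smaller update than the unsure 0.50 one, while cubing would push hardest on the biggest error, which seems more sensible to me.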
This is a real letdown, considering that up to that point it finally looked like I had found a promising way to start learning about neural networks intuitively, but yet again I was wrong. If you know of a good place to start learning where things are explained in a way I can understand easily, it would be appreciated.