I've been struggling to get my neural net implementation to converge to meaningful values. I have black-and-white images; each image is either 40% black / 60% white or 60% black / 40% white, and I'm classifying whether an image is mostly black or mostly white.
I break each image into an array of pixel values and feed it through the network. The issue is that the network converges to the same constant output for all images. I'm training on 1000 images, with 25*25 pixels as input and a hidden layer of 20 units.
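For reference, the training data is roughly equivalent to the following sketch (illustrative only -- my real images are loaded from files; the names inputs and exp_y match the training code below):

import numpy as np

inputs, exp_y = [], []
for _ in range(1000):
    frac_black = np.random.choice([0.4, 0.6])  ##40% or 60% of pixels are black
    img = (np.random.rand(25 * 25) < frac_black).astype('float64')  ##1 = black pixel
    inputs.append(img)
    exp_y.append(1.0 if frac_black > 0.5 else 0.0)  ##target: 1 = mostly black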
CODE:
import numpy as np
import theano
import theano.tensor as T
from theano.tensor import nnet

def layer(x, w):
    ##bias node
    b = np.array([1], dtype=theano.config.floatX)
    ##concatenate the bias node onto the input
    new_x = T.concatenate([x, b])
    ##evaluate the matrix multiplication
    m = T.dot(w.T, new_x)
    ##run through the sigmoid activation
    h = nnet.sigmoid(m)
    return h
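##shapes for reference: x has 25*25 = 625 entries, the bias makes new_x
##length 626, and for the first layer w = theta1 is (626, 20), so
##T.dot(w.T, new_x) yields the 20 hidden units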
##gradient descent update: step each weight against the gradient of the cost to minimize it
def grad_desc(cost, theta):
    return theta - (.01 * T.grad(cost, wrt=theta))
##input x
x = T.dvector()
##y target
y = T.dscalar()
alpha = .1 #learning rate
###first layer weights
theta1 = theano.shared(np.array(np.random.rand((25*25)+1,20), dtype=theano.config.floatX)) # randomly initialize
###output layer weights
theta3 = theano.shared(np.array(np.random.rand(21,1), dtype=theano.config.floatX))
hid1 = layer(x, theta1) #hidden layer
out1 = T.sum(layer(hid1, theta3)) #output layer
fc = (out1 - y)**2 #cost expression to minimize
cost = theano.function(inputs=[x, y], outputs=fc, updates=[
    ##gradient descent updates for both weight matrices
    (theta1, grad_desc(fc, theta1)),
    (theta3, grad_desc(fc, theta3))])
run_forward = theano.function(inputs=[x], outputs=out1)
inputs = np.array(inputs).reshape(1000,25*25) #training data X
exp_y = np.array(exp_y) #training data Y
cur_cost = 0
for i in range(10000):
    for k in range(len(inputs)):
        cur_cost = cost(inputs[k], exp_y[k])
    if i % 10 == 0:
        print('Cost: %s' % (cur_cost,))
The cost converges to a single value, and every input produces the same output:
....
Cost: 0.160380273066
Cost: 0.160380273066
Cost: 0.160380273066
Cost: 0.160380273066
Cost: 0.160380273066
Cost: 0.160380273066
Cost: 0.160380273066
Cost: 0.160380273066
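Checking the second symptom directly, run_forward returns an identical value for every training example:

##every image maps to the same output value
print(run_forward(inputs[0]))
print(run_forward(inputs[1]))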