
I'm implementing a classification model using TensorFlow.

The problem that I'm facing is that my weights and error are not being updated when I run the training step. As a result, my network keeps returning the same results.

I've developed my model based on the MNIST example from the TensorFlow website.

import numpy as np
import tensorflow as tf
sess = tf.InteractiveSession()

#load dataset
dataset = np.loadtxt('char8k.txt', dtype='float', comments='#', delimiter=",")
Y = np.asmatrix( dataset[:,0] ) 
X = np.asmatrix( dataset[:,1:1201] )

m = 11527
labels = 26

# Convert Y to a one-hot matrix of shape 11527x26
Yt = np.zeros((m,labels))

for i in range(0,m):
    index = int(Y[0,i]) - 1  # labels are 1-based floats; cast to int for indexing
    Yt[i,index]= 1

Y = Yt
Y = np.asmatrix(Y)

#------------------------------------------------------------------------------

#graph settings

x = tf.placeholder(tf.float32, shape=[None, 1200])
y_ = tf.placeholder(tf.float32, shape=[None, 26])


Wtest = tf.Variable(tf.truncated_normal([1200,26], stddev=0.001))
W = tf.Variable(tf.truncated_normal([1200,26], stddev=0.001))
b = tf.Variable(tf.zeros([26]))
sess.run(tf.initialize_all_variables())

y = tf.nn.softmax(tf.matmul(x,W) + b)

cross_entropy = -tf.reduce_sum(y_*tf.log(y))

train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
Wtest = W


for i in range(10):
  print("iteration:")
  print(i)
  Xbatch = X[np.random.randint(X.shape[0],size=100),:]
  Ybatch = Y[np.random.randint(Y.shape[0],size=100),:]
  train_step.run(feed_dict={x: Xbatch, y_: Ybatch})
  print("weight update")
  print(Wtest==W)  # monitors whether the weights were updated

  correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
  accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
  print("precision:")
  print(accuracy.eval(feed_dict={x: X, y_: Y}))
  print(" ")
  print(" ")
sKhan

1 Answer


The issue probably arises from how you initialize the weight matrix, W. If it is initialized to all zeroes, all of the neurons will follow the same gradient in each step, which leads to the network not training. Replacing the line

W = tf.Variable(tf.zeros([1200,26]))

...with something like

W = tf.Variable(tf.truncated_normal([1200,26], stddev=0.001))

...should cause it to start training.

This question on the CrossValidated site has a good explanation of why you should not initialize all of your weights to zero.
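To illustrate the symmetry argument, here is a minimal NumPy sketch. It assumes a tiny network with one tanh hidden layer, which the model in the question does not have, so it shows the general point rather than the exact model. The function name, shapes, and data are made up for the illustration. When every hidden unit starts with the same weights, every column of the hidden-layer gradient is identical, so the units can never diverge from one another. Random initialization breaks that tie.

```python
import numpy as np

def hidden_weight_gradient(W1, W2, x, y_onehot):
    """Gradient of the mean softmax cross-entropy w.r.t. W1 for a
    hypothetical network: logits = tanh(x @ W1) @ W2."""
    h = np.tanh(x @ W1)                           # hidden activations
    logits = h @ W2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    p = p / p.sum(axis=1, keepdims=True)          # softmax probabilities
    d_logits = (p - y_onehot) / x.shape[0]        # dL/dlogits for the mean loss
    d_h = d_logits @ W2.T                         # backprop into hidden layer
    d_pre = d_h * (1.0 - h ** 2)                  # derivative of tanh
    return x.T @ d_pre                            # shape (n_inputs, n_hidden)

rng = np.random.default_rng(0)
n_samples, n_inputs, n_hidden, n_classes = 8, 5, 4, 3
x = rng.normal(size=(n_samples, n_inputs))
y_onehot = np.eye(n_classes)[rng.integers(0, n_classes, size=n_samples)]

# Symmetric start: every hidden unit has the same incoming and outgoing weights.
u = rng.normal(size=n_inputs)
v = np.array([0.1, -0.2, 0.3])
W1_sym = np.tile(u[:, None], (1, n_hidden))
W2_sym = np.tile(v[None, :], (n_hidden, 1))
grad_sym = hidden_weight_gradient(W1_sym, W2_sym, x, y_onehot)

# Random start: small, distinct weights per hidden unit.
W1_rand = rng.normal(scale=0.1, size=(n_inputs, n_hidden))
W2_rand = rng.normal(scale=0.1, size=(n_hidden, n_classes))
grad_rand = hidden_weight_gradient(W1_rand, W2_rand, x, y_onehot)

# True when every hidden unit receives exactly the same gradient column.
sym_columns_identical = np.allclose(grad_sym, grad_sym[:, :1])
rand_columns_identical = np.allclose(grad_rand, grad_rand[:, :1])
print(sym_columns_identical, rand_columns_identical)
```

All-zero weights are a special case of the symmetric start: in this sketch they make the hidden-layer gradient zero for every unit.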

mrry
  • I'm still getting the same results, stuck at 0.10996 precision, but I'll edit my question's code with your suggestion – Ricardo Achilles Mar 12 '16 at 21:30
  • Have a look at this similar question: http://stackoverflow.com/questions/36127436/tensorflow-predicts-always-the-same-result/36134261#36134261 ... the user was having the same problem as you, and you could also try running mini-batches through the system, rather than the entire dataset at once. Using `tf.reduce_sum()` rather than `tf.reduce_mean()` to compute the overall loss could be giving you too large of an effective learning rate as well. – mrry Mar 24 '16 at 00:33
  • Great answer! I was trying the same thing and followed your suggestions. But still I am facing the same problem. Could you please look at http://stackoverflow.com/questions/38501513/tensorflow-weights-and-biases-are-not-updating-when-tried-with-following-code – exAres Jul 21 '16 at 10:24
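
The point in the comments about `tf.reduce_sum()` versus `tf.reduce_mean()` can be checked without TensorFlow. The following is a rough NumPy sketch, assuming the same single-layer softmax model and shapes as the question (1200 inputs, 26 classes, batches of 100). The data is random and the helper names are invented for the example. The gradient of the summed cross-entropy is the mean-loss gradient multiplied by the batch size, so a learning rate of 0.01 with a summed loss behaves like a learning rate of 1.0 with a mean loss at this batch size.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy_gradient(W, b, x, y_onehot, reduction):
    """Gradient of the cross-entropy w.r.t. W for y = softmax(x @ W + b).

    reduction="sum" mirrors -reduce_sum(y_ * log(y));
    reduction="mean" divides that loss by the batch size.
    """
    p = softmax(x @ W + b)
    d_logits = p - y_onehot                  # gradient of the summed loss
    if reduction == "mean":
        d_logits = d_logits / x.shape[0]     # averaging rescales the gradient
    return x.T @ d_logits

rng = np.random.default_rng(0)
batch_size, n_inputs, n_classes = 100, 1200, 26

# Random stand-in data; the real inputs would come from the dataset.
x = rng.random((batch_size, n_inputs))
labels = rng.integers(0, n_classes, size=batch_size)
y_onehot = np.eye(n_classes)[labels]

W = rng.normal(scale=0.001, size=(n_inputs, n_classes))
b = np.zeros(n_classes)

grad_sum = cross_entropy_gradient(W, b, x, y_onehot, "sum")
grad_mean = cross_entropy_gradient(W, b, x, y_onehot, "mean")

# The ratio of gradient norms equals the batch size (up to rounding).
ratio = np.linalg.norm(grad_sum) / np.linalg.norm(grad_mean)
print(ratio)
```

Under these assumptions, either dividing the loss by the batch size or shrinking the learning rate by the same factor gives the same update per step.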