
In TensorFlow, I'm trying to change weights during training, but I get no change in the results. I've tried to disrupt the weights (set them to zero), but it seems to do nothing (other than making training take longer). What am I missing? Is there a way to manipulate W like a regular matrix/tensor during a session?

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

import tensorflow as tf
sess = tf.InteractiveSession()
x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])
W = tf.Variable(tf.zeros([784,10]), trainable=True)
W2 = tf.Variable(tf.zeros([784,10]), trainable=False)
b = tf.Variable(tf.zeros([10]))

sess.run(tf.initialize_all_variables())

y = tf.nn.softmax(tf.matmul(x,W) + b)
loss = tf.reduce_mean(tf.square(y_ - y))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

for i in range(1000):
  # try to change W during training
  W = W2
  W = tf.Variable(tf.zeros([784,10]))
  W.assign(tf.Variable(tf.zeros([784,10])))

  batch = mnist.train.next_batch(1)
  train_step.run(feed_dict={x: batch[0], y_: batch[1]})

correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))

accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

Accuracy remains the same (0.82).

Danny
  • What are you trying to accomplish? TensorFlow uses an approach where you first build the graph, then use a session to run it. You're not generally supposed to directly manipulate tensors during the training phase. – Aenimated1 May 02 '16 at 16:05
  • Update: If I repeat the line `y = tf.nn.softmax(tf.matmul(x,W) + b)` before changing W, results now change. Not sure what I'm missing, but it works now. – Danny May 02 '16 at 17:36
  • I'm trying to directly manipulate tensors during the training phase. Basically to add some rules and functions that aren't covered by TF, but I can easily implement with tensor math, see if it can help build better networks. – Danny May 02 '16 at 19:04
  • In that case, you may want to consider [adding a custom op](https://www.tensorflow.org/versions/r0.8/how_tos/adding_an_op/index.html). I've never done so, but I believe that's the recommended approach. Otherwise you'll probably find yourself fighting the TF library. – Aenimated1 May 02 '16 at 19:39
  • Regarding why adding the `y = tf.nn...` line changed the behavior, I can only speculate that just overwriting the variable wasn't enough to actually update the underlying graph. Perhaps the operations are happening on a separate copy than the one referenced by "W" in your code. Since directly manipulating tensors during training isn't how the TF library is intended to be used, it may behave counter-intuitively under these circumstances. – Aenimated1 May 02 '16 at 19:43

2 Answers


I am not sure it's a good idea, but if you want W.assign() to actually update W, you have to run the resulting op in the session. Calling W.assign() only builds a node in the graph; nothing happens until that node is executed:

assign_op = W.assign(tf.zeros([784, 10]))
sess.run(assign_op)
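A minimal end-to-end sketch of that pattern, assuming TensorFlow 1.x graph mode (the API the question uses); the name zero_W is introduced here just for illustration:

import tensorflow as tf

W = tf.Variable(tf.zeros([784, 10]))
zero_W = W.assign(tf.zeros([784, 10]))  # build the assign op once, outside the loop

sess = tf.InteractiveSession()
sess.run(tf.initialize_all_variables())

for i in range(1000):
    sess.run(zero_W)  # running the op is what actually overwrites W
    # ... run train_step here ...

Building the op once outside the loop also avoids the question's pattern of creating new tf.Variable nodes inside the loop, which adds nodes to the graph on every iteration without ever touching the original W.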

In addition, since TensorFlow and most neural nets use forward/backpropagation to compute the values and gradients that update the weights, initializing all weights to zero is generally a bad idea: in any network with hidden layers, every unit in a layer computes the same output and receives the same gradient, so the units never learn to differentiate.

You can try to initialize them with small random numbers: tf.Variable(tf.random_normal([784, 10], stddev=0.01))

Or use the Xavier initializer:

W = tf.get_variable("W", shape=[784, 10],
                    initializer=tf.contrib.layers.xavier_initializer())
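
Xavier (Glorot) initialization scales the initial variance of the weights by the layer's fan-in and fan-out, which keeps activation and gradient magnitudes roughly comparable across layers.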
Sung Kim

When you use tf.assign(), you need to keep a reference to the tensor it returns:

W = W.assign(tf.zeros([784, 10]))

Then when you use W again in the graph, the assign operation will be executed.
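
A hedged sketch of how that wires into the question's model (TensorFlow 1.x; the name reset_W is introduced here for illustration):

reset_W = W.assign(tf.zeros([784, 10]))       # tensor that performs the assignment when evaluated
y = tf.nn.softmax(tf.matmul(x, reset_W) + b)  # y now depends on reset_W

Because the softmax depends on the tensor returned by assign, every evaluation of y (for example, inside train_step.run) executes the assignment first.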

Fei